The data science team in the Government Digital Service (GDS) has created a tool that generates a consistent project structure for data science work.
Named govcookiecutter, it is freely available to public sector data scientists, and is aimed at making it quicker to bring colleagues into a project with a consistent structure.
Eric Young, data scientist at GDS, said in a blog post that the tool assumes Git version control (a system that records changes to a file or set of files over time so that specific versions can be recalled later) is being used with either GitHub or GitLab. Users also need access to the Python programming language and a Unix-based machine, although most features will work on Windows.
The tool provides a series of prompts that help to generate the structure with a range of AQA (analytical quality assurance) features.
Hooks
Other features include a hook that checks whether files larger than 5MB are being committed, another that cleans up Jupyter notebook outputs, and another that identifies secrets such as credentials and API tokens and prevents them from being committed to version control.
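To illustrate the kind of checks such hooks perform, here is a minimal Python sketch of two of them: flagging files above a size limit and clearing outputs from a Jupyter notebook. It is not taken from govcookiecutter. The function names, the reading of 5MB as 5 × 1024 × 1024 bytes, and the command-line behaviour are all assumptions made for the example.

```python
import json
import os
import sys

# Assumption: "5MB" is read here as 5 * 1024 * 1024 bytes.
MAX_BYTES = 5 * 1024 * 1024


def find_large_files(paths, max_bytes=MAX_BYTES):
    """Return the paths that are regular files larger than max_bytes."""
    return [
        p for p in paths
        if os.path.isfile(p) and os.path.getsize(p) > max_bytes
    ]


def strip_notebook_outputs(notebook):
    """Return a copy of a notebook (as a dict) with code-cell outputs cleared."""
    # A JSON round trip gives a deep copy, so the input is left untouched.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


if __name__ == "__main__":
    # Hypothetical usage: file paths to check are passed as arguments.
    too_big = find_large_files(sys.argv[1:])
    for path in too_big:
        print(f"File exceeds size limit: {path}")
    # A non-zero exit code is how a Git hook signals that a commit should stop.
    sys.exit(1 if too_big else 0)
```

In practice, checks like these are usually wired into Git so that they run automatically before each commit, rather than being invoked by hand.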
Young said the GDS team is open to contributions on the project through the GitHub repository and would like to incorporate frameworks from other government departments.
Image from GOV.UK, Open Government Licence v3.0