Course project: Time Series Data Cleaning
Imagine this: a colleague hands you a script, and your task is to make it usable for others in your organization. That means turning it into a proper package with a good structure, tests, and documentation, and with a design that makes it easy to extend and maintain in the future.
In this project, the script removes bad data from three different time series using three different algorithms: out-of-range, spikes, and flat periods. Your colleague is not the best Python coder, so you will start by cleaning up the code using functions, and gradually improve its quality from there.
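To give a feel for what the three algorithms might do, here is a minimal sketch in plain Python. It is not the colleague's script: the function names, parameters (`lower`, `upper`, `max_jump`, `min_length`), and the exact rules for what counts as a spike or a flat period are assumptions made for illustration. Each function marks bad values as NaN rather than dropping them, so the series keeps its length.

```python
import math


def remove_out_of_range(values, lower, upper):
    """Replace values outside [lower, upper] with NaN (hypothetical helper)."""
    return [v if lower <= v <= upper else math.nan for v in values]


def remove_spikes(values, max_jump):
    """Replace a value with NaN if it differs from BOTH neighbours by more
    than max_jump. This is one simple spike definition, assumed here."""
    cleaned = list(values)
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if abs(cur - prev) > max_jump and abs(cur - nxt) > max_jump:
            cleaned[i] = math.nan
    return cleaned


def remove_flat_periods(values, min_length):
    """Replace runs of at least min_length identical consecutive values
    with NaN (assumed definition of a flat period)."""
    cleaned = list(values)
    start = 0
    for i in range(1, len(values) + 1):
        # A run ends at the end of the list or when the value changes.
        if i == len(values) or values[i] != values[start]:
            if i - start >= min_length:
                cleaned[start:i] = [math.nan] * (i - start)
            start = i
    return cleaned


# Example usage with made-up data
raw = [1.0, 1.1, 9.0, 1.2, 1.2, 1.2, 1.2, 50.0]
step1 = remove_out_of_range(raw, lower=0.0, upper=10.0)
step2 = remove_spikes(step1, max_jump=2.0)
step3 = remove_flat_periods(step2, min_length=3)
```

Writing each algorithm as a small function with explicit arguments makes it easy to test on its own, which is the starting point for the modules that follow.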
Module 1: GitHub and basic functions
- 1.1 GitHub repo
- 1.2 Functions
Module 2: Modules and classes
- 2.1 Function arguments
- 2.2 Modules
- 2.3 Classes
Module 3: Installable package and pytest
- 3.1 Installable package
- 3.2 Pytest
Module 4: GitHub Actions and auto-formatting
- 4.1 GitHub Actions
- 4.2 Linting with ruff
- 4.3 Formatting with ruff
- 4.4 pyproject.toml
Module 5: Object-oriented design
- 5.1 Type Hints
- 5.2 Data class
- 5.3 Module level function
- 5.4 Composition or inheritance
Module 6: Documentation
- 6.1 README
- 6.2 Docstrings
- 6.3 mkdocs
Module 7: Publishing
- 7.1 License
- 7.2 Publishing