Dependencies and Continuous Integration

Application

A program that is run by a user

  • command line tool
  • script
  • web application

Pin versions to ensure reproducibility, e.g. numpy==1.11.0

Library

A program that is used by another program

  • Python package
  • Low level library (C, Fortran, Rust, …)

Make the requirements as loose as possible to avoid conflicts with other packages, e.g. numpy>=1.11.0
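
Why loose requirements avoid conflicts can be shown with a small sketch. It assumes two hypothetical libraries that both depend on numpy, and it supports only == and >= with purely numeric versions, not the full specifier syntax that real installers handle.

```python
# Simplified sketch: handles only "==" and ">=" with numeric versions.

def parse(version: str) -> tuple[int, ...]:
    """Turn "1.11.0" into (1, 11, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, requirement: str) -> bool:
    """Check a version against a requirement such as ">=1.11.0"."""
    if requirement.startswith("=="):
        return parse(version) == parse(requirement[2:])
    if requirement.startswith(">="):
        return parse(version) >= parse(requirement[2:])
    raise ValueError(f"unsupported requirement: {requirement}")


def compatible(version: str, requirements: list[str]) -> bool:
    """True if one version satisfies every requirement at once."""
    return all(satisfies(version, req) for req in requirements)


# Two hypothetical libraries that both pin numpy exactly: no version fits both.
pinned = ["==1.11.0", "==1.12.0"]
# The same two libraries with loose lower bounds: newer versions fit both.
loose = [">=1.11.0", ">=1.12.0"]

candidates = ["1.11.0", "1.12.0", "1.13.0"]
print([v for v in candidates if compatible(v, pinned)])  # []
print([v for v in candidates if compatible(v, loose)])   # ['1.12.0', '1.13.0']
```

With exact pins the two libraries cannot be installed together, while the loose bounds leave the installer room to pick a version that suits both.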

Dependency management

uv is the recommended tool for managing a Python project, including its dependencies.

Example of pinning versions:

pyproject.toml
[project]
dependencies = [
  "numpy==1.11.0",
  "scipy==0.17.0",
  "matplotlib==1.5.1",
]

Or using a range of versions:

pyproject.toml
[project]
dependencies = [
  "numpy>=1.11.0",
  "scipy>=0.17.0",
  "matplotlib>=1.5.1,<=2.0.0"
]

Install dependencies:

$ uv sync

Development dependencies

pyproject.toml
[dependency-groups]
dev = [
    "pytest>=8.4.1",
]

Creating an installable package

With uv, the package is installed in editable mode when you create the project with:

$ uv init --lib

Start a Python session:

$ uv run python
>>> import mini
>>> mini.foo()
42

Run tests:

$ uv run pytest
...

tests/test_foo.py .                       [100%]

=============== 1 passed in 0.01s ===============
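
For orientation, the package and test behind the session above might look roughly like this. The file contents and the src/ layout are assumptions inferred only from the output shown (mini.foo() returns 42, one test in tests/test_foo.py).

```python
# Hypothetical file contents, shown together in one block so it runs as is.

# src/mini/__init__.py (assumed location)
def foo() -> int:
    """Return the answer used in the example session."""
    return 42


# tests/test_foo.py (the real file would start with: from mini import foo)
def test_foo() -> None:
    # pytest collects functions named test_* and reports failed assertions
    assert foo() == 42
```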

Virtual environments

  • Creates a clean environment for each project
  • Allows different versions of a package to coexist on your machine
  • Can be used to create a reproducible environment for a project
  • Virtual environments are managed by uv
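
Whether code is running inside a virtual environment can be checked from Python itself. A minimal sketch using only the standard library (the function name is made up for illustration):

```python
import sys


def in_virtual_environment() -> bool:
    """True when the interpreter runs inside a virtual environment."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("virtual environment:", in_virtual_environment())
```

Run through uv run python inside a project, this should report the interpreter from the project's own environment.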

Continuous Integration

Running tests on every commit in a well-defined environment ensures that the code is working as expected.

It solves the “it works on my machine” problem.

Because the tests run on a remote server rather than on your own machine, they also reveal hidden dependencies on your local setup.

There are many CI services available, e.g.:

  • GitHub Actions
  • Azure Pipelines
  • Travis CI
  • Circle CI

The GitHub Actions runner was forked from the Azure Pipelines agent and runs on the same type of infrastructure, so the two technologies are very similar.

GitHub Actions

  • Workflows are stored in the .github/workflows folder.
  • Each workflow is described in a YAML file.
  • YAML is whitespace sensitive (like Python).
  • YAML can contain lists, dictionaries and strings, and can be nested.

$ tree mikeio/.github/
mikeio/.github/
└── workflows
    ├── docs.yml
    ├── downstream_test.yml
    ├── full_test.yml
    ├── notebooks_test.yml
    ├── perf_test.yml
    ├── python-publish.yml
    └── quick_test.yml

name: Quick test

on: # when to run the workflow
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs: # what to run
  build:
    runs-on: ubuntu-latest # on what operating system

    steps:
    - uses: actions/checkout@v3
    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: "3.9"
    - name: Install dependencies
      run: |
        pip install uv  # assumes uv is not preinstalled on the runner
        uv sync

    - name: Test with pytest
      run: |
        uv run pytest
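
To make the nesting of the workflow above explicit, here is roughly the same structure written by hand as Python dictionaries, lists and strings. This is a sketch, not the output of a YAML parser, and the install step is left out for brevity.

```python
# The "Quick test" workflow as nested Python data (hand-written sketch).
workflow = {
    "name": "Quick test",
    "on": {
        "push": {"branches": ["main"]},
        "pull_request": {"branches": ["main"]},
    },
    "jobs": {
        "build": {
            "runs-on": "ubuntu-latest",
            "steps": [
                {"uses": "actions/checkout@v3"},
                {
                    "name": "Set up Python",
                    "uses": "actions/setup-python@v4",
                    # written as a string here; an unquoted YAML 3.9 is a number
                    "with": {"python-version": "3.9"},
                },
                {"name": "Test with pytest", "run": "uv run pytest\n"},
            ],
        }
    },
}

# Each "-" in the YAML starts a list item, each "key:" a dictionary entry.
for step in workflow["jobs"]["build"]["steps"]:
    print(step.get("name", step.get("uses")))
```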

Benefits of CI

  • Run tests on every commit
  • Test on different operating systems
  • Test on different Python versions
  • Create API documentation (next week)
  • Publish package to PyPI or similar package repository (two weeks from now)

Triggers

  • push and pull_request are the most common triggers
  • schedule can be used to run the workflow on a schedule
  • workflow_dispatch can be used to trigger the workflow manually

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
  schedule:
    - cron: '0 0 * * 0'
  workflow_dispatch:
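
The cron expression '0 0 * * 0' reads as minute 0, hour 0, any day of the month, any month, day of week 0 (Sunday), and GitHub evaluates schedules in UTC. A simplified sketch that checks such an expression against a timestamp, supporting only * and plain numbers (real cron syntax also allows ranges, lists and steps):

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Match one cron field; only "*" and plain numbers are supported here."""
    return field == "*" or int(field) == value


def cron_matches(expression: str, when: datetime) -> bool:
    """Check a five-field cron expression against a point in time."""
    minute, hour, day, month, weekday = expression.split()
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(day, when.day)
        and field_matches(month, when.month)
        # cron counts Sunday as 0, Python's isoweekday() counts it as 7
        and field_matches(weekday, when.isoweekday() % 7)
    )


schedule = "0 0 * * 0"  # minute hour day-of-month month day-of-week
print(cron_matches(schedule, datetime(2024, 1, 7, 0, 0)))   # Sunday midnight: True
print(cron_matches(schedule, datetime(2024, 1, 8, 0, 0)))   # Monday midnight: False
print(cron_matches(schedule, datetime(2024, 1, 7, 12, 0)))  # Sunday noon: False
```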

Jobs

A matrix strategy runs the job once for every combination of:

  • Operating system
  • Python version

...
jobs:
  build:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest]
        python-version: ["3.8", "3.9", "3.10", "3.11"]
...
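
How the matrix expands into individual jobs can be sketched with itertools.product. The values are copied from the workflow above; the expansion itself is done by GitHub Actions, and this only mimics the idea.

```python
from itertools import product

# Matrix values copied from the workflow snippet.
matrix = {
    "os": ["ubuntu-latest", "windows-latest"],
    "python-version": ["3.8", "3.9", "3.10", "3.11"],
}

# One job per combination of operating system and Python version.
jobs = [dict(zip(matrix, combo)) for combo in product(*matrix.values())]

for job in jobs:
    print(job["os"], job["python-version"])
print(len(jobs), "jobs in total")  # 8 jobs in total

# Versions are kept as strings: as a number, 3.10 would collapse to 3.1.
print(float("3.10"))  # 3.1
```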

GitHub Releases

  • GitHub releases are a way to publish software releases.

  • You can upload files, write release notes and tag the release.

  • As a minimum, the release will contain the source code at the time of the release.

  • Creating a release can trigger other workflows, e.g. publishing a package to PyPI.

https://github.com/pydata/xarray/releases/tag/v2022.12.0

Summary

  • Application vs library
  • Use a separate virtual environment for each project
  • Use GitHub Actions to run tests on every commit
  • Use GitHub Releases to publish software releases