19. Code Quality and CI

For more information, please see the Scientific Python Development Guidelines, which covers this in much more detail!

19.1. Pre-commit

One of my favorite tools is pre-commit. It allows you to drive almost any “fixer” or “linter” available, all from one place. It handles environments and caching and even updates for you.

To configure, add a .pre-commit-config.yaml file like this:

repos:
# Some "common" checks useful for almost any repo
- repo: https://github.com/pre-commit/pre-commit-hooks
  rev: v4.4.0
  hooks:
  - id: check-added-large-files
  - id: check-case-conflict
  - id: check-merge-conflict
  - id: check-symlinks
  - id: check-yaml
  - id: debug-statements
  - id: end-of-file-fixer
  - id: mixed-line-ending
  - id: requirements-txt-fixer
  - id: trailing-whitespace

# Automatically format Python code
- repo: https://github.com/psf/black
  rev: "23.7.0"
  hooks:
  - id: black

The file has a list of repos (local checks can be written too). Each repo contains pre-commit hooks that you can run and configure. You should put modifying “fixer” checks before the “linter” checks, just in case they fix something that then gets linted.
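As a rough sketch of what a local check might look like, the fragment below adds a pygrep hook (a regex search that fails on any match) to the repos list. The hook id, name, and pattern are only illustrative; swap in whatever you want to keep out of your repo.

```yaml
# Appended to the `repos:` list in .pre-commit-config.yaml
- repo: local
  hooks:
  - id: disallow-caps                      # illustrative id, pick your own
    name: Disallow improper capitalization
    language: pygrep                       # regex search; any match fails the hook
    entry: PyBind|Numpy|Cmake|Github|PyTest
    exclude: ^\.pre-commit-config\.yaml$   # don't flag the pattern itself
```

Since this one is a “linter” rather than a “fixer”, it would go after any formatting hooks.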

You can install pre-commit from brew (macOS), or via pipx/pip for anything with Python.

You can then run it like this:

pre-commit run -a

That will check everything. You don’t need to know anything about how to run the individual checkers or linters; it’s a single standard interface for all projects. Each hook gets a unique, cached environment, so the next time you run it, it’s lightning fast. If you leave off the -a, it only checks the changed files in your staging area, even partially staged ones!

If you want to update to the latest versions of all your hooks, run:

pre-commit autoupdate

If you want to use it in the namesake “pre-commit” mode, then run:

pre-commit install

Now it runs before every commit, and you’ll never check in “bad” code again! In an emergency, pass -n (short for --no-verify) to git commit to skip the pre-commit check.

One of the recent exciting advancements is https://pre-commit.ci, where you can just add your repo to the GitHub list, and then you get your PRs autocorrected and checked, and you get weekly update PRs to your config!
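pre-commit.ci can be tuned with an optional top-level ci: section in the same .pre-commit-config.yaml. A minimal sketch, where the schedule and commit messages are just example choices:

```yaml
# Top of .pre-commit-config.yaml, next to `repos:`
ci:
  autoupdate_schedule: monthly                             # weekly is the default
  autoupdate_commit_msg: "chore: update pre-commit hooks"  # example message
  autofix_commit_msg: "style: pre-commit fixes"            # example message
```

If you leave the section out entirely, the service falls back to its defaults.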

PS: This is generally not used for pytest (though it could be), since tests are generally slower and take more setup, including being installed properly.

19.2. CI: GitHub Actions

One of the most important aspects of good code is Continuous Integration (CI): every change should be tested and, ideally, not allowed to be merged unless it passes. If you support multiple versions of Python or OSs, you should test on each of them. CI lets you do this, and other contributors get the benefit too; if you have good tests, you can feel comfortable about making and accepting changes.

There are many services, but the most popular, and possibly one of the best designed, is GitHub Actions (GHA). It is really easy to set up, doesn’t require extra permissions or accounts, runs 10(!) jobs in parallel, supports all three major OSs (often with the same code), and is highly modular. This is what a simple job would look like:

.github/workflows/ci.yml:

name: Python tests

on: [push]

jobs:
  build:
    runs-on: ${{ matrix.runs-on }}
    strategy:
      matrix:
        python-version: ["3.8", "3.11"]
        runs-on: [ubuntu-latest, macos-latest, windows-latest]

    steps:
    # Fetch the repository contents
    - uses: actions/checkout@v3

    # Install the Python version selected by the matrix
    - uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}

    - name: Install with dev requirements
      run: pip install .[dev]

    - name: Test with pytest
      run: pytest

That’s it: 6 jobs (2 Python versions × 3 OSs) run and test your code!
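The workflow above only triggers on pushes. If you also want pull requests tested before merging, one possible variation is sketched here; the branch name main is an assumption about your repo, and fail-fast: false is an optional extra so one failing job doesn’t cancel the rest.

```yaml
on:
  push:
    branches: [main]   # assumes your default branch is called main
  pull_request:        # also test PRs, including those from forks
  workflow_dispatch:   # adds a manual "Run workflow" button

jobs:
  build:
    runs-on: ${{ matrix.runs-on }}
    strategy:
      fail-fast: false   # let the other jobs finish if one fails
      matrix:
        python-version: ["3.8", "3.11"]
        runs-on: [ubuntu-latest, macos-latest, windows-latest]
    # steps: unchanged from the workflow above
```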

CI is not just for tests! You can use it for deploying static websites, for building binaries for distribution, for making releases, for compiling documents or documentation, for monitoring things at a regular interval, for automating tasks and making PRs, and much, much more! You can have as many workflows or jobs in a workflow as you want (up to 255).
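As one small illustration of the “regular interval” use case, a workflow can be triggered by a cron schedule instead of a push. This is only a sketch: the file name, workflow name, and schedule are made up, and the steps simply reuse the test commands from earlier.

```yaml
# .github/workflows/weekly.yml (hypothetical file name)
name: Weekly check

on:
  workflow_dispatch:       # allow manual runs too
  schedule:
  - cron: "0 6 * * 1"      # Mondays at 06:00 UTC

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v3
    - uses: actions/setup-python@v4
      with:
        python-version: "3.11"
    - run: pip install .[dev]
    - run: pytest
```

A scheduled run like this can catch breakage caused by new releases of your dependencies, even when nobody has pushed a change.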

If you are on GitLab, GitLab CI is excellent too, just not quite as modular.
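For comparison, a roughly equivalent Linux-only setup in GitLab CI might look like the sketch below. The job names are made up, and it assumes the official python Docker images and the same dev extra as the GitHub Actions example.

```yaml
# .gitlab-ci.yml
.test:                      # hidden template job, shared by the jobs below
  script:
  - pip install ".[dev]"
  - pytest

test-py38:
  extends: .test
  image: python:3.8

test-py311:
  extends: .test
  image: python:3.11
```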

The website for this course is built in GHA. Can you find the config for it?

19.3. Quick package construction: Cookiecutter

If you want to set up a project fast, you can use cookiecutter, a Python application (as usual, install with brew on macOS or pipx/pip elsewhere). Then you can stamp out a new package based on online templates. If you are following the Scientific Python Development Guidelines, then run:

cookiecutter gh:scientific-python/cookie

Answer a few questions, then you get a working package with CI and strict style checking in one of around a dozen backends, including C++ pybind11 with cibuildwheel wheels!