19. Code Quality and CI
For more information, please see the Scientific Python Development Guidelines, which cover this in much more detail!
19.1. Pre-commit
One of my favorite tools is pre-commit. It allows you to drive almost any “fixer” or “linter” available, all from one place. It handles environments and caching and even updates for you.
To configure, add a .pre-commit-config.yaml file like this:

    repos:
      # Some "common" checks useful for almost any repo
      - repo: https://github.com/pre-commit/pre-commit-hooks
        rev: v4.4.0
        hooks:
          - id: check-added-large-files
          - id: check-case-conflict
          - id: check-merge-conflict
          - id: check-symlinks
          - id: check-yaml
          - id: debug-statements
          - id: end-of-file-fixer
          - id: mixed-line-ending
          - id: requirements-txt-fixer
          - id: trailing-whitespace

      # Automatically format Python code
      - repo: https://github.com/psf/black
        rev: "23.7.0"
        hooks:
          - id: black
The file has a list of repos (local checks can be written too). Each repo contains pre-commit hooks that you can run and configure. Hooks run in the order listed, so you should put modifying “fixer” checks before the “linter” checks; that way the linters see the code after it has been fixed.
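As a minimal sketch of a local check, the fragment below adds a hook that lives in the config itself rather than in an external repository. The hook id, name, and pattern are hypothetical and only for illustration; `repo: local` and the `pygrep` language are standard pre-commit features, where `entry` is a regular expression and the hook fails if any matching line is found.

```yaml
repos:
  # A check defined in this repository instead of being fetched from elsewhere
  - repo: local
    hooks:
      - id: disallow-fixme            # hypothetical id, pick your own
        name: Disallow FIXME comments # hypothetical name shown in the output
        language: pygrep              # fail if the regex in `entry` matches a line
        entry: FIXME
        types: [python]               # only look at Python files
```

Since this hook only reports problems and never rewrites files, it is a “linter” and would go after any fixers in the list.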
You can install pre-commit from brew (macOS), or via pipx/pip anywhere Python is available.
You can then run it like this:
pre-commit run -a
That will check everything. You don’t need to know anything about how to run the checkers or linters; it’s a single standard interface for all projects. Each hook gets a unique, cached environment, so the next time you run it, it’s lightning fast. If you leave off the -a, it only checks the changed files in your staging area, even partially staged ones!
If you want to update to the latest versions of all your hooks, run:
pre-commit autoupdate
If you want to use it in the namesake “pre-commit” mode, then run:
pre-commit install
Now it runs before every commit, and you’ll never check in “bad” code again! In an emergency, pass -n (short for --no-verify) to git commit to skip the pre-commit check.
One of the recent exciting advancements is https://pre-commit.ci, where you can just add your repo to the GitHub list, and then you get your PRs autocorrected and checked, and you get weekly update PRs to your config!
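The service reads its settings from an optional top-level `ci:` section in the same .pre-commit-config.yaml file. The fragment below is a sketch that simply spells out the behavior described above; the values shown are illustrative, and the section can be left out entirely if the defaults suit you.

```yaml
# Optional settings for pre-commit.ci, placed next to the `repos:` list
ci:
  autofix_prs: true            # push fixes made by hooks back to the PR
  autoupdate_schedule: weekly  # how often to open a hook-update PR
```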
PS: pre-commit is generally not used to run pytest (though it could be), since tests are generally slower and take more setup, including installing the package properly.
19.2. CI: GitHub Actions
One of the most important aspects of good code is Continuous Integration (CI); every change should be tested and ideally not allowed to be merged unless it passes. If you support multiple versions of Python or OSs, you should test on each of them. CI lets you do this, and other contributors get the benefit too; if you have good tests you can feel comfortable about making and accepting changes.
There are many services, but the most popular, and possibly one of the best designed, is GitHub Actions (GHA). It is really easy to set up and doesn’t require extra permissions or accounts. It runs 10(!) parallel jobs, supports all three major OSs (Linux, macOS, and Windows), often with the same code, and is highly modular. This is what a simple job would look like:
.github/workflows/ci.yml:

    name: Python tests

    on: [push]

    jobs:
      build:
        # One job is created for every combination in the matrix below
        runs-on: ${{ matrix.runs-on }}
        strategy:
          matrix:
            python-version: ["3.8", "3.11"]
            runs-on: [ubuntu-latest, macos-latest, windows-latest]

        steps:
          # Get the repository contents
          - uses: actions/checkout@v3

          # Install the Python version selected by the matrix
          - uses: actions/setup-python@v4
            with:
              python-version: ${{ matrix.python-version }}

          - name: Install with dev requirements
            run: pip install .[dev]

          - name: Test with pytest
            run: pytest
That’s it: 2 Python versions times 3 OSs gives 6 jobs that run and test your code!
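The workflow above triggers only on pushes. To test proposed changes before they are merged, the `on:` key can list pull requests as well. The fragment below is a sketch that assumes the default branch is named `main`; adjust it to match your repository.

```yaml
# Replacement for the `on: [push]` line in the workflow
on:
  pull_request:   # run for every PR, including updates to it
  push:
    branches:
      - main      # assumption: the default branch is called main
```

Actually blocking a merge until the jobs pass is a repository setting (a branch protection rule with required status checks), not something configured in the workflow file.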
The official docs are good, and the Scientific Python Development Guidelines page on GHA has some good help, too!
CI is not just for tests! You can use it for deploying static websites, for building binaries for distribution, for making releases, for compiling documents or documentation, for monitoring things at a regular interval, for automating tasks and making PRs, and much, much more! You can have as many workflows or jobs in a workflow as you want (up to 255).
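As one sketch of “monitoring things at a regular interval”, a workflow can be triggered on a schedule instead of on pushes. The file name, workflow name, job name, and cron time below are all hypothetical; `schedule` and `workflow_dispatch` are standard GitHub Actions triggers, and cron times are interpreted in UTC.

```yaml
# .github/workflows/weekly.yml (hypothetical file name)
name: Weekly check

on:
  schedule:
    - cron: "0 6 * * 1"   # every Monday at 06:00 UTC
  workflow_dispatch:       # also allow starting it by hand from the web UI

jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.11"
      - name: Install with dev requirements
        run: pip install .[dev]
      - name: Test with pytest
        run: pytest
```

Running the tests on a schedule like this can catch breakage caused by new releases of your dependencies, even when nobody has pushed a change.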
If you are on GitLab, GitLab CI is excellent too, just not quite as modular.
The website for this course is built in GHA. Can you find the config for it?