19  Package Maintenance and Automation

Author: Dr Randy Johnson

Affiliation: Hood College

Published: April 16, 2026

Acknowledgements

Gemini code assist was active during the preparation of these notes, and some autosuggestions were incorporated into the final text.

Principles of Package Maintenance & Sharing

  • Sharing code isn’t just about giving away software
    • Advancing science
    • Scientific reproducibility
    • Building a resilient community
  • Good maintenance spreads knowledge and lowers the entry barrier

The Bus Factor

  • How many team members need to be hit by a bus (or win the lottery and quit) before the project completely stalls?
  • A bus factor of 1 is dangerous

Components of a Good Repository

  • README.md
    • Audience: new users
    • Project title and description
    • Installation steps
    • Basic usage examples
  • CONTRIBUTING.md
    • Audience: Advanced users who are interested in contributing
    • Explain how to set up the dev environment locally
    • Code style guidelines
    • PR process
  • LICENSE
    • Without a license, code is under exclusive copyright by default, so nobody else has legal permission to use, modify, or distribute it
    • MIT: do whatever, don’t sue me
    • GPL: if you modify and distribute, you must share your source
  • CHANGELOG.md
    • Audience: existing users
    • Don’t rely purely on commit history
    • Write human-readable logs organized by Added, Changed, Deprecated, Removed, Fixed, and Security
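
As a rough illustration of the changelog structure described above, the following Python sketch groups entries under the six categories and renders one release section. The function name `render_changelog_section`, the `## [version] - date` heading layout, and the sample entries are assumptions made for this example, not a required format.

```python
from collections import defaultdict

# Category order taken from the list above.
CATEGORIES = ["Added", "Changed", "Deprecated", "Removed", "Fixed", "Security"]


def render_changelog_section(version, date, entries):
    """Render one release section from (category, description) pairs."""
    grouped = defaultdict(list)
    for category, description in entries:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        grouped[category].append(description)

    lines = [f"## [{version}] - {date}"]
    for category in CATEGORIES:
        if grouped[category]:  # skip categories with no entries
            lines.append("")
            lines.append(f"### {category}")
            lines.extend(f"- {item}" for item in grouped[category])
    return "\n".join(lines)


# Hypothetical entries, for illustration only.
example = render_changelog_section(
    "2.5.0",
    "2026-04-16",
    [
        ("Fixed", "Crash when the input file is empty"),
        ("Added", "Dark mode"),
    ],
)
print(example)
```

Entries are written by a person for other people; the code only enforces the category names and a consistent order.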

Versioning

  • MAJOR.MINOR.PATCH (e.g. v2.4.1)

  • PATCH (2.4.1 -> 2.4.2)

    • Bug fixes
    • If users update, nothing breaks
  • MINOR (2.4.1 -> 2.5.0)
    • New features
    • Fully backward compatible
  • MAJOR (2.4.1 -> 3.0.0)
    • Breaking changes
    • Users will need to update their own code to use this new version
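
The bump rules above can be sketched in a few lines of Python. This is a minimal illustration that assumes plain `MAJOR.MINOR.PATCH` strings with an optional leading `v`; the helper names `parse_version` and `bump` are made up for this example, and pre-release or build suffixes are not handled.

```python
def parse_version(text):
    """Split 'v2.4.1' or '2.4.1' into a (major, minor, patch) tuple of ints."""
    major, minor, patch = text.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def bump(version, part):
    """Return the next version string after a 'major', 'minor', or 'patch' change."""
    major, minor, patch = parse_version(version)
    if part == "major":
        return f"{major + 1}.0.0"          # breaking change: reset minor and patch
    if part == "minor":
        return f"{major}.{minor + 1}.0"    # new feature: reset patch
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"  # bug fix only
    raise ValueError(f"unknown part: {part}")


print(bump("v2.4.1", "patch"))  # 2.4.2
print(bump("v2.4.1", "minor"))  # 2.5.0
print(bump("v2.4.1", "major"))  # 3.0.0

# Compare versions as integer tuples, not as strings:
# as strings, "2.10.0" sorts before "2.9.0".
print(parse_version("2.10.0") > parse_version("2.9.0"))  # True
```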

GitHub Tools for Maintenance

  • Issues
  • Pull Requests (PR)
  • Releases / Packages

Issue Tracking & Management

  • Writing good bug reports
    • Reproduction steps
    • Expected behavior
    • Actual behavior
    • “It’s broken” is not a good bug report
  • Labels & Milestones
    • Issue labels (good first issue for attracting beginners)
    • Milestones group issues together for a specific target release (e.g. “Version 2.0 Launch”)
    • Issue templates can be a useful tool for getting more helpful bug reports from users

Example issue template on GitHub

Pull Requests (PR)

  • PRs are useful for more than contributing to open source projects

  • Branching

    • Don’t push directly to main
    • Example branches: fix/login-bug or feature/dark-mode
  • Code Reviews
    • Code review is a conversation, not an attack
  • Automation
    • Example: writing Closes #42 in a PR description automatically closes Issue #42 when the PR is merged
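
To make the closing-keyword behavior more concrete, here is a simplified Python sketch of how such keywords could be detected in a PR description. It is not GitHub's implementation: the regular expression covers only the common `keyword #number` form within the same repository, and the function name `issues_closed_by` is hypothetical.

```python
import re

# Keywords GitHub documents for linking a PR to an issue:
# close/closes/closed, fix/fixes/fixed, resolve/resolves/resolved.
CLOSING_KEYWORDS = r"close[sd]?|fix(?:e[sd])?|resolve[sd]?"
PATTERN = re.compile(rf"\b(?:{CLOSING_KEYWORDS})\s+#(\d+)", re.IGNORECASE)


def issues_closed_by(description):
    """Return the issue numbers a PR description asks to close (simplified)."""
    return [int(number) for number in PATTERN.findall(description)]


description = "Closes #42 and fixes #7. See also #99."
print(issues_closed_by(description))  # [42, 7]
```

Issue #99 is only mentioned, not preceded by a closing keyword, so it is left open.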

Releases & Packages

  • Release
    • A GitHub Release is a wrapper around a Git Tag
    • It allows you to attach compiled binaries or release notes
  • Package
    • Source code (GitHub repo) is different from an installable package (e.g. on PyPI or npm)
    • GitHub Packages can act as a private registry (e.g. to use with npm)

Automating Maintenance with GitHub Actions

  • What is CI/CD?
  • Common maintenance workflows

Continuous Integration

  • Automatically running tests and linters every time code is pushed
  • Tests are included for each feature
  • Boundary conditions are covered
  • Each time a bug is fixed, add a new test to make sure it doesn’t come up again
  • “Does this code break my package?”
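
The testing habits listed above might look like this in practice. The function `gc_content` and the lower-case bug it once had are hypothetical, invented only to show a typical test, a boundary-condition test, and a regression test side by side. The test functions use plain `assert`, so a runner such as pytest could collect them, or they can be called directly.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence."""
    sequence = sequence.upper()  # fix for the (hypothetical) lower-case bug
    if not sequence:
        return 0.0  # boundary condition: avoid dividing by zero
    gc_count = sum(base in "GC" for base in sequence)
    return gc_count / len(sequence)


def test_typical_sequence():
    assert gc_content("ATGC") == 0.5


def test_empty_sequence_boundary():
    # Boundary condition: an empty input should not raise ZeroDivisionError.
    assert gc_content("") == 0.0


def test_lowercase_regression():
    # Regression test added alongside the bug fix so the bug cannot return unnoticed.
    assert gc_content("atgc") == 0.5


if __name__ == "__main__":
    for test in (test_typical_sequence, test_empty_sequence_boundary, test_lowercase_regression):
        test()
    print("all tests passed")
```

A CI workflow would run these on every push, which answers “Does this code break my package?” before the change is merged.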

Continuous Deployment

  • Automatically publishing or deploying the code once CI passes
  • Minor releases are frequent

GitHub Actions

Automation of tasks on GitHub

  • Workflows are defined in YAML files inside .github/workflows/

  • Events/Triggers

    • on: push
    • on: pull_request
    • on: schedule for cron jobs
  • Runners
    • Virtual machines hosted by GitHub that execute your workflows
    • Many different architectures and operating systems are available
  • Jobs & Steps
    • A workflow has jobs which run in parallel unless there are dependencies
    • Jobs have steps which run sequentially
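
The rule that jobs run in parallel unless one depends on another can be modelled with a dependency graph. The sketch below uses Python's standard `graphlib` module. The job names are hypothetical, and the dictionary stands in for the `needs:` key of a workflow file. Grouping jobs into "waves" is a simplification: in a real workflow, each job starts as soon as its own dependencies have finished.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: job name -> jobs it depends on (its `needs:` list).
jobs = {
    "lint": set(),
    "test": set(),
    "build": {"lint", "test"},
    "publish": {"build"},
}


def execution_waves(job_dependencies):
    """Group jobs into waves; jobs in the same wave could run in parallel."""
    sorter = TopologicalSorter(job_dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every job whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(jobs)
print(waves)  # [['lint', 'test'], ['build'], ['publish']]
```

Here `lint` and `test` have no dependencies and can run side by side, while `build` waits for both and `publish` waits for `build`.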

Common Maintenance Workflows

  • Matrix Builds

    • Running the exact same test suite on Ubuntu, Windows, and macOS simultaneously using a matrix strategy to catch OS-specific bugs
  • Compilation of code when changes are pushed (e.g. for CD)

  • Dependabot

    • GitHub’s built-in dependency scanner, which automatically opens PRs when your dependencies have known security vulnerabilities or are out of date
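
A matrix strategy, as mentioned under Matrix Builds above, expands into one job per combination of its values. The Python sketch below only illustrates that expansion; it is not how GitHub implements it. The `python-version` axis and its values are assumptions added for the example, and `expand_matrix` is a hypothetical helper.

```python
from itertools import product

# Hypothetical matrix: three operating systems and two Python versions.
matrix = {
    "os": ["ubuntu-latest", "windows-latest", "macos-latest"],
    "python-version": ["3.10", "3.12"],
}


def expand_matrix(matrix):
    """Return one dict per combination of matrix values (one job each)."""
    keys = list(matrix)
    return [dict(zip(keys, values)) for values in product(*matrix.values())]


combinations = expand_matrix(matrix)
print(len(combinations))  # 6
for combination in combinations:
    print(combination)
```

Each added axis multiplies the number of jobs, so a matrix is best kept to the combinations users actually run.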

Docker

Dependencies can be a pain to manage, especially for less technical collaborators

  • Global system dependencies
  • Mismatched Python/Node versions
  • OS differences

Per-project environments (like Python’s venv or npm’s local node_modules) help, but they don’t capture OS-level dependencies (like C++ compilers or database drivers)

Containers

  • Before shipping containers, loading a cargo ship took days of packing weirdly shaped items
  • Standard containers mean standard cranes and standard ships
  • Docker is standard packaging for code

Containers vs VMs

  • VMs virtualize the hardware and run a complete guest OS (heavy)
  • Containers share the host OS kernel and only isolate the app and its libraries (lightweight and fast)

Docker Basics

  • Images vs. Containers
    • An Image is the recipe/blueprint
    • A Container is the running instance of that recipe

Sample Dockerfile

# Start from a minimal Linux image pre-loaded with Python 3.10
FROM python:3.10-slim

# Install samtools (an OS-level dependency)
RUN apt-get update && \
    apt-get install -y samtools

# Create a folder called /app inside the image and make it the working directory
WORKDIR /app

# Copy your Python dependency definitions into the image
COPY requirements.txt .

# Install Python packages (e.g. biopython, pandas) during the build process
RUN pip install -r requirements.txt

# Copy your actual analysis scripts into the image
COPY . .

# The default command that executes when the container starts
CMD ["python", "analyze_sequences.py"]

.dockerignore

  • Exclude node_modules, .git, data and environment variable files (.env)
  • Helps keep images small
  • Helps avoid security issues

Benefits for maintainers

Example: Reviewers can pull a PR, type docker compose up, and test a complex app with a database instantly, without installing a database or other dependencies on their local machine