46 Continuous Deployment: Python

46.1 Using GitHub Actions to perform CD for your Python package

We will be building off what we learned last class about continuous integration with GitHub Actions for Python. What do we need to change to make continuous deployment work for our package?

set up correct permissions for the cd job steps
add a conditional for when to run the job (only if ci passes, and only if this is a commit to the main branch)
ensure the runner uses only one machine - ubuntu
bump the version
build our package
publish to TestPyPI and PyPI
create a release on GitHub that corresponds to that version

46.2 Storing and using GitHub Actions credentials safely via GitHub Secrets

Some of the tasks we want to do in our workflows require authentication. However, the whole point of this is to automate this process - so how can we do that without sharing our authentication tokens, usernames or passwords in our workflow files?

GitHub Secrets is the solution to this!

GitHub Secrets are encrypted environment variables that are used only with GitHub Actions, and specified on a repository-by-repository basis. They can be accessed in a workflow file via: ${{ secrets.SECRET_NAME }}

See GitHub’s help docs for how to do this: https://help.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets
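As a minimal sketch of how a script run by a workflow step might consume such a secret: this assumes the step maps the secret into an environment variable itself (for example with `env: TEST_PYPI_API_TOKEN: ${{ secrets.TEST_PYPI_API_TOKEN }}`). The helper name `read_secret` is hypothetical and is not part of any GitHub tooling.

```python
import os


def read_secret(name: str) -> str:
    """Return the value of a secret exposed as an environment variable.

    Assumption: the workflow step passed the secret in via `env:`;
    secrets are not handed to scripts unless the workflow does so.
    """
    value = os.environ.get(name)
    if not value:
        # Fail loudly, but never print the secret itself.
        raise RuntimeError(f"Secret {name!r} is not set for this step")
    return value


if __name__ == "__main__":
    # Illustrative only: simulate what the runner would provide.
    os.environ.setdefault("TEST_PYPI_API_TOKEN", "example-not-a-real-token")
    token = read_secret("TEST_PYPI_API_TOKEN")
    print(f"Token loaded ({len(token)} characters)")
```

Reporting only the length of the token, rather than the token, keeps it out of the workflow logs.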

Exercise

Add a secret to your pycounts* GitHub repository

Let’s learn how to add secrets to a GitHub repository. We’ll do this by adding our TEST_PYPI_API_TOKEN and PYPI_API_TOKEN as secrets to our pycounts* GitHub repository so that we can automate the publishing of our package.

Steps:

In your repository’s “Settings”, click on “Secrets and variables” > “Actions”
Under “Repository secrets” click “New repository secret”
Add TEST_PYPI_API_TOKEN as the secret name, and paste your token (which you previously saved from TestPyPI to a password manager) as the value.
Repeat the steps above to add PYPI_API_TOKEN as a secret

46.3 Authenticating with the `GITHUB_TOKEN`

What if you need to do Git/GitHub things in your workflow, like checking out your files to run the tests, creating a release, or opening an issue? To help with this, GitHub automatically creates a secret named GITHUB_TOKEN (i.e., you do not need to create this secret yourself) that you can access and use in your workflow. You access this token in your workflow file via:

${{ secrets.GITHUB_TOKEN }}
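To give a feel for what a token like this is used for, here is a rough sketch that builds (but does not send) a request to GitHub's REST endpoint for creating a release. In this chapter's workflow an action does this work for us, so the function `build_release_request`, the example repository name, and the example tag are all hypothetical illustrations rather than part of ci-cd.yml.

```python
import json
import urllib.request


def build_release_request(repository: str, tag: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request that would create a GitHub release.

    repository is "OWNER/REPO"; tag is the Git tag to release, e.g. "v0.2.0".
    """
    payload = json.dumps({"tag_name": tag, "name": tag}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{repository}/releases",
        data=payload,
        method="POST",
        headers={
            # The token authenticates the request on behalf of the workflow.
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; in a workflow the token would come from the environment.
    request = build_release_request("USERNAME/REPOSITORY", "v0.2.0", "example-token")
    print(request.get_method(), request.full_url)
```

Such a request can only succeed if the token is allowed to write to the repository contents, which is why job permissions matter.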

46.4 Setting permissions in the workflow file

In addition to the tokens above, the cd job needs to have write permissions to do three things:

Token authentication
Edit the version number in pyproject.toml
Create a new software release on GitHub

To do this we will use the permissions keyword in ci-cd.yml at the beginning of the cd job, setting both id-token and contents to write:

cd:
    permissions:
      id-token: write
      contents: write

46.5 Conditionals for when to run the job

We only want our cd job to run if certain conditions are true:

if the ci job passes
if this is a commit to the main branch

We can accomplish this in our cd job by writing a conditional using the needs and if keywords at the top of the job, right after we set the permissions:

cd:
    permissions:
      id-token: write
      contents: write

    # Only run this job if the "ci" job passes
    needs: ci

    # Only run this job if new work is pushed to "main"
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
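The combined effect of needs and if can be sketched as a small Python predicate. This is only a model of the logic for illustration, not how GitHub evaluates workflows; the function name `should_run_cd` and its arguments are hypothetical stand-ins for the ci job result and the github.event_name and github.ref values.

```python
def should_run_cd(ci_result: str, event_name: str, ref: str) -> bool:
    """Mimic the gate formed by `needs: ci` plus the `if:` expression."""
    # `needs: ci` means the cd job is skipped unless the ci job succeeded.
    ci_passed = ci_result == "success"
    # The `if:` expression requires a push event on the main branch.
    is_push_to_main = event_name == "push" and ref == "refs/heads/main"
    return ci_passed and is_push_to_main


if __name__ == "__main__":
    print(should_run_cd("success", "push", "refs/heads/main"))          # True
    print(should_run_cd("success", "pull_request", "refs/pull/7/merge"))  # False
    print(should_run_cd("failure", "push", "refs/heads/main"))          # False
```

Both conditions must hold, so a passing pull request build or a failing push to main will not trigger a deployment.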

Exercise

Read the cd job of ci-cd.yml

To make sure we understand what is happening in our workflow that performs CD, let’s convert each step to a human-readable explanation:

Sets up Python on the runner
Checks out our repository files from GitHub and puts them on the runner
…
…
…
…
…

Note: I filled in the steps we went over last class, so you can just fill in the new stuff

46.5.1 How can we automate version bumping?

Let’s look at the first step that works towards accomplishing this:

      - name: Use Python Semantic Release to prepare release
        id: release
        uses: python-semantic-release/python-semantic-release@v8.3.0
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}

Python Semantic Release is a Python tool that parses commit messages, looking for keywords that indicate how to bump the version. It then bumps the version in the pyproject.toml file.

To understand how it works so that we can use it, we need to understand semantic versioning and how to write conventional commit messages.

Let’s unpack each of these on its own.

46.6 Semantic versioning

When we make changes and publish new versions of our packages, we should tag these with a version number so that we and others can view and use older versions of the package if needed.
These version numbers should also communicate something about how the underlying code has changed from one version to the next.
Semantic versioning is a “code” agreed upon by developers that gives meaning to version number changes, so developers and users can make meaningful predictions about how code changes between versions from looking solely at the version numbers.
Semantic versioning assumes version 1.0.0 defines the API, and the changes going forward use that as a starting reference.

46.7 Semantic versioning

Given a version number MAJOR.MINOR.PATCH (e.g., 2.3.1), increment the:

MAJOR version when you make incompatible API changes (often called breaking changes 💥)
MINOR version when you add functionality in a backwards compatible manner ✨↩︎️
PATCH version when you make backwards compatible bug fixes 🐞

Source: https://semver.org/
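The three rules above can be sketched in a few lines of Python. This is a minimal illustration that assumes plain MAJOR.MINOR.PATCH strings (no pre-release or build suffixes); the function name `bump_version` is hypothetical.

```python
def bump_version(version: str, part: str) -> str:
    """Return the next version after bumping the given part.

    Bumping a part resets every part to its right to zero.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown part: {part!r}")


if __name__ == "__main__":
    print(bump_version("2.3.1", "major"))  # 3.0.0
    print(bump_version("2.3.1", "minor"))  # 2.4.0
    print(bump_version("2.3.1", "patch"))  # 2.3.2
```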

46.7.1 Semantic versioning case study

Case 1: In June 2009, Python bumped versions from 3.0.1. Some changes in the new release included:

Addition of an ordered dictionary type
A pure Python reference implementation of the import statement
New syntax for nested with statements

Case 2: In Dec 2017, Python bumped versions from 3.6.3. Some changes in the new release included:

Fixed several issues in printing tracebacks (PyTraceBack_Print()).
Fix the interactive interpreter looping endlessly when no memory.
Fixed an assertion failure in Python parser in case of a bad unicodedata.normalize()

Case 3: In Dec 2008, Python bumped versions from 2.7.17. Some changes in the new release included:

print became a function
integer division resulted in creation of a float, instead of an integer
Some well-known APIs no longer returned lists (e.g., dict.keys, dict.values, map)

Exercise

Name that semantic version release

Reading the three cases posted above, think about whether each should be a major, minor or patch version bump. Answer in the chat when prompted.

46.8 Conventional commit messages

Python Semantic Release by default uses a parser that works on the conventional (or Angular) commit message style, which is:

<type>(optional scope): succinct description of the change

(optional body: the motivation for the change and contrast this with previous behavior)

(optional footer: note BREAKING CHANGES here, as well as any issues to be closed)

How to affect semantic versioning with conventional commit messages:

a commit with the type fix leads to a patch version bump
a commit with the type feat leads to a minor version bump
a commit with a body or footer that starts with BREAKING CHANGE: leads to a major version bump (these commits can be of any type)

Note: currently Python Semantic Release is broken for detecting these on commits with Windows line endings, which the GitHub pen tool commits also use. The workaround is to use ! after feat (for example, feat!: This describes the new feature and breaking changes), in addition to BREAKING CHANGE: ... in the footer.

Note

Commit types other than fix and feat are allowed. Recommended ones include docs, style, refactor, test, ci and others. However, only fix and feat result in version bumps using Python Semantic Release.
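To make the mapping from commit message to version bump concrete, here is a simplified sketch in Python. It is not Python Semantic Release's actual parser; it only models the rules described in this section (fix, feat, the ! marker, and a BREAKING CHANGE: line), and the names `HEADER` and `bump_for_commit` are hypothetical.

```python
import re
from typing import Optional

# Header shape: <type>(optional scope)<optional !>: description
HEADER = re.compile(
    r"^(?P<type>\w+)(?:\((?P<scope>[^)]*)\))?(?P<bang>!)?: (?P<description>.+)$"
)


def bump_for_commit(message: str) -> Optional[str]:
    """Return "major", "minor", "patch", or None for a commit message."""
    lines = message.splitlines()  # handles Unix and Windows line endings
    match = HEADER.match(lines[0]) if lines else None
    if match is None:
        return None  # not a conventional commit, so no bump
    breaking = any(line.startswith("BREAKING CHANGE:") for line in lines[1:])
    if match["bang"] or breaking:
        return "major"
    if match["type"] == "feat":
        return "minor"
    if match["type"] == "fix":
        return "patch"
    return None  # e.g. docs, style, refactor, test, ci


if __name__ == "__main__":
    print(bump_for_commit("fix: handle empty input"))  # patch
    print(bump_for_commit("docs: fix typo in README"))  # None
```

A message that does not follow the header shape returns None, which matches the point that versions are not bumped when conventional commits are not used.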

46.8.1 An example of a conventional commit message

git commit -m "feat(function_x): added the ability to initialize a project even if a pyproject.toml file exists"

What kind of version bump would this result in?

46.8.2 Another example of a conventional commit message

git commit -m 'feat!: change to use of `%>%` to add new layers to ggplot objects

BREAKING CHANGE: `+` operator will no longer work for adding new layers to ggplot objects after this release'

What kind of version bump would this result in?

46.8.3 Some practical notes for usage in your packages:

You must add the following to the tool section of your pyproject.toml file for this to work (note: the pypkgs-cookiecutter adds this table if you choose to add ci-cd when you set it up):

[tool.semantic_release]
version_toml = [
    "pyproject.toml:tool.poetry.version",
]                                                    # version location
branch = "main"                                      # branch to make releases of
changelog_file = "CHANGELOG.md"                      # changelog file
build_command = "pip install poetry && poetry build" # build dists

Versions will not be bumped if conventional commits are not used.
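To make the version_toml entry in the table above more concrete, the sketch below shows how a location string such as "pyproject.toml:tool.poetry.version" can be read as a file name plus a dotted path into that file. The nested dict stands in for what a TOML parser (for example the standard library's tomllib in Python 3.11+) would return; the function name `read_version` and the example values are hypothetical, and this is not Python Semantic Release's own code.

```python
def read_version(location: str, parsed_files: dict) -> str:
    """Look up a version given a "FILE:dotted.key.path" location string."""
    filename, dotted_key = location.split(":", 1)
    node = parsed_files[filename]
    # Walk down the nested tables one key at a time.
    for key in dotted_key.split("."):
        node = node[key]
    return node


# Stand-in for the parsed contents of pyproject.toml (illustrative values).
pyproject = {
    "tool": {
        "poetry": {"name": "pycounts", "version": "0.1.0"},
        "semantic_release": {
            "version_toml": ["pyproject.toml:tool.poetry.version"],
            "branch": "main",
        },
    }
}

if __name__ == "__main__":
    for location in pyproject["tool"]["semantic_release"]["version_toml"]:
        print(location, "->", read_version(location, {"pyproject.toml": pyproject}))
```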

46.8.4 Some practical notes for usage in your packages:

Automated version bumping can only work (as currently implemented in our cookiecutter) with versions in the pyproject.toml metadata (line 3). If you add a version elsewhere, it will not get bumped unless you specify the location in the [tool.semantic_release] table in pyproject.toml.
If you have been working with main branch protection, you will need to change something to make ci-cd.yml work for continuous deployment. The reason for this is that this workflow (which bumps versions and deploys the package) is triggered to run after the pull request is merged to main. Therefore, when we bump the version in the pyproject.toml file, we need to push that change to the main branch. However, this is problematic given that we have set up main branch protection!

What are we to do about the second point, main branch protection?

Solution 1:

Remove main branch protection. This is not the most ideal solution, however it is a simple and practical one.

Possible solution 2:

(I say possible because this has yet to be formally documented by PSR, and is still just reported in an issue: https://github.com/python-semantic-release/python-semantic-release/issues/311. I have tested it and it works for me, see example here: https://github.com/ttimbers/pycounts_tt_2024/blob/main/.github/workflows/ci-cd.yml)

Create a new GitHub PAT (see these docs) with the repo scope.
In your repository’s “Settings”, click on “Secrets and variables” > “Actions”, and then under “Repository secrets” click “New repository secret” to create a new repository secret named RELEASE_TOKEN with the new GitHub PAT with repo scope as its value.
Edit the two cd job steps shown below (from ci-cd.yml) to use this token:

...
    - uses: actions/checkout@v3
      with:
        fetch-depth: 0
        token: ${{ secrets.RELEASE_TOKEN }}
...

    - name: Use Python Semantic Release to prepare release
      id: release
      uses: python-semantic-release/python-semantic-release@v8.3.0
      with:
        github_token: ${{ secrets.RELEASE_TOKEN }}
...

46.8.5 Demo of Continuous Deployment!

https://github.com/ttimbers/pycounts_tt_2024

46.9 Publishing your Python package

46.9.1 Level 1: GitHub

Packages can be installed from GitHub via pip:

pip install git+https://github.com/USERNAME/REPOSITORY.git

46.9.2 Level 2: PyPI

Packages can be installed from PyPI via:

pip install PACKAGE_NAME

should be pronounced like “pie pea eye”
also known as the Cheese Shop (a reference to the Monty Python’s Flying Circus sketch “Cheese Shop”)

Because level 2 is so easy, it is the most commonly used method.

from IPython.display import YouTubeVideo
YouTubeVideo('zB8pbUW5n1g')

Don’t get the joke? I didn’t either without historical context. When PyPI was first launched it didn’t have many Python packages on it - similar to a cheese shop with no cheese 😆

46.9.3 the Cheese Shop (er, PyPI)

PyPI (founded in 2002) stands for the “Python Package Index”
hosts Python packages of two different forms:
- sdists (source distributions)
- precompiled “wheels” (binaries)
heavily cached and distributed
currently contains > 9000 projects

46.9.4 Number of packages hosted by PyPI over history

Source: “Ecosystem-level determinants of sustained activity in open-source projects: a case study of the PyPI ecosystem” by Marat Valiev, Bogdan Vasilescu & James Herbsleb

46.9.5 What does it mean to be a PyPI package:

Ease of installation:

can be installed by users via pip install (it’s actually the default!)
universal binaries available for packages that are written solely in Python

Discoverability:

listed as a package on PyPI

HOWEVER, there is no check that your package is required to pass… As long as you can bundle it as something that PyPI recognizes as an sdist or wheel then it can go on PyPI… This allows the process to be fully automated, but QC is lower than it is for CRAN.
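As a rough illustration of the two forms mentioned above, the sketch below guesses what kind of distribution a file is from its name alone. It assumes the common conventions that sdists end in .tar.gz and that wheel file names end in PYTHONTAG-ABITAG-PLATFORMTAG.whl, with none and any as the last two tags for pure-Python wheels. The function name `describe_distribution` and the example file names are hypothetical, and this is a simplification, not how PyPI validates uploads.

```python
def describe_distribution(filename: str) -> str:
    """Guess the distribution form from a file name."""
    if filename.endswith(".tar.gz"):
        return "sdist"
    if filename.endswith(".whl"):
        parts = filename[: -len(".whl")].split("-")
        if len(parts) < 5:
            return "unrecognized"  # too few fields for a wheel name
        abi_tag, platform_tag = parts[-2:]
        # "none" + "any" means no compiled code tied to a platform.
        if abi_tag == "none" and platform_tag == "any":
            return "pure-Python wheel"
        return "platform-specific wheel"
    return "unrecognized"


if __name__ == "__main__":
    for name in ("pycounts-0.1.0.tar.gz", "pycounts-0.1.0-py3-none-any.whl"):
        print(name, "->", describe_distribution(name))
```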

46.9.6 How to submit a package to PyPI

See above, or the “How to package a Python” chapter of the Python Packages book.

46.9.7 How to submit a package to Conda

See the “Publish your Python package that is on PyPI to conda-forge” chapter of the Python packaging 101 documentation from PyOpenSci.

46.9.8 Points for discussion

Is one model better or worse?
Importance & complementarity of organizations like rOpenSci & pyOpenSci with CRAN and PyPI, respectively

46.9.9 Semantic versioning case study - answers

In 2008, Python bumped versions from 2.7.17 to 3.0.0. Some changes in the 3.0.0 release included:

print became a function
integer division resulted in creation of a float, instead of an integer
Some well-known APIs no longer returned lists (e.g., dict.keys, dict.values, map)
and many more (see here if interested)

Source

In 2009, Python bumped versions from 3.0.1 to 3.1.0. Some changes in the 3.1.0 release included:

Addition of an ordered dictionary type
A pure Python reference implementation of the import statement
New syntax for nested with statements

Source

In 2017, Python bumped versions from 3.6.3 to 3.6.4. Some changes in the 3.6.4 release included:

Fixed several issues in printing tracebacks (PyTraceBack_Print()).
Fix the interactive interpreter looping endlessly when no memory.
Fixed an assertion failure in Python parser in case of a bad unicodedata.normalize()

Source
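The answers above can be checked mechanically by finding the leftmost part of the version number that changed. This is a small sketch assuming plain MAJOR.MINOR.PATCH strings; the function name `classify_bump` is hypothetical.

```python
def classify_bump(old: str, new: str) -> str:
    """Name the kind of bump between two MAJOR.MINOR.PATCH versions."""
    old_parts = [int(piece) for piece in old.split(".")]
    new_parts = [int(piece) for piece in new.split(".")]
    # The leftmost part that differs determines the kind of bump.
    for label, before, after in zip(("major", "minor", "patch"), old_parts, new_parts):
        if after != before:
            return label
    return "none"


if __name__ == "__main__":
    print(classify_bump("2.7.17", "3.0.0"))  # major
    print(classify_bump("3.0.1", "3.1.0"))   # minor
    print(classify_bump("3.6.3", "3.6.4"))   # patch
```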
