from IPython.display import YouTubeVideo
'zB8pbUW5n1g') YouTubeVideo(
36 Continuous Deployment: Python
36.1 Using GitHub Actions to perform CD for your Python package
We will be building off what we learned last class about continuous integration with GitHub actions for Python. What we need to change to make a continuous deployment work for our package?
setup correct permissions for
cd
job stepsadd a conditional for when to run the job (only if
ci
passes, and only if this is a commit to themain
branch)ensure the runner uses only one machine - ubuntu
bump the version
build our package
publish to TestPyPI and PyPI
create a release on GitHub that corresponds to that version
36.2 Storing and use GitHub Actions credentials safely via GitHub Secrets
Some of the tasks we want to do in our workflows require authentication. However, the whole point of this is to automate this process - so how can we do that without sharing our authentication tokens, usernames or passwords in our workflow files?
GitHub Secrets is the solution to this!
GitHub Secrets are encrypted environment variables that are used only with GitHub Actions, and specified on a repository-by-repository basis. They can be accessed in a workflow file via: ${{ secrets.SECRET_NAME }}
See GitHub’s help docs for how to do this: https://help.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets
Add a secret to your pycounts*
GitHub repository
Let’s learn how to add secrets to a GitHub repository. We’ll do this by adding our TEST_PYPI_API_TOKEN
and PYPI_API_TOKEN
as a secret to our pypkgs
GitHub repository so that we can automate the publishing of our package repositories.
Steps:
Under “Secrets”, click on “Settings and variables” > “Actions”
Under “Repository secrets” click “New repository secret”
Add
TEST_PYPI_API_TOKEN
as the secret name, and paste your token (which you previously saved from TestPyPI to a password manager) as the value.Repeat the steps above to add
PYPI_API_TOKEN
as a secret
36.3 Authenticating with the GITHUB_TOKEN
What if you need to do Git/GitHub things in your workflow? Like checkout your files to run the tests? Create a release? Open an issue? To help with this GitHub automatically (i.e., you do not need to create this secret) creates a secret named GITHUB_TOKEN
that you can access and use in your workflow. You access this token in your workflow file via:
${{ secrets.GITHUB_TOKEN }}
36.4 Setting permissions in the workflow file
In addition to the tokens above, the cd
job needs to have write permissions to do three things:
- Token authentication
- Edit the version number in
pyproject.toml
- Create a new software release on GitHub
To do this we will use the permissions
keyword in ci-cd.yml
at the beginning of the cd
job, setting both id-token
and contentsto
write`:
cd:
permissions:
id-token: write
contents: write
36.5 Conditionals for when to run the job
We only want our cd
job to run if certain conditions are true, these are:
if the
ci
job passesif this is a commit to the
main
branch
We can accomplish this in our cd
job be writing a conditional using the needs
and if
keywords at the top of the job, right after we set the permissions:
cd:
permissions:
id-token: write
contents: write
# Only run this job if the "ci" job passes
needs: ci
# Only run this job if new work is pushed to "main"
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
read the cd
job of ci-cd.yml
To make sure we understand what is happening in our workflow that performs CD, let’s convert each step to a human-readable explanation:
Sets up Python on the runner
Checkout our repository files from GitHub and put them on the runner
…
…
…
…
…
Note: I filled in the steps we went over last class, so you can just fill in the new stuff
36.5.1 How can we automate version bumping?
Let’s look at the first step that works towards accomplishing this:
- name: Use Python Semantic Release to prepare release
id: release
uses: python-semantic-release/python-semantic-release@v8.3.0
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
Python semantic-release is a Python tool which parses commit messages looking for keywords to indicate how to bump the version. It bumps the version in the pyproject.toml
file.
To understand how it works so that we can use it, we need to understand semantic versioning and how to write conventional commit messages.
Let’s unpack each of these on its own.
36.6 Semantic versioning
When we make changes and publish new versions of our packages, we should tag these with a version number so that we and others can view and use older versions of the package if needed.
These version numbers should also communicate something about how the underlying code has changed from one version to the next.
Semantic versioning is an agreed upon “code” by developers that gives meaning to version number changes, so developers and users can make meaningful predictions about how code changes between versions from looking solely at the version numbers.
Semantic versioning assumes version 1.0.0 defines the API, and the changes going forward use that as a starting reference.
36.7 Semantic versioning
Given a version number MAJOR.MINOR.PATCH
(e.g., 2.3.1
), increment the:
- MAJOR version when you make incompatible API changes (often called breaking changes 💥)
- MINOR version when you add functionality in a backwards compatible manner ✨↩︎️
- PATCH version when you make backwards compatible bug fixes 🐞
Source: https://semver.org/
36.7.1 Semantic versioning case study
Case 1: In June 2009, Python bumped versions from 3.0.1, some changes in the new release included: - Addition of an ordered dictionary type - A pure Python reference implementation of the import statement - New syntax for nested with statements
Case 2: In Dec 2017, Python bumped versions from 3.6.3, some changes in the new release included:
- Fixed several issues in printing tracebacks (
PyTraceBack_Print()
). - Fix the interactive interpreter looping endlessly when no memory.
- Fixed an assertion failure in Python parser in case of a bad
unicodedata.normalize()
Case 3: In Feb 2008, Python bumped versions from 2.7.17, some changes in the new release included: - print
became a function - integer division resulted in creation of a float, instead of an integer - Some well-known APIs no longer returned lists (e.g., dict.keys
, dict.values
, map
)
Name that semantic version release
Reading the three cases posted above, think about whether each should be a major, minor or patch version bump. Answer the chat when prompted.
36.8 Conventional commit messages
Python Semantic Release by default uses a parser that works on the conventional (or Angular) commit message style, which is:
<type>(optional scope): succinct description of the change
(optional body: the motivation for the change and contrast this with previous behavior)
(optional footer: note BREAKING CHANGES here, as well as any issues to be closed)
How to affect semantic versioning with conventional commit messages: - a commit with the type fix
leads to a patch version bump - a commit with the type feat
leads to a minor version bump - a commit with a body or footer that starts with BREAKING CHANGE:
- these can be of any type (Note: currently Python Semantic release is broken for detecting these on commits with Windows line endings, wich the GitHub pen tool commits also use. The workaround fix is to use !
after feat
, for example: feat!: This describes the new feature and breaking changes
in addition to BREAKING CHANGES: ...
in the footer.)
commit types other than fix
and feat
are allowed. Recommeneded ones include docs
, style
, refactor
, test
, ci
and others. However, only fix
and feat
result in version bumps using Python Semantic Release.
36.8.1 An example of a conventional commit message
git commit -m "feat(function_x): added the ability to initialize a project even if a pyproject.toml file exists"
What kind of version bump would this result in?
36.8.2 Another example of a conventional commit message
git commit -m "feat!: change to use of `%>%` to add new layers to ggplot objects
BREAKING CHANGE: `+` operator will no longer work for adding new layers to ggplot objects after this release"
What kind of version bump would this result in?
36.8.3 Some practical notes for usage in your packages:
You must add the following to the tool section of your
pyproject.toml
file for this to work (note: thepypkgs-cookiecutter
adds this table if you choose to addci-cd
when you set it up):[tool.semantic_release] version_toml = [ "pyproject.toml:tool.poetry.version", ] # version location branch = "main" # branch to make releases of changelog_file = "CHANGELOG.md" # changelog file build_command = "pip install poetry && poetry build" # build dists
Versions will not be bumped if conventional commits are not used.
36.8.4 Some practical notes for usage in your packages:
Automated version bumping can only work (as currently implemented in our cookiecutter) with versions in the
pyproject.toml
metadata (line 3). If you add a version elsewhere, it will not get bumped unless you specify the location the the[tool.semantic_release]
table inpyproject.toml
.If you have been working with main branch protection, you will need to change something to use
ci.yml
work for continuous deployment. The reason for this, is that this workflow (which bumps versions and deploy the package) is triggered to run after the pull request is merged to main. Therefore, when we bump the versions in thepyproject.toml
file we need to push these changes to the main branch - however this is problematic given that we have set-up main branch protection!
What are we to do about #2?
Solution 1:
Remove main branch protection. This is not the most idealistic solution, however it is a simple and practical one.
Possible solution 2:
(I say possible because this has yet to be formally documented by PSR, and is still just reported in an issue: https://github.com/python-semantic-release/python-semantic-release/issues/311. I have tested it and it works for me, see example here: https://github.com/ttimbers/pycounts_tt_2024/blob/main/.github/workflows/ci-cd.yml)
Create a new GitHub PAT (see these docs) with the repo scope.
Under “Secrets”, click on “Settings and variables” > “Actions”, and then under “Repository secrets” click “New repository secret” to create a new repository secret named
RELEASE_TOKEN
and the new GitHub PAT with repo scope as its value.Edit the to
cd
job steps shown below (fromci-cd.yml
) use this token:
...
- uses: actions/checkout@v3
with:
fetch-depth: 0
token: ${{ secrets.RELEASE_TOKEN }}
...
- name: Use Python Semantic Release to prepare release
id: release
uses: python-semantic-release/python-semantic-release@v8.3.0
with:
github_token: ${{ secrets.RELEASE_TOKEN }}
...
36.8.5 Demo of Continuous Deployment!
36.9 Publishing your Python package
36.9.1 Level 1: GitHub
Packages can be installed from GitHub via pip
:
pip install git+https://github.com/USERNAME/REPOSITORY.git
36.9.2 Level 2: PyPI
Packages can be installed from PyPI
via:
pip install PACKAGE_NAME
- should be pronounced like “pie pea eye”
- also known as the Cheese Shop (a reference to the Monty Python’s Flying Circus sketch “Cheese Shop”)
Because level 2 is so easy, it is the most commonly used method.
Don’t get the joke? I didn’t either without historical context. When PyPI was first launched it didn’t have many Python packages on it - similar to a cheese shop with no cheese 😆
36.9.3 the Cheese Shop (er, PyPI)
PyPI (founded in 2002) stands for the “Python Package Index”
hosts Python packages of two different forms:
- sdists (source distributions)
- precompiled “wheels (binaries)
heavily cached and distributed
currently contains > 9000 projects
36.9.4 Number of packages hosted by PyPI over history
Source: “Ecosystem-level determinants of sustained activity in open-source projects: a case study of the PyPI ecosystem” by Marat Valiev, Bogdan Vasilescu & James Herbsleb
36.9.5 What does it mean to be a PyPI package:
Ease of installation: - can be installed by users via pip install
(it’s actually the default!) - universal binaries available for packages that are written solely in Python
Discoverability: - listed as a package on PyPI
HOWEVER, there is no required check for your package is required to pass… As long as you can bundle it as something that PyPI recognizes as an sdist or wheels then it can go on PyPI… This allows the process to be fully automated, but QC is lower than it is for CRAN.
36.9.6 How to submit a package to PyPI
See the “How to package a Python” chapter of the Python Packages book.
36.9.7 Points for discussion
Is one model better or worse?
Importance & complementarity of organizations like rOpenSci & pyOpenSci with CRAN and PyPI, respectively
36.9.8 Semantic versioning case study - answers
In 2008, Python bumped versions from 2.7.17 to 3.0.0. Some changes in the 3.0.0 release included: - print
became a function - integer division resulted in creation of a float, instead of an integer - Some well-known APIs no longer returned lists (e.g., dict.keys
, dict.values
, map
) - and many more (see here if interested)
In 2009, Python bumped versions from 3.0.1 to 3.1.0. Some changes in the 3.1.0 release included: - Addition of an ordered dictionary type - A pure Python reference implementation of the import statement - New syntax for nested with statements
In 2017, Python bumped versions from 3.6.3 to 3.6.4. Some changes in the 3.6.4 release included:
- Fixed several issues in printing tracebacks (
PyTraceBack_Print()
). - Fix the interactive interpreter looping endlessly when no memory.
- Fixed an assertion failure in Python parser in case of a bad
unicodedata.normalize()