27  Package Testing with Python pytest

We’ve seen Python testing before and used pytest to run all our function tests. But now let’s see how we can use pytest when we are working with a python package.

27.1 Simple Example Package with pytest testing

We’ll be using the pytest and cookiecutter package to set up an example python package and write tests to be run by pytest. But first, remember to create a new environment for the packages.

Also remember you need to have poetry installed as well.

# create and install packages
conda create -n countchar -c conda-forge python pytest cookiecutter

# remember to activate the environment
conda activate countchar

Next use cookiecutter to create a package template

# create the cookiecutter package example
cookiecutter https://github.com/py-pkgs/py-pkgs-cookiecutter.git
author_name [Monty Python]: Daniel Chen
package_name [mypkg]: countchar
package_short_description []: Python package using pytest example
package_version [0.1.0]:
python_version [3.9]:
Select open_source_license:
1 - MIT
2 - Apache License 2.0
3 - GNU General Public License v3.0
4 - Creative Commons Attribution 4.0
5 - BSD 3-Clause
6 - Proprietary
7 - None
Choose from 1, 2, 3, 4, 5, 6 [1]:
Select include_github_actions:
1 - no
2 - ci
3 - ci+cd
Choose from 1, 2, 3 [1]:

27.1.1 Create a package module

In your src folder, we will create a count_letters function that will take a string and return the number of characters in the string. We will be following a test-driven development workflow so we’ll create a function skeleton first, and then write the tests for the function.

Let’s create a count_char.py file in the src directory

countchar
|
β”œβ”€β”€ ...
|
β”œβ”€β”€ src
β”‚   └── countchar
|       └── count_char.py
def count_char(input_string):
    """Count the number of characters in a string.

    Parameters
    ----------
    input_string : str
        The input string whose characters will be counted.

    Returns
    -------
    int
        The number of characters in the input string.

    Examples
    --------
    >>> count_char("hello")
    5
    >>> count_char("")
    0
    >>> count_char("Python is cool")
    14
    """
    pass
Note

We are using pass in the function body so the function has something in it. You cannot define a python function with an empty function body.

27.1.2 Create tests in the test directory

Next, we will create our tests in the test directory. Remember, that pytest will look for tests in the test directory, look for files that either begin or end with test, and run all the functions that begin or end with test in that file.

countchar
|
β”œβ”€β”€ ...
|
β”œβ”€β”€ src
β”‚   └── countchar
|       └── count_char.py
└── tests
    └── test_count_char.py

We will create our test that match the examples we created.

from countchar.count_char import count_char

def test_count_char():

    string = "hello"
    expected = 5
    actual = count_char(string)
    assert actual == expected

    string = ""
    expected = 0
    actual = count_char(string)
    assert actual == expected

    string = "Python is cool"
    expected = 14
    actual = count_char(string)
    assert actual == expected

Before running our test, we will make sure that our pyproject.toml file will list pytest as a development dependency for our package.

poetry add --group dev pytest

And don’t forget to β€œinstall” our package.

poetry install

Next, we can go and run pytest

pytest tests/

Since we have not implemented our function all these tests will fail.

$ pytest
=================================================== test session starts ====================================================
platform linux -- Python 3.11.9, pytest-8.3.4, pluggy-1.5.0
rootdir: /home/dan/temp/countchar
configfile: pyproject.toml
plugins: anyio-4.4.0
collected 1 item

tests/test_countchar.py F                                                                                            [100%]

========================================================= FAILURES =========================================================
_____________________________________________________ test_count_char ______________________________________________________

    def test_count_char():

        string = "hello"
        expected = 5
        actual = count_char(string)
>       assert actual == expected
E       assert None == 5

tests/test_countchar.py:8: AssertionError
================================================= short test summary info ==================================================
FAILED tests/test_countchar.py::test_count_char - assert None == 5
==================================================== 1 failed in 0.13s =====================================================

Let’s go and implement the function.

# our count char function without the docstring to save vertical space
def count_char(input_string):
    return len(input_string)

and then run pytest again.

pytest tests/
$ pytest tests/
=================================================== test session starts ====================================================
platform linux -- Python 3.11.9, pytest-8.3.4, pluggy-1.5.0
rootdir: /home/dan/temp/countchar
configfile: pyproject.toml
plugins: anyio-4.4.0
collected 1 item

tests/test_countchar.py .                                                                                            [100%]

==================================================== 1 passed in 0.02s =====================================================

27.1.3 Change the implementation logic

The benefit of setting up unit tests, is that if we change something with the implementation, we can still be sure the functional unit behaves as we expected. For example, we can re-write the counting implementation to use a for loop, and then still expect all our tests to pass.

# count_char implementation using a for loop
def count_char(input_string):
    count = 0
    for char in input_string:
        count += 1
    return count

27.2 Test for exceptions

Now that we have the basis for our pytest unit testing working, we may want to add additional input checks into our function, and raise an error if we can can catch any issues early.

# add an input type check to our function
def count_char(input_string):
    if not isinstance(input_string, str):
        raise TypeError(f"Expected input to be str, got {type(input_string)}")
    len(input_string)

Next we can add another test for pytest that checks the exception.

import pytest

from countchar.count_char import count_char


def test_count_char_wrong_input():
    with pytest.raises(TypeError):
        count_char(123)

27.3 Reduce repeated test code

There are a few techniques you can use if you find yourself re-using code in your tests.

  • Fixtures: provide a mechanism to share setup code between individual tests. For example, if you need to create, load, or connect to the same data source over and over without wanting to copy and paste the same setup code
  • Parameterizations: when the actual test code logic is the same, but the only part that is changing are a few parameters between individual tests. This is useful to consolidate all the same testing logic together, without having to create totally separate test functions that are mostly the same
  • conftest.py: this file allows you to re-use and share the same text fixtures across multiple test modules. For example, you have different functions but need the same example data set and fixture setup.

27.3.1 Fixtures

Fixtures are useful if you know you will be reusing code that will be be used across multiple tests.

Let’s create a new file that contains multiple lines of text. We will then write tests for each line of the file.

Save the below text into the tests/text.txt file.

This is the first line of the file
the second
and 3rd

Our package should look like this:

countchar
|
β”œβ”€β”€ ...
|
β”œβ”€β”€ src
β”‚   └── countchar
|       └── count_char.py
└── tests
    └── test_count_char.py
    └── text.txt

Now let’s write a separate test function for each line of our file.

import pytest

from countchar.count_char import count_char


def test_line_1():
    file_path = "tests/text.txt"
    with open(file_path, 'r') as file:
        lines = file.readlines()
    string = lines[0].strip()
    assert count_char(string) == 34

def test_line_2():
    file_path = "tests/text.txt"
    with open(file_path, 'r') as file:
        lines = file.readlines()
    string = lines[1].strip()
    assert count_char(string) == 10

def test_line_3():
    file_path = "tests/text.txt"
    with open(file_path, 'r') as file:
        lines = file.readlines()
    string = lines[2].strip()
    assert count_char(string) == 7
Note

We need the .strip() to remove the trailing new line break

You can see here in the testing code above we are repeatedly running the same code that reads in the text file.

    file_path = "tests/text.txt"
    with open(file_path, 'r') as file:
        lines = file.readlines()

We can reduce this duplication with fixtures.

import pytest

from countchar.count_char import count_char

@pytest.fixture
def text_data():
    file_path = "tests/text.txt"
    with open(file_path, 'r') as file:
        lines = file.readlines()
    return lines


def test_line_1(text_data):
    string = text_data[0].strip()
    assert count_char(string) == 34

def test_line_2(text_data):
    string = text_data[1].strip()
    assert count_char(string) == 10

def test_line_3(text_data):
    string = text_data[2].strip()
    assert count_char(string) == 7

27.3.2 Parameterizations

Parameterizations are useful when you have to run the same test with different arguments. In our current example, when we are using our tests/text.txt file to read in the lines of the file to do our character counts, the only real difference is the actual line we are trying to read in.

We can use the pytest @pytest.mark.parametrize(argnames, argvalues) decorator to create parameterizations.

import pytest

from countchar.count_char import count_char

@pytest.fixture
def text_data():
    file_path = "tests/text.txt"
    with open(file_path, 'r') as file:
        lines = file.readlines()
    return lines


@pytest.mark.parametrize(
    "test_data_line_number, expected_count",
    [
        (0, 34),
        (1, 10),
        (2, 7)
    ]
)
def test_line_n(text_data, test_data_line_number, expected_count):
    string = text_data[test_data_line_number].strip()
    assert count_char(string) == expected_count
Note

When you are passing in multiple argnames into @pytest.mark.parametrize, they are all listed as a single comma separated string.

27.3.3 Setup: Add another package module

Let’s make another module named count_letters.py that will take any string and return the number of letters in the string without symbols or spaces. Ideally the code will work for any arbitrary language.

# code adapted from a ChatGPT answer
import unicodedata


def count_letters(input_string):
    """
    Counts the number of letters in a string, excluding spaces, numbers, and symbols.

    Parameters
    ----------
    input_string : str
        The input string to count.

    Returns
    -------
    int
        The number of "letters" in the input string.

    Examples
    --------
    # counts the string 'Hello, world! 123'
    >>> count_letters("Hello, δΈ–η•Œ! 123")
    7
    """

    # categories starting with 'L' are letters.
    # e.g., 'Lu' for uppercase letter, 'Ll' for lowercase letter
    is_letter = [
        True for char in input_string if unicodedata.category(char).startswith("L")
    ]
    return sum(is_letter)
print(count_letters("Hello, δΈ–η•Œ! 123"))
7

Our package now has 2 modules, and we’ll have 2 separate testing modules.

countchar
|
β”œβ”€β”€ ...
|
β”œβ”€β”€ src
β”‚   └── countchar
|       └── count_char.py
|       └── count_letters.py

└── tests
    └── test_count_char.py
    └── test_count_letters.py
    └── text.txt

27.3.4 conftest.py to share fixtures

Now that we have 2 separate testing modules, we may want to share the same file reading fixture across both modules. We can use conftest.py to share fixtures across different testing modules.

We will move the current text fixture from count_char.py and paste it into a conftest.py file.

countchar
|
β”œβ”€β”€ ...
|
β”œβ”€β”€ src
β”‚   └── countchar
|       └── count_char.py
|       └── count_letters.py

└── tests
    └── conftest.py
    └── test_count_char.py
    └── test_count_letters.py
    └── text.txt

The contests of the files in our tests directory are as follows:

conftest.py:

import pytest


@pytest.fixture
def text_data():
    file_path = "tests/text.txt"
    with open(file_path, "r") as file:
        lines = file.readlines()
    return lines

test_count_char.py:

import pytest

from countchar.count_char import count_char


@pytest.mark.parametrize(
    "test_data_line_number, expected_count", [(0, 34), (1, 10), (2, 7)]
)
def test_line_n(text_data, test_data_line_number, expected_count):
    string = text_data[test_data_line_number].strip()
    assert count_char(string) == expected_count

test_count_letters.py:

import pytest

from countchar.count_letters import count_letters


@pytest.mark.parametrize(
    "test_data_line_number, expected_count", [(0, 27), (1, 9), (2, 5)]
)
def test_line_n(text_data, test_data_line_number, expected_count):
    string = text_data[test_data_line_number].strip()
    assert count_letters(string) == expected_count

27.4 More pytest features

See the py-pkgs testing chatper to learn more about testing andpytest features (e.g., regression testing, and parameterization)