Ultimate Python Tox Guide with Practical Examples Using MyPy and PyTest

Why do you need TOX for your Python projects?

By the end of this tutorial you will know how to test your code against multiple versions of Python, run static type checks, and write some simple unit tests. This is all done with the Python tox framework.

This tutorial will teach you the following.

  • What is tox and why you should use it?
  • How you can add unit tests with pytest.
  • What is mypy and how you can use it?

Step 1: What is tox?

“It works on my machine!”

You just wrote this awesome program and a friend is trying it on her machine. Unfortunately, it doesn’t work.

This is a well-known pain point. You have no idea why it fails, and you end up saying: it works on my machine!

As you already know, there might be many reasons – different Python versions, and different library versions can play a factor.

Your goal is to deploy your code to Docker containers, but before that, you need to learn some good tools that help you get there easily and avoid unexpected pain points when you deploy.

This is where the Python tox framework can help you.

“tox aims to automate and standardize testing in Python. It is part of a larger vision of easing the packaging, testing, and release process of Python software.”

Maybe you understand that – I don’t.

What tox does explained in human understandable language

What it does is – simply explained – it creates new virtual environments and tests the code.

Why do we need to do that?

  • Say, you write the code and add the requirements.txt file in the GitHub repository.
  • Later someone else clones the code and installs the requirements.txt but it does not work.

There can be many reasons for that.

  • Some environment variables might be missing.
  • A setup script might be needed.
  • Libraries missing in requirements.txt.
  • Different Python versions can also cause the issue.
  • You might be using a different OS.

Well, tox can help you with all of that and more. Because the Python tox framework makes it easy to:

  • Test multiple Python versions.
  • Test different dependency versions.
  • Run setup commands.
  • Isolate environment variables – by default tox passes only a small set of environment variables from your shell into the test environments.
  • Test against Windows, macOS, and Linux.

You see – this can highly improve the chances that your program works in different setups and help you understand what it takes to run it.

How does tox work?

I think the best way to think of tox is as follows.

  • It will generate a series of virtual environments.
  • Install the dependencies for each environment (defined in a config).
  • Run setup commands and test commands.
  • Return the results from each run.
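
To make these steps concrete, here is a minimal sketch of the plan such a tool follows for one environment. It only builds the list of steps as strings and runs nothing. The function name plan_environment, the .tox/<name> folder layout with a bin folder, and the use of python -m venv are simplifying assumptions for illustration; tox's real internals differ.

```python
from typing import List


def plan_environment(name: str, deps: List[str], commands: List[str]) -> List[str]:
    """Return the ordered steps a tox-like runner performs for one environment.

    Hypothetical helper: it only describes the steps as strings, it runs nothing.
    """
    env_dir = f'.tox/{name}'
    # 1. Create an isolated virtual environment.
    steps = [f'python -m venv {env_dir}']
    # 2. Install each dependency into that environment.
    steps += [f'{env_dir}/bin/pip install {dep}' for dep in deps]
    # 3. Run the test commands with the environment's own executables.
    steps += [f'{env_dir}/bin/{command}' for command in commands]
    return steps


for step in plan_environment('py310-pytest', ['pytest'], ['pytest']):
    print(step)
```

The point of the sketch is the order: environment first, dependencies second, commands last, repeated once per environment.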

Step 2: How the Python tox framework works

The best way to learn something new is to see how it works in some real example. To do that let’s clone the following repository (here).

To get an introduction to the code (without tox), see the following tutorial. Also, learn about why you should use logging in Python and master the best practices.

The code consists of the following files.

  • requirements.txt
  • .gitignore
  • tox.ini
  • app/
  • app/
  • app/routers/
  • app/routers/
  • test/
  • test/

Most files are already described in the previous tutorial, which explains the REST API.

Here we will only focus on a few.

The files make the folders into packages (a package is a collection of modules (Python files), and the file tells Python it is a package). That is, we can use them correctly as packages in our imports. This makes things easier for us, as we can treat our project as a module we can import.

They are empty and do not contain any functionality.

Read more about them in Python docs.

test/ and test/

These files are testing files and are part of the tests we will make.

Notice, this is not a book on testing, which requires a full book by itself. But we have them here for demonstration purposes.

This makes it a package you can install.

Pip (“pip install -e .”) will use this file to install the module.

See Python docs for more details.


This is what we are looking for and the first file we will look into.

Step 3: The tox.ini configuration file

The file tox.ini has the following content.

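[tox]
envlist = py310-{pytest,mypy}

[testenv]
deps =
    -rrequirements.txt

[testenv:py310-pytest]
description = Run pytest.
deps =
    pytest
    {[testenv]deps}
commands =
    pytest

[testenv:py310-mypy]
description = Run mypy
deps =
    mypy
    {[testenv]deps}
commands =
    mypy --install-types --non-interactive {toxinidir}/app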

When we run tox (which we will), it will use the tox.ini file to figure out what to do.

The tox file structure from example

The tox.ini is made quite simple, but still a bit more complex than most examples with only one environment part. This file has 4 sections.

  • [tox] With a list of environments. Here we use the syntax py310-{pytest,mypy}, which is short for py310-pytest, py310-mypy. This tells tox to run tests in these two environments. The py310 part is saying it should be Python 3.10.
  • [testenv] This part has some general setup for the environments to be created. Here we have just some dependencies (deps), which will be installed with pip (pip install -r requirements.txt).
  • [testenv:py310-pytest] This is the first test virtual environment. It has dependencies pytest and the ones defined in testenv ({[testenv]deps}). It will run the command pytest in this virtual environment.
  • [testenv:py310-mypy] This is the second virtual environment and is quite similar. It installs mypy and runs a mypy command.
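
If you want to see the four-section structure as data, the sketch below reads a tox.ini-style text with Python's standard configparser module. This is only meant to show the INI layout: tox uses its own configuration reader, and the file content in the string is an assumption written from the description of the four sections above.

```python
import configparser

# Assumed file content, written from the description of the four sections.
TOX_INI = """
[tox]
envlist = py310-{pytest,mypy}

[testenv]
deps =
    -rrequirements.txt

[testenv:py310-pytest]
deps =
    pytest
    {[testenv]deps}
commands =
    pytest

[testenv:py310-mypy]
deps =
    mypy
    {[testenv]deps}
commands =
    mypy --install-types --non-interactive {toxinidir}/app
"""

# interpolation=None keeps the {...} placeholders untouched.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(TOX_INI)

# The section names are the environments tox knows about.
sections = parser.sections()
print(sections)

# A multi-line value becomes one string; split() gives one entry per line.
pytest_deps = parser['testenv:py310-pytest']['deps'].split()
print(pytest_deps)
```

Note that {[testenv]deps} stays a plain string here. Substituting it with the deps of [testenv] is something tox does, not configparser.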

I think tox can seem a bit more complex than it actually is. In this tutorial, we will first learn how to use it and how to make some modifications and add more test cases.

We will explore both and also which other tests could be done.

To run tox you need to install the Python tox framework first. You can install it as follows from a terminal.

pip install tox

Then you can run tox as follows from your terminal.

tox

Then it will run a bunch of things and it can take some seconds to finish.

It will create the virtual environments and install the requirements using the correct Python version. Then it runs the tests with pytest and the type checks with mypy.

It should eventually end with something similar to this.

  py310-pytest: commands succeeded
  py310-mypy: commands succeeded
  congratulations :)

It should succeed.



But let’s try to break stuff and see what happens to learn how this works.

Wait a minute – did you notice?

This run created the following folders.

  • .mypy_cache A folder created by mypy
  • .tox A folder created by tox, containing the virtual environments.
  • fruit_service.egg-info A folder with the metadata of the package (module) of our Fruit Service.

You do not need to worry about the content of these folders.

Step 4: What does PyTest do in Python tox?

First of all, we will not become test masters and there are many other test frameworks. They all work in a similar manner with some differences (of course). pytest is one very commonly used, so knowing the basics will get you a long way.

Do you need to install pytest?

No, tox installs it for you in the environment where it runs pytest.

[testenv:py310-pytest]
description = Run pytest.
deps =
    pytest
    {[testenv]deps}
commands = pytest

You see, it has a dependency on pytest.

If you want to you can install it in your environment as follows (but this is not needed):

pip install pytest

To run the tests, tox simply executes the pytest command (commands) inside the environment. This is nearly the same as running python -m pytest yourself.

To summarize.

  • pytest runs the test files in the folder test. By default it collects files named test_*.py or *_test.py.
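
By default, pytest collects files whose names match test_*.py or *_test.py. The sketch below mimics that name check with the standard fnmatch module. The helper looks_like_test_file and the example file names are hypothetical, and pytest's real collection has more rules (configuration options, directories, and so on).

```python
from fnmatch import fnmatch

# pytest's default file name patterns (the python_files setting).
DEFAULT_PATTERNS = ('test_*.py', '*_test.py')


def looks_like_test_file(filename: str) -> bool:
    """Hypothetical helper: True if the file name matches a default pattern."""
    return any(fnmatch(filename, pattern) for pattern in DEFAULT_PATTERNS)


for name in ['test_example.py', 'example_test.py', 'example.py', 'contest.py']:
    print(name, looks_like_test_file(name))
```

So a file is not picked up just because the word test appears somewhere in its name; the prefix or suffix has to match.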

Step 5: Explore the test files

As already mentioned – we will only learn the world of unit testing as part of the setup. We will not dive into making great tests.

The scope is not to master testing (or unit testing), which is a big subject. The purpose is to learn the frameworks you need to understand as a Python developer. Now let’s explore the first test file.

from fastapi.testclient import TestClient

from app.main import app

client = TestClient(app)

def test_get_main():
    response = client.get('/')
    assert response.status_code == 200
    assert response.json() == {'message': 'I am alive'}

You might already have guessed that we have structured the tests as follows.

  • for testing the app/ file.
  • for testing the app/routers/

The pytest framework will run the file and call all functions named test_something(), where something can be anything, as you see.

Before the tests run, the file initializes a TestClient(app). This is specific to FastAPI testing, which you can see in their official test guidelines.

client = TestClient(app)

Inside the first test (and only test function) test_get_main() it calls the default availability endpoint.

def test_get_main():
    response = client.get('/')
    assert response.status_code == 200
    assert response.json() == {'message': 'I am alive'}

This is stored in the response.

Then there are two assert statements. These are the actual tests.

The expression after the assert is a Boolean expression. You should design your tests to evaluate to True, if things happen as expected and False if not.

Hence, if all asserts are evaluated to True, then the test passes. If one or more evaluates to False, then the tests fail.
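
Here is a minimal sketch of that pass/fail behavior without FastAPI, so it runs as a plain script. The function alive_message is a hypothetical stand-in for the endpoint's JSON response, and the check_ function is named that way on purpose so pytest would not collect it.

```python
def alive_message() -> dict:
    """Hypothetical stand-in for the endpoint's JSON response."""
    return {'message': 'I am alive'}


def test_alive_message():
    response = alive_message()
    # Each assert is a Boolean expression that must be True for the test to pass.
    assert 'message' in response
    assert response == {'message': 'I am alive'}


def check_wrong_message():
    # This expression is False, so the assert raises AssertionError.
    assert alive_message() == {'message': 'something else'}


test_alive_message()  # all asserts are True, nothing happens

try:
    check_wrong_message()
except AssertionError:
    print('check_wrong_message failed')
```

A test framework such as pytest does the try/except part for you and reports which assert failed.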

Here is the endpoint it is testing:

# Snapshot from app/

@app.get('/', status_code=HTTPStatus.OK)
async def root() -> Dict[str, str]:
    """
    Endpoint for basic connectivity test.
    """
    logger.info('root called')
    return {'message': 'I am alive'}

Let’s change the message to be ‘I still alive’ and re-run the test.

How to run only pytest with tox

To run only the pytest part of tox, type the following.

tox -e py310-pytest

The status should be.

1 failed, 3 passed in 0.40s 

We broke one test.

Actually, if we look a bit more at the output we see what happened.

>       assert response.json() == {'message': 'I am alive'}
E       AssertionError: assert {'message': 'I still alive'} == {'message': 'I am alive'}
E         Differing items:
E         {'message': 'I still alive'} != {'message': 'I am alive'}
E         Use -v to get more diff

This is amazing. It says which assert fails.

And where the assert is.

test/ AssertionError

The purpose of this test is to ensure it returns the correct code (HTTP status code OK: 200) and the expected message – which is formatted in JSON.

You can see a list of HTTP status codes on Wikipedia.

As already mentioned, the purpose of this endpoint is to check if the service is running. The actual message is not important and could be different.

Why test the response message?

Often you will have a monitoring system to check if all the services are running. This service can use the message to check if it is alive – if there is no response, the service is down and we need to get it up and running. A simple test like this makes sense to have.

Said differently, the test ensures we do not publish breaking changes to the ecosystem our service lives in.
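
As a sketch of why the exact message matters, imagine the monitoring side doing a check like the one below. The function service_is_alive is hypothetical and not part of the Fruit Service; it only illustrates a consumer that depends on both the status code and the message.

```python
from typing import Dict


def service_is_alive(status_code: int, body: Dict[str, str]) -> bool:
    """Hypothetical monitoring check for the availability endpoint."""
    return status_code == 200 and body.get('message') == 'I am alive'


# The expected response is accepted.
print(service_is_alive(200, {'message': 'I am alive'}))
# A changed message is treated as "service down" by this monitor.
print(service_is_alive(200, {'message': 'I still alive'}))
```

If a monitor like this exists, changing the message without a failing test would silently trigger false alarms.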

Let’s change it back.

Step 6: Let’s explore the test file

The test file test/ is a bit more involved.

import pytest
from fastapi.testclient import TestClient

from app.main import app

client = TestClient(app)

@pytest.mark.parametrize('test_order', ['banana', 'apple', 'pear'])
def test_post_order(test_order):
    response =
        params={'order': test_order}
    assert response.status_code == 200
    assert response.json() == {'order': test_order}

We see that the test function (test_post_order(test_order)) takes an argument. The argument values are supplied by the decorator directly above the function:

@pytest.mark.parametrize('test_order', ['banana', 'apple', 'pear'])

What is a decorator?

A decorator is a function that takes another function and extends the behavior of the latter function without explicitly modifying it.
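
A small standalone example may help. The decorator below, count_calls, is a hypothetical example and has nothing to do with pytest: it wraps a function and counts how often it is called, without touching the function's own code.

```python
import functools


def count_calls(func):
    """Decorator that counts how often the wrapped function is called."""
    @functools.wraps(func)  # keep the original name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def echo_order(order: str) -> dict:
    return {'order': order}


echo_order('banana')
echo_order('apple')
print(echo_order.calls)
```

The function echo_order itself knows nothing about counting; the extra behavior comes entirely from the line starting with @.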

Here this decorator calls test_post_order with the argument test_order 3 times:

  • test_post_order(test_order='banana')
  • test_post_order(test_order='apple')
  • test_post_order(test_order='pear')

This reduces the code you need to write. Without this decorator, you would need to make three test functions, one for each order (as you could not take the argument).

If it is confusing, just think that it calls your test function 3 times.

Could you make more calls?

Yes, just extend the list.
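
As a mental model only, the sketch below shows a hand-written decorator that calls a test once per list entry. The name run_for_each is hypothetical, and this is not how pytest implements parametrize (pytest creates separate test items when it collects the tests), but the effect of extending the list is the same: one more call.

```python
def run_for_each(values):
    """Simplified, hypothetical stand-in for pytest.mark.parametrize."""
    def decorator(test_func):
        def run_all() -> int:
            for value in values:
                test_func(value)  # one call per list entry
            return len(values)
        return run_all
    return decorator


# Four entries in the list, so the test body runs four times.
@run_for_each(['banana', 'apple', 'pear', 'kiwi'])
def test_echo_order(test_order):
    response = {'order': test_order}  # stand-in for the API response
    assert response == {'order': test_order}


print(test_echo_order())
```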

Remember the output of the tox pytest run?

It writes 4 tests passed. 

And if we look closely.

test/ .                                                       [ 25%]
test/ ...                                                    [100%]

There is one dot (.) on the first line, for the test file that contains only one test.

There are three dots (.) on the second line, for the test file with the 3 parametrized tests.

Also, it says 25% (1 out of 4 tests is 25%) after the first file and 100% (4 out of 4 tests is 100%) after the second.

If you want to learn more about unit testing with pytest, there is a 400+ page PDF documentation guide on their official page: pytest Documentation.

Their official page also has quick guides and how-to guides.

For testing FastAPI, please see their official testing guide: FastAPI testing.

Step 7: What does mypy do in Python tox?

The official documentation states (reference):

“Mypy is a static type checker for Python 3 and Python 2.7”

Why would you use it (source)?

  • Static typing can make programs easier to understand and maintain. Type declarations can serve as machine-checked documentation. This is important as code is typically read much more often than modified, and this is especially important for large and complex programs.
  • Static typing can help you find bugs earlier and with less testing and debugging. Especially in large and complex projects, this can be a major time-saver.
  • Static typing can help you find difficult-to-find bugs before your code goes into production. This can improve reliability and reduce the number of security issues.
  • Static typing makes it practical to build very useful development tools that can improve programming productivity or software quality, including IDEs with precise and reliable code completion, static analysis tools, etc.
  • You can get the benefits of both dynamic and static typing in a single language. Dynamic typing can be perfect for a small project or for writing the UI of your program, for example. As your program grows, you can adapt tricky application logic to static typing to help the maintenance.

Enough talking – what does it look like?

Adding mypy to the main file

Look in app/ (snapshot below).

async def root() -> Dict[str, str]:
    """
    Endpoint for basic connectivity test.
    """
    logger.info('root called')
    return {'message': 'I am alive'}

It is the -> Dict[str, str] part.

What it tells the type checker is that the function root() should return a dictionary with string-to-string key-value pairs.

Why is that important?

Because now you can make static type checks.

Let’s make a simple example.

Assume someone is calling this function from somewhere else. This programmer knows that the function returns a dictionary with key values of type string-string.

Therefore, he feels safe to assume that.

Now you are told to make some changes in the function and you end up with the following.

async def root() -> Dict[str, str]:
    """
    Endpoint for basic connectivity test.
    """
    if random.uniform(0, 1) < 0.05:
        return None
    logger.info('root called')
    return {'message': 'I am alive'}

(notice you need to import random at the top).

Now the function might return None, which is not the type expected.

This will have consequences for the code of your fellow programmer. His code will fail whenever your function returns None.

That is one of the pain points with dynamic typing.

Luckily mypy will catch that (run tox -e py310-mypy in terminal).

It says.

app/ error: Incompatible return value type (got "None", expected "Dict[str, str]")

It got “None” and expected “Dict[str, str]”.
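
If returning None were intentional, one way to make that explicit is to say so in the return type with Optional. The sketch below uses hypothetical names (root_status, read_message) and plain functions instead of the FastAPI endpoint. With this annotation, mypy would complain if the caller indexed the result without handling None first.

```python
from typing import Dict, Optional


def root_status(healthy: bool) -> Optional[Dict[str, str]]:
    """Hypothetical variant that openly admits it may return None."""
    if not healthy:
        return None
    return {'message': 'I am alive'}


def read_message(healthy: bool) -> str:
    status = root_status(healthy)
    # Without this check, status['message'] would fail at runtime for None,
    # and mypy would flag it before the code ever runs.
    if status is None:
        return 'no response'
    return status['message']


print(read_message(True))
print(read_message(False))
```

Either way, the annotation and the code have to agree, and that agreement is exactly what the type checker verifies.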

The order endpoint has an extra check.

async def order_call(order: str) -> Dict[str, str]:
    logger.info(f'Incoming order: {order}')
    return {'order': order}

The argument has the type str (order: str). This tells the type checker that the caller must provide the argument order as a str.

It takes a bit of practice to understand it fully. Some types, like Dict, need to be imported.

import logging
from http import HTTPStatus
from typing import Dict

from fastapi import FastAPI

from .routers import order

logging.basicConfig(encoding='utf-8', level=logging.INFO,
                    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__file__)

app = FastAPI(
    title='Your Fruit Self Service',
    description='Order your fruits here',
)


@app.get('/', status_code=HTTPStatus.OK)
async def root() -> Dict[str, str]:
    """
    Endpoint for basic connectivity test.
    """
    logger.info('root called')
    return {'message': 'I am alive'}

This was also done in

There are 250+ pages of documentation of mypy: mypy docs

My advice is.

  • Keep it simple – you will learn along the way. Start with what you understand.
  • Define types at variable declarations: a: int = 20
  • Define types of arguments to functions (see example above).
  • Define types of functions (see example above).
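
The three kinds of annotations from this list can be seen together in one small sketch. The names price_per_kg and order_total are made up for illustration and are not part of the Fruit Service.

```python
from typing import Dict, List

# A typed variable declaration.
price_per_kg: float = 2.5


# Typed arguments (fruits, weights) and a typed return value (float).
def order_total(fruits: List[str], weights: Dict[str, float]) -> float:
    return sum(weights[fruit] * price_per_kg for fruit in fruits)


total = order_total(['banana', 'apple'], {'banana': 1.0, 'apple': 2.0})
print(total)
```

Python itself ignores these annotations at runtime. They pay off when a checker such as mypy reads them.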

Step 8: Testing against multiple versions of Python with tox

Right now, our tox is set up to test against only one Python version, 3.10.

If you want, you could try many different versions. Let’s first try to add one more version. Let’s update tox.ini

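[tox]
envlist = py{39,310}-{pytest,mypy}

[testenv]
deps =
    -rrequirements.txt

[testenv:py{39,310}-pytest]
description = Run pytest.
deps =
    pytest
    {[testenv]deps}
commands =
    pytest

[testenv:py{39,310}-mypy]
description = Run mypy
deps =
    mypy
    {[testenv]deps}
commands =
    mypy --install-types --non-interactive {toxinidir}/app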

As you see, we use this notation.

envlist = py{39,310}-{pytest,mypy}

This will create a list.

py39-pytest, py39-mypy, py310-pytest, py310-mypy
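
The expansion is a plain cross product of the brace groups. The sketch below reproduces it with itertools.product. The function expand_envlist is a hypothetical helper that handles only simple {a,b} groups; tox's own parser supports more syntax than this.

```python
import itertools
import re
from typing import List


def expand_envlist(pattern: str) -> List[str]:
    """Expand brace groups such as py{39,310}-{pytest,mypy} (simplified sketch)."""
    # Split into literal text and {…} groups, keeping the groups.
    parts = re.split(r'(\{[^{}]*\})', pattern)
    choices = []
    for part in parts:
        if part.startswith('{') and part.endswith('}'):
            choices.append(part[1:-1].split(','))  # alternatives in the group
        else:
            choices.append([part])  # literal text has a single "choice"
    # Every combination of one alternative per group gives one environment.
    return [''.join(combo) for combo in itertools.product(*choices)]


print(expand_envlist('py{39,310}-{pytest,mypy}'))
```

Adding a third Python version to the first group would add two more environments, one per tool.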

Each of them also needs a test definition. We do not have to set up different things for the 2 Python versions we test, so one rule per tool covers both versions.

When you run tox in the terminal you will see it takes a longer time and you end up with 4 test environments.

Notice that you need to have Python 3.9 installed for this to succeed. If you don’t have it installed it will fail. But if you do, then the output of tox should end as follows.

______________________________ summary ______________________________
  py39-pytest: commands succeeded
  py39-mypy: commands succeeded
  py310-pytest: commands succeeded
  py310-mypy: commands succeeded
  congratulations :)

If you need to install Python 3.9, go to the official Python downloads page, download the 3.9 installer (listed under the specific releases), and run it.

Step 9: What else can you do with the Python tox framework?

There are other things that can be common to test for with tox.

Here we have a list of common frameworks. We will not go through them, but they are provided with links to the official documentation pages. Most have a decent get-started guide.

You should be able to add similar sections to your tox.ini file to create these tests if you like.

Here are some of the most common ones. Remember, it might not be necessary to add them all. I have introduced you to the two most important ones in most use cases.

  • bandit. Checks for common security issues
  • pylint. Checks for errors, enforces coding standards, and looks for code smells.
  • Flake8. Analyzes code and detects some errors.
  • pycodestyle. Checks against some of the style conventions in PEP 8.
  • pydocstyle. Checks compliance with Python docstring conventions.

Just to mention a few common ones.

Be sure to learn how to deploy your Python project to Docker or check the full career path to master web app development with Python.
