Testing is extremely important. Without testing, you cannot be sure that your code is doing what you think. Testing is an integral part of software development, and where possible should be done while you are writing code, not after the code has been written.
No doubt so far, you have been manually checking that your code does the right thing. Perhaps you are running your code over a particular input file and making sure that you get a correct-looking plot out at the end? Or maybe running it with a few known inputs and checking that you get the same output you got last time? This is a start, but how can you be sure that there's not a subtle bug that means that the output is incorrect? And if there is a problem, how will you be able to work out exactly which line of code is causing it?
In order to be confident that our code is giving a correct output, it is useful to have a test suite which provides a set of known inputs and checks that the code matches a set of known, expected outputs. To make it easier to locate where a bug is occurring, it's a good idea to make each individual test run over as small an amount of code as possible so that if that test fails, you know where to look for the problem. In Python this "small unit of code" is usually a function.
Let's get started by making sure that our add_arrays function matches the outputs we expect. As a reminder, this is what the file arrays.py looks like:
"""
This module contains functions for manipulating and combining Python lists.
"""
def add_arrays(x, y):
"""
This function adds together each element of the two passed lists.
Args:
x (list): The first list to add
y (list): The second list to add
Returns:
list: the pairwise sums of ``x`` and ``y``.
Examples:
>>> add_arrays([1, 4, 5], [4, 3, 5])
[5, 7, 10]
"""
z = []
for x_, y_ in zip(x, y):
z.append(x_ + y_)
return z
Since the name of the module we want to test is arrays, let's make a file called test_arrays.py which contains the following:
from arrays import add_arrays


def test_add_arrays():
    a = [1, 2, 3]
    b = [4, 5, 6]
    expect = [5, 7, 9]

    output = add_arrays(a, b)

    if output == expect:
        print("OK")
    else:
        print("BROKEN")

test_add_arrays()
This script defines a function called test_add_arrays which defines some known inputs (a and b) and a known, matching output (expect). It passes them to the function add_arrays and compares the output to expect. It will either print OK or BROKEN depending on whether the function is working or not. Finally, we explicitly call the test function.
When we run the script in the Terminal, we see it output OK:
python test_arrays.py
Break the test by changing either a, b or expect and rerun the test script. Make sure that it prints BROKEN in this case. Change it back to a working state once you've done this.
The method used here works and runs the code correctly but it doesn't give very useful output. If we had five test functions in our file and three of them were failing we'd see something like:
OK
BROKEN
OK
BROKEN
BROKEN
We'd then have to cross-check back to our code to see which tests the BROKENs referred to.
To be able to automatically relate the output of the failing test to the place where your test failed, you can use an assert statement.
An assert statement is followed by something which is either truthy or falsy. A falsy expression is something which, when converted to a bool, gives False. This includes empty lists, the number 0 and None; everything else is considered truthy. The full list is available in the documentation.
If it is truthy then nothing happens but if it is falsy then an exception is raised:
assert 5 == 5
assert 5 == 6
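To make the truthy/falsy rule concrete, here is a small sketch (each line stands alone; in a real script, execution would stop at the first failing assert):

assert [1, 2, 3]  # a non-empty list is truthy, so nothing happens
assert 42         # a non-zero number is truthy, so nothing happens
assert []         # an empty list is falsy, so this raises AssertionError
assert 0          # the number 0 is falsy, so this raises AssertionError
assert None       # None is falsy, so this raises AssertionError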
We can now use this assert statement in place of the if/else block:
from arrays import add_arrays


def test_add_arrays():
    a = [1, 2, 3]
    b = [4, 5, 6]
    expect = [5, 7, 9]

    output = add_arrays(a, b)

    assert output == expect

test_add_arrays()
Now when we run the test script we get nothing printed on success:
python test_arrays.py
but on a failure we get an error printed like:
Traceback (most recent call last):
  File "test_arrays.py", line 13, in <module>
    test_add_arrays()
  File "test_arrays.py", line 11, in test_add_arrays
    assert output == expect
AssertionError
This, like all exception messages, gives us the location in the file at which the error occurred. This has the advantage that if we had many test functions being run, it would tell us which one failed and on which line.
The downside of using an assert like this is that as soon as one test fails, the whole script will halt and you'll only be informed of that one test.
There are a few things that we've been doing so far that could be improved. Firstly, for every test function that we write we then have to explicitly call it at the bottom of the test script, like test_add_arrays(). This is error-prone as we might write a test function, forget to call it, and then miss any errors it would have caught.
Secondly, we want nice, useful output from our test functions: something better than the nothing/exception that a plain assert gives us. It would be nice to get a green PASSED for the good tests and a red FAILED for the bad ones, alongside the name of the test in question.
Finally, we want to make sure that all tests are run even if a test early in the process fails.
Luckily, there is a tool called pytest which can give us all of these things. It will work on our test script almost exactly as written, with only one change needed.
Remove the call to test_add_arrays() on the last line of the file:
from arrays import add_arrays


def test_add_arrays():
    a = [1, 2, 3]
    b = [4, 5, 6]
    expect = [5, 7, 9]

    output = add_arrays(a, b)

    assert output == expect
And in the Terminal, run pytest:
pytest
Pytest works in two stages: first it will try to locate all the test functions that it can find, and then it will run each of them in turn, reporting the results.
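The output will look something like the following (the exact platform, versions, paths and timings will differ on your machine):

============================= test session starts =============================
platform linux -- Python 3.x.y, pytest-7.x.y, pluggy-1.x.y
rootdir: /path/to/project
collected 1 item

test_arrays.py .                                                         [100%]

============================== 1 passed in 0.02s ==============================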
Here you can see that it's found that the file test_arrays.py
contains a single test function. The green dot next to the name of the file signifies the passing test. It then prints a summary at the end saying "1 passed".
The way that pytest works is that it looks for files which are called test_*.py or *_test.py and looks inside those for functions whose names begin with test. It will then run those functions one at a time, reporting the results of each in turn.
To see what it looks like when you have a failing test, let's deliberately break the test code by giving a wrong expected result:
from arrays import add_arrays


def test_add_arrays():
    a = [1, 2, 3]
    b = [4, 5, 6]
    expect = [5, 7, 999]  # Changed this to break the test

    output = add_arrays(a, b)

    assert output == expect
When we run this test with pytest it should tell us that the test is indeed failing:
pytest
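On a typical run the failure report will look something like this (exact formatting varies between pytest versions):

=================================== FAILURES ===================================
_______________________________ test_add_arrays _______________________________

    def test_add_arrays():
        a = [1, 2, 3]
        b = [4, 5, 6]
        expect = [5, 7, 999]  # Changed this to break the test

        output = add_arrays(a, b)

>       assert output == expect
E       assert [5, 7, 9] == [5, 7, 999]
E         At index 2 diff: 9 != 999

test_arrays.py:11: AssertionError
============================== 1 failed in 0.03s ==============================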
The output from this is better than we saw with the plain assert. It prints the full context of the contents of the test function, with the line where the assert is failing marked with a >. It then gives an expanded explanation of why the assert failed. Before we just got AssertionError, but now it prints out the contents of output and expect and tells us that at index 2 of the list it's finding a 9 where we told it to expect a 999.
Before continuing, make sure that you change the file back to its previous contents by changing that 999 back to a 9.
Make sure you can run the test as it is written here. See what happens when you make the test fail.
Having a single test for a function is already infinitely better than having none, but one test only gives you so much confidence. The real power of a test suite is being able to test your functions under lots of different conditions.
Let's add a second test to check a different set of inputs and outputs to the add_arrays function and check that it passes:
from arrays import add_arrays


def test_add_arrays1():
    a = [1, 2, 3]
    b = [4, 5, 6]
    expect = [5, 7, 9]

    output = add_arrays(a, b)

    assert output == expect


def test_add_arrays2():
    a = [-1, -5, -3]
    b = [-4, -3, 0]
    expect = [-5, -8, -3]

    output = add_arrays(a, b)

    assert output == expect
When we run pytest we can optionally pass the -v flag, which puts it in verbose mode. This will print out the tests being run, one per line, which I find a more useful view most of the time:
pytest -v
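In verbose mode, each test is listed by name with its result, something like:

test_arrays.py::test_add_arrays1 PASSED                                  [ 50%]
test_arrays.py::test_add_arrays2 PASSED                                  [100%]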
We see both tests being run and passing. This works well, but we've had to repeat ourselves almost entirely in each test function. The only difference between the two functions is the inputs and outputs under test. In a normal Python function you would usually take these things as arguments, and we can do the same thing here.
The actual logic of the function is the following:
def test_add_arrays(a, b, expect):
    output = add_arrays(a, b)

    assert output == expect
We then just need a way of passing the data we want to check into this function. Since we're not explicitly calling this function ourselves, we need a way to tell pytest that it should pass in certain arguments. For this, pytest provides a feature called parametrisation. We label our function with a decorator which allows pytest to run it multiple times with different data.
To use this feature we must import the pytest module and use the pytest.mark.parametrize decorator, like the following:
import pytest

from arrays import add_arrays


@pytest.mark.parametrize("a, b, expect", [
    ([1, 2, 3], [4, 5, 6], [5, 7, 9]),
    ([-1, -5, -3], [-4, -3, 0], [-5, -8, -3]),
])
def test_add_arrays(a, b, expect):
    output = add_arrays(a, b)

    assert output == expect
The parametrize decorator takes two arguments:
1. a string containing the names of the parameters you want to pass in (in this case "a, b, expect")
2. a list containing the sets of values for those parameters, one tuple per run of the test

In this case, the test will be run twice. Once with each of the following sets of values:
a=[1, 2, 3], b=[4, 5, 6], expect=[5, 7, 9]
a=[-1, -5, -3], b=[-4, -3, 0], expect=[-5, -8, -3]
pytest -v
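The verbose output will now look something like:

test_arrays.py::test_add_arrays[a0-b0-expect0] PASSED                    [ 50%]
test_arrays.py::test_add_arrays[a1-b1-expect1] PASSED                    [100%]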
Running pytest we see that both tests have the same name (test_arrays.py::test_add_arrays) but each parametrisation is differentiated with some square brackets.
The interface of a function is made up of the parameters it expects and the values that it returns. If a user of a function knows these things then they are able to use it correctly. This is why we make sure to include this information in the docstring for all our functions.
The other thing that is part of the interface of a function is any exceptions that are raised by it. If you need a refresher on exceptions and error handling in Python, take a look at the chapter on it in the Intermediate Python course.
To add explicit error handling to our function we need to do two things:

1. Add a check for the error condition which raises an exception with a raise statement:

   if len(x) != len(y):
       raise ValueError("Both arrays must have the same length.")

2. Document the exception by adding a Raises section to the docstring:

   Raises:
       ValueError: If the length of the lists ``x`` and ``y`` are different.
Let's add these to arrays.py:
"""
This module contains functions for manipulating and combining Python lists.
"""
def add_arrays(x, y):
"""
This function adds together each element of the two passed lists.
Args:
x (list): The first list to add
y (list): The second list to add
Returns:
list: the pairwise sums of ``x`` and ``y``.
Raises:
ValueError: If the length of the lists ``x`` and ``y`` are different.
Examples:
>>> add_arrays([1, 4, 5], [4, 3, 5])
[5, 7, 10]
"""
if len(x) != len(y):
raise ValueError("Both arrays must have the same length.")
z = []
for x_, y_ in zip(x, y):
z.append(x_ + y_)
return z
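As a quick sanity check, here is a sketch of an interactive session with the new version of the function, showing that mismatched lengths now raise the error:

>>> from arrays import add_arrays
>>> add_arrays([1, 2, 3], [4, 5])
Traceback (most recent call last):
  ...
ValueError: Both arrays must have the same length.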
We can then test that the function correctly raises the exception when passed appropriate data. Inside a pytest function we can require that a specific exception is raised by using pytest.raises in a with block. pytest.raises takes as an argument the type of an exception and, if the block ends without that exception having been raised, will fail the test.
It may seem strange that we're testing for and requiring that the function raises an error, but it's important that if we've told our users that the code will produce a certain error in specific circumstances, it does indeed do as we promise.
In our code we add a new test called test_add_arrays_error which does the check we require:
import pytest

from arrays import add_arrays


@pytest.mark.parametrize("a, b, expect", [
    ([1, 2, 3], [4, 5, 6], [5, 7, 9]),
    ([-1, -5, -3], [-4, -3, 0], [-5, -8, -3]),
])
def test_add_arrays(a, b, expect):
    output = add_arrays(a, b)

    assert output == expect


def test_add_arrays_error():
    a = [1, 2, 3]
    b = [4, 5]

    with pytest.raises(ValueError):
        output = add_arrays(a, b)
pytest -v
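All three tests should pass, with output something like:

test_arrays.py::test_add_arrays[a0-b0-expect0] PASSED                    [ 33%]
test_arrays.py::test_add_arrays[a1-b1-expect1] PASSED                    [ 66%]
test_arrays.py::test_add_arrays_error PASSED                             [100%]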
If you remember from when we were documenting our add_arrays function, we had a small section which gave the reader an example of how to use the function:
Examples:
    >>> add_arrays([1, 4, 5], [4, 3, 5])
    [5, 7, 10]
Since this is valid Python code, we can ask pytest to run it and check that the output we claimed would be returned is correct. If we pass --doctest-modules to the pytest command, it will search .py files for docstrings with example blocks and run them:
pytest -v --doctest-modules
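With the doctest collected alongside the other tests, the output will look something like:

arrays.py::arrays.add_arrays PASSED                                      [ 25%]
test_arrays.py::test_add_arrays[a0-b0-expect0] PASSED                    [ 50%]
test_arrays.py::test_add_arrays[a1-b1-expect1] PASSED                    [ 75%]
test_arrays.py::test_add_arrays_error PASSED                             [100%]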
We see here the arrays.py::arrays.add_arrays test, which has passed. If you get a warning about deprecation then ignore it; this is from a third-party module which is leaking through.
Doctests are a really valuable thing to have in your test suite as they ensure that any examples you give work as expected. It's not uncommon for the code to change and for the documentation to be left behind; being able to automatically check all your examples avoids this.
See what happens when you break your doctest and run pytest again.
As you increase the number of tests, you will come across situations where you only want to run a particular test. To do this, you pass the name of the test, as printed by pytest -v, as an argument to pytest. So, if we want to run all the tests in test_arrays.py we do:
pytest -v test_arrays.py
Or, if we want to specifically run the test_add_arrays test:
pytest -v test_arrays.py::test_add_arrays
Or, if we want to run one specific parametrisation of that test:
pytest -v "test_arrays.py::test_add_arrays[a0-b0-expect0]"
Take a look at the output of pytest -h for more options. For example, you can tell pytest to only run the tests that failed on the last run with pytest --last-failed.
When developing code I recommend starting your day by running the test suite. This gives you confidence in the code before you start playing around with it. Then after each set of changes, run the tests again to make sure that you have not broken anything.
As you add new features and functions to your code, add tests for them straight away. Doing this while the code logic is fresh in your mind will make the test writing much easier. Even better is to write the tests directly alongside the new function. This gives you a structured way of checking your function while you work on it. You're likely doing manual testing of your code as you write it anyway, so why not automate it, so that once you're finished you've improved your test suite for free!
Some people go a step further and follow a process called test-driven development, where the first thing you do is write the docstring for the function, describing what it will take as arguments and what it will return. Once you know that, you can write a test which will, in principle, pass if the function does as described. Only then do you go ahead and write the body of the function, and once the test passes, your job is done.
These techniques are part of a spectrum so find the place on it which makes you the most productive in the long-term. When I started programming I never wrote tests but as I've progressed I've found that they make writing correct code much easier and I couldn't write software without them. You will likely find a progression towards a more structured approach over time.
You don't need to worry about covering every single line of code with tests. If you have zero tests then you have effectively zero confidence that your code works. As soon as you add a single one, your confidence starts growing. The more you add, the safer you are. There's no magic number of tests.
We will not cover it in this course, but once you have a good test suite, and if your source code is hosted on a public Git website, you can make it so that the tests of your code are automatically run on every change. As long as you commit your test_*.py files to Git, then with a few lines of config you can make it run your tests for you. This is especially useful once you are collaborating with other people, as the automated tests will give you confidence that, for example, the new feature your colleague is adding has not broken the code you wrote last week.
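For illustration only, here is a minimal sketch of such a config, assuming the code is hosted on GitHub and using GitHub Actions (the file path and action versions are assumptions, not something set up in this course):

# .github/workflows/tests.yml -- assumed path; GitHub runs this automatically
name: tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4       # fetch the repository contents
      - uses: actions/setup-python@v5   # install a Python interpreter
        with:
          python-version: "3.x"
      - run: pip install pytest         # install the test runner
      - run: pytest                     # run the whole test suite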