Motivation

Objectives

Appreciate the importance of testing software
Understand various benefits of testing

Untested software can be compared to uncalibrated detectors

“Before relying on a new experimental device, an experimental scientist always establishes its accuracy. A new detector is calibrated when the scientist observes its responses to known input signals. The results of this calibration are compared against the expected response.”

[From Testing and Continuous Integration with Python, created by K. Huff]

With testing, simulations and analysis using software can be held to the same standards as experimental measurement devices!

What can go wrong when research software has bugs? Look no further:

Testing in a nutshell

In software tests, expected results are compared with observed results in order to establish accuracy. Why do the tests below compare within a small tolerance instead of comparing all digits directly with the expected result?

def fahrenheit_to_celsius(temp_f):
    """Converts temperature in Fahrenheit
    to Celsius.
    """
    temp_c = (temp_f - 32.0) * (5.0/9.0)
    return temp_c

# This is the test function: `assert` raises an error if something
# is wrong.
def test_fahrenheit_to_celsius():
    temp_c = fahrenheit_to_celsius(temp_f=100.0)
    expected_result = 37.777777
    assert abs(temp_c - expected_result) < 1.0e-6
    

#include <cmath>  // std::abs
#include <cstdlib>
#include <iostream>

using namespace std;

/* Converts temperature in Fahrenheit to Celsius. */
double fahrenheit_to_celsius(double temp_f) {
  auto temp_c = (temp_f - 32.0) * (5.0 / 9.0);
  return temp_c;
}

/* This is the test function: `throw` raises an error if something is wrong. */
void test_fahrenheit_to_celsius() {
  auto temp_c = fahrenheit_to_celsius(100.0);
  auto expected_result = 37.777777;
  try {
    if (abs(temp_c - expected_result) > 1.0e-6) throw "Error";
  } catch (char const* err) {
    cout << err << endl;
  }
}

int main() {
  cout << fahrenheit_to_celsius(20) << endl;
  test_fahrenheit_to_celsius();
  return EXIT_SUCCESS;
}

# Converts temperature in Fahrenheit to Celsius.
fahrenheit_to_celsius <- function(temp_f)
{
  temp_c <- (temp_f - 32.0) * (5.0/9.0)
  temp_c
}

# This is the test function: `stopifnot` raises an error if something
# is wrong.
test_fahrenheit_to_celsius <- function()
{
  temp_c <- fahrenheit_to_celsius(temp_f = 100.0)
  expected_result <- 37.777777
  stopifnot(abs(temp_c - expected_result) < 1.0e-6)
}

using Test

"""
    fahrenheit_to_celsius(temp_f)

Converts temperature in Fahrenheit to Celsius.
"""
function fahrenheit_to_celsius(temp_f)
    temp_c = (temp_f - 32.0) * (5.0/9.0)
    return temp_c
end


# This is the test section
@testset "Test fahrenheit_to_celsius" begin
    temp_c = fahrenheit_to_celsius(100.0)
    expected_result = 37.777777
    @test abs(temp_c - expected_result) < 1.0e-6
end

program temperature_conversion

implicit none
call test_fahrenheit_to_celsius()

contains

function fahrenheit_to_celsius(temp_f) result(temp_c)
   implicit none
   real temp_f
   real temp_c
   temp_c = (temp_f - 32.0) * (5.0/9.0)
end function fahrenheit_to_celsius

subroutine test_fahrenheit_to_celsius()
   implicit none
   real temp_c
   real expected_result
   temp_c = fahrenheit_to_celsius(100.0)
   expected_result = 37.777777
   if( abs(temp_c - expected_result) > 1.0e-6) then
      write(*,*) 'Error'
   else
      write(*,*) 'Pass'
   end if
end subroutine test_fahrenheit_to_celsius

end program temperature_conversion
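
One reason the tests above compare within a tolerance rather than digit by digit is that a floating-point result rarely matches a hand-typed decimal exactly. The following Python sketch illustrates this with the same conversion function. The tolerance of 1.0e-6 is taken from the examples above, and `math.isclose` is only one possible way to express the comparison.

```python
import math


def fahrenheit_to_celsius(temp_f):
    """Converts temperature in Fahrenheit to Celsius."""
    return (temp_f - 32.0) * (5.0 / 9.0)


temp_c = fahrenheit_to_celsius(100.0)
expected_result = 37.777777

# The computed value carries more digits than the hand-typed reference,
# so an exact comparison fails even though the function is correct.
exact_match = temp_c == expected_result

# Comparing within an absolute tolerance accepts the small difference.
close_enough = math.isclose(temp_c, expected_result, abs_tol=1.0e-6)

print(exact_match, close_enough)
```

The tolerance should reflect how many digits of the reference value are actually trusted: too tight and correct code fails, too loose and real errors slip through.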

Or you can test whole programs:

$ python3 run-test.py

running: sample_data/set1.csv --output=tests/set1.txt
CORRECT
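
The contents of `run-test.py` are not shown here, so the following is only a minimal sketch of what such a whole-program check could look like. The helper name `check_program_output` and the one-line stand-in program are assumptions made for illustration; a real test would run your own program on sample data and compare against a stored reference result.

```python
import subprocess
import sys


def check_program_output(command, expected_stdout):
    """Run `command` and report whether it exits cleanly with the expected output.

    Hypothetical helper, not part of any library.
    """
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode == 0 and completed.stdout == expected_stdout


# Stand-in for a real program: a one-line Python script that prints a sum.
command = [sys.executable, "-c", "print(1 + 2)"]

if check_program_output(command, "3\n"):
    print("CORRECT")
else:
    print("WRONG")
```

The test treats the program as a black box: it only looks at the exit code and the output, so it keeps working even if the internals are reorganized.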

What can tests help you do?

Preserve expected functionality

Check old things when you add new ones

Help users of your code

Verify it’s installed correctly and works.
See examples of what it should do.

Help other developers modify it

Change things with confidence that nothing is breaking.
Get a warning when documentation or examples go out of date.

Manage complexity

If code is easy to test, it’s probably easier to maintain.
The next lesson, Modular code development, demonstrates this.

Discussion: When is it OK not to add tests?

Vote in the notes and we’ll discuss soon. It is always a balance: there is no “always” or “never”.

Jupyter or R Markdown notebook which produces a plot and you know by looking at the plot whether it worked?
A short, “obviously correct” Python or R script which you never intend to reuse?
A short, simple, “obviously correct” shell script?
Can you give other examples?

Discussion: What’s easy and hard to test?

Discussion: Testing in practice

Use the collaborative notes to answer these questions:

Give examples of things (from your work) that are easy to test.
Give examples of things (from your work) that are hard to test.

Types of tests

Test functions one at a time - Unit tests
Test how parts work together - Integration tests
Test the whole thing running - End-to-end tests
- For example, running on sample data.
Check same results as before - Regression tests
Write the test first (the expected output), then write code to make the test pass - Test-driven development
GitHub or GitLab runs tests automatically - Continuous integration
Report that tells you which lines were/were not run by tests - Code coverage
Framework that runs tests for you - Testing framework
- See Quick Reference for some examples.
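
As a small illustration of the first two entries in the list above, the following Python sketch contrasts a unit test with an integration test. The helpers `celsius_to_kelvin` and `fahrenheit_to_kelvin` are hypothetical and were added only for this example. The test functions follow the `test_` naming convention, so a framework such as pytest could collect them, but here they are simply called directly.

```python
def fahrenheit_to_celsius(temp_f):
    """Converts temperature in Fahrenheit to Celsius."""
    return (temp_f - 32.0) * (5.0 / 9.0)


def celsius_to_kelvin(temp_c):
    """Hypothetical helper: converts Celsius to Kelvin."""
    return temp_c + 273.15


def fahrenheit_to_kelvin(temp_f):
    """Hypothetical helper that combines the two conversions."""
    return celsius_to_kelvin(fahrenheit_to_celsius(temp_f))


def test_celsius_to_kelvin():
    # Unit test: exercises one function in isolation.
    assert abs(celsius_to_kelvin(0.0) - 273.15) < 1.0e-9


def test_fahrenheit_to_kelvin():
    # Integration test: checks that the two functions work together.
    assert abs(fahrenheit_to_kelvin(32.0) - 273.15) < 1.0e-9


test_celsius_to_kelvin()
test_fahrenheit_to_kelvin()
```

If the integration test fails while both unit tests pass, the problem is likely in how the pieces are combined rather than in either piece.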

What should you do?

Not all code needs perfect test coverage.
If code is interactive-only (Jupyter Notebook), it’s usually hard to test.
- But also hard to run: the next lesson will discuss!
At least an end-to-end test is often easy to add.
Add tests of tricky functions.
- If you’d have to run it over and over to test while writing, why not turn that check into an automated test?
It’s easy to have GitLab/GitHub run the tests.
- It’s nice to push without thinking, and the system tells you when something is broken.
Learning how to test well makes the rest of your code better, too.

Where to start

A simple script or notebook probably does not need an automated test.

If you have nothing yet

Start with an end-to-end test.
Describe in words how you check whether the code still works.
Translate the words into a script.
Run the script automatically on every code change.

If you want to start with unit-testing

You want to rewrite a function? Start adding a unit test right there first.
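
A minimal sketch of this workflow in Python, assuming a hypothetical `mean` function that is about to be rewritten: first write a test that records what the function does today, then rewrite the function and run the same test again.

```python
def mean(values):
    """Hypothetical function that is about to be rewritten."""
    total = 0.0
    for value in values:
        total += value
    return total / len(values)


def test_mean():
    # Written before the rewrite: records what the function does today.
    assert abs(mean([1.0, 2.0, 3.0, 4.0]) - 2.5) < 1.0e-12
    assert abs(mean([-1.0, 1.0])) < 1.0e-12


test_mean()  # passes with the current implementation


def mean(values):
    """Rewritten version; the same test should still pass."""
    return sum(values) / len(values)


test_mean()  # passes again, so the rewrite preserved the tested behavior
```

The test only covers the cases it lists, so it is worth including the inputs you care most about before changing the code.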