Test design

Questions

  • How can different types of functions and classes be tested?

  • How can the integrity of a complete program be monitored over time?

  • How can functions that involve random numbers be tested?

In this episode we will consider how to test functions and whole programs written in different programming languages.

Exercise instructions

For the instructor

  • First motivate and give a quick tour of all exercises below (10 minutes).

  • Emphasize that the focus of this episode is design. It is OK to only discuss in groups and not write code.

During exercise session

  • Choose the exercise which interests you most. There are many more exercises than we would have time for.

  • Discuss which testing framework can be used to implement the test.

  • Keep notes, questions, and answers in the collaborative document.

Once we return to stream

  • Discuss experiences and lessons learned.

Language-specific instructions

The suggested solutions below use pytest. Further information can be found in the Quick Reference.

Pure and impure functions

Start by discussing how you would design tests for the following five functions, and then try to write the tests. Also discuss why some are easier to test than others.

Design-1: Design a test for a function that receives a number and returns a number

def factorial(n):
    """
    Computes the factorial of n.
    """
    if n < 0:
        raise ValueError('received negative input')
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

Discussion point: The factorial grows very rapidly. What happens if you pass a large number as argument to the function?
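
A minimal sketch of what tests for factorial could look like. The test names are illustrative, and math.factorial is used here as an independent reference, which is an assumption about what counts as a trusted answer. The tests are plain functions with bare asserts, so pytest would collect them, but nothing in the sketch depends on pytest being installed.

```python
import math


def factorial(n):
    """Computes the factorial of n (copied from the exercise)."""
    if n < 0:
        raise ValueError('received negative input')
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result


def test_factorial_known_values():
    # small cases that can be checked by hand, including the edge case 0! = 1
    assert factorial(0) == 1
    assert factorial(1) == 1
    assert factorial(5) == 120


def test_factorial_large_argument():
    # Python integers have arbitrary precision, so the result stays exact;
    # math.factorial serves as an independent reference here
    assert factorial(100) == math.factorial(100)


def test_factorial_rejects_negative_input():
    # plain try/except keeps the sketch free of third-party imports;
    # with pytest, "with pytest.raises(ValueError):" expresses the same check
    try:
        factorial(-1)
    except ValueError:
        return
    raise AssertionError('expected ValueError for negative input')
```

Regarding the discussion point: in Python a large argument does not overflow, it only makes the computation slower. In languages with fixed-width integers the same function would need a test for the overflow case.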

Design-2: Design a test for a function that receives two strings and returns a number

def count_word_occurrence_in_string(text, word):
    """
    Counts how often word appears in text.
    Example: if text is "one two one two three four"
             and word is "one", then this function returns 2
    """
    words = text.split()
    return words.count(word)
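
One possible set of tests for this function, as a sketch. Besides the example from the docstring it probes edge cases (absent word, empty text) and records how the current implementation treats capitalization and punctuation. Whether that behaviour is desired is a design question; the second test only documents it.

```python
def count_word_occurrence_in_string(text, word):
    """Counts how often word appears in text (copied from the exercise)."""
    words = text.split()
    return words.count(word)


def test_count_word_occurrence_in_string():
    text = "one two one two three four"
    assert count_word_occurrence_in_string(text, "one") == 2
    assert count_word_occurrence_in_string(text, "three") == 1
    # a word that is absent, and an empty text, both give zero
    assert count_word_occurrence_in_string(text, "five") == 0
    assert count_word_occurrence_in_string("", "one") == 0


def test_count_is_case_and_punctuation_sensitive():
    # documents current behaviour: "One" and "one," are different tokens
    assert count_word_occurrence_in_string("One one one,", "one") == 1
```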

Design-3: Design a test for a function which reads a file and returns a number

def count_word_occurrence_in_file(file_name, word):
    """
    Counts how often word appears in file file_name.
    Example: if file contains "one two one two three four"
             and word is "one", then this function returns 2
    """
    count = 0
    with open(file_name, 'r') as f:
        for line in f:
            words = line.split()
            count += words.count(word)
    return count
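
A sketch of a test that avoids depending on a file that happens to exist on disk: the test writes its own small input file into a temporary directory and removes it afterwards. The file name input.txt and its contents are made up for the example. With pytest, the tmp_path fixture could play the role of tempfile here.

```python
import os
import tempfile


def count_word_occurrence_in_file(file_name, word):
    """Counts how often word appears in file file_name (from the exercise)."""
    count = 0
    with open(file_name, 'r') as f:
        for line in f:
            words = line.split()
            count += words.count(word)
    return count


def test_count_word_occurrence_in_file():
    # the directory and everything in it is deleted when the block exits
    with tempfile.TemporaryDirectory() as tmp_dir:
        file_name = os.path.join(tmp_dir, "input.txt")
        with open(file_name, "w") as f:
            f.write("one two one two\nthree four one\n")
        assert count_word_occurrence_in_file(file_name, "one") == 3
        assert count_word_occurrence_in_file(file_name, "five") == 0
```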

Design-4: Design a test for a function with an external dependency

This one is not easy to test because the function has an external dependency.

def check_reactor_temperature(temperature_celsius):
    """
    Checks whether temperature is above max_temperature
    and returns a status.
    """
    from reactor import max_temperature
    if temperature_celsius > max_temperature:
        status = 1
    else:
        status = 0
    return status
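
One way to test this without the real reactor module is to substitute a stand-in module for the duration of the test. The sketch below does that with unittest.mock.patch.dict from the standard library; the limit of 100 is an invented value, not the real max_temperature. With pytest, the monkeypatch fixture offers a similar mechanism.

```python
import sys
import types
from unittest import mock


def check_reactor_temperature(temperature_celsius):
    """Checks whether temperature is above max_temperature (from the exercise)."""
    from reactor import max_temperature
    if temperature_celsius > max_temperature:
        status = 1
    else:
        status = 0
    return status


def test_check_reactor_temperature():
    # stand-in for the real reactor module; the limit of 100 is made up
    fake_reactor = types.ModuleType("reactor")
    fake_reactor.max_temperature = 100

    # patch.dict restores sys.modules when the with-block exits
    with mock.patch.dict(sys.modules, {"reactor": fake_reactor}):
        assert check_reactor_temperature(99) == 0
        assert check_reactor_temperature(100) == 0  # the limit itself is not "above"
        assert check_reactor_temperature(101) == 1
```

A design alternative worth discussing is to pass max_temperature as an argument, which would make the function pure and remove the need for patching.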

Design-5: Design a test for a method of a mutable class

class Pet:
    def __init__(self, name):
        self.name = name
        self.hunger = 0
    def go_for_a_walk(self):  # <-- how would you test this function?
        self.hunger += 1
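
Since go_for_a_walk returns nothing, a test has to create an object, call the method, and then inspect the object's state. A minimal sketch (the pet name is arbitrary):

```python
class Pet:
    def __init__(self, name):
        self.name = name
        self.hunger = 0

    def go_for_a_walk(self):
        self.hunger += 1


def test_pet_gets_hungrier_after_walk():
    pet = Pet("Asya")
    # state before the call
    assert pet.hunger == 0

    # each walk increases hunger by exactly one
    pet.go_for_a_walk()
    assert pet.hunger == 1
    pet.go_for_a_walk()
    assert pet.hunger == 2

    # the method must not change unrelated attributes
    assert pet.name == "Asya"
```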

Test-driven development

Design-6: Experience test-driven development

Write a test before writing the function! You can decide yourself what your unwritten function should do, but as a suggestion it can be based on FizzBuzz - i.e. a function that:

  • takes an integer argument

  • for arguments that are multiples of three, returns “Fizz”

  • for arguments that are multiples of five, returns “Buzz”

  • for arguments that are multiples of both three and five, returns “FizzBuzz”

  • fails for non-integer arguments and for integer arguments that are zero or negative

  • otherwise returns the integer itself

When writing the tests, consider the different ways that the function could and should fail.

After you have written the tests, implement the function and run the tests until they pass.
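
A sketch of how the result could look, with the tests placed first to mirror the order in which they were written. The function name fizzbuzz, the helper raises, and the choice of TypeError for non-integers and ValueError for non-positive integers are all assumptions; the exercise only says that the function "fails".

```python
def raises(expected_exception, function, *args):
    """Helper: returns True if function(*args) raises expected_exception."""
    try:
        function(*args)
    except expected_exception:
        return True
    return False


# Step 1: the tests, written before fizzbuzz existed
def test_fizzbuzz_return_values():
    assert fizzbuzz(1) == 1
    assert fizzbuzz(3) == "Fizz"
    assert fizzbuzz(5) == "Buzz"
    assert fizzbuzz(15) == "FizzBuzz"
    assert fizzbuzz(98) == 98


def test_fizzbuzz_rejects_invalid_arguments():
    assert raises(ValueError, fizzbuzz, 0)
    assert raises(ValueError, fizzbuzz, -3)
    assert raises(TypeError, fizzbuzz, 3.0)
    assert raises(TypeError, fizzbuzz, "3")


# Step 2: one implementation that makes the tests pass
def fizzbuzz(number):
    if not isinstance(number, int):
        raise TypeError("fizzbuzz expects an integer")
    if number < 1:
        raise ValueError("fizzbuzz expects a positive integer")
    if number % 15 == 0:  # multiple of both three and five
        return "FizzBuzz"
    if number % 3 == 0:
        return "Fizz"
    if number % 5 == 0:
        return "Buzz"
    return number
```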

Testing randomness

How would you test functions which generate random numbers according to some distribution/statistics?

Functions and modules which contain randomness are more difficult to test than pure deterministic functions, but many strategies exist:

  • For unit tests we can use fixed random seeds.

  • Try to test whether your results follow the expected distribution/statistics.

  • When you verify your code “by eye”, what are you looking at? Now try to express that in a script.
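
The first two strategies can be sketched as follows. The function sample_uniform is a hypothetical stand-in for code that uses random numbers, and the tolerances are chosen generously rather than derived rigorously.

```python
import random


def sample_uniform(num_samples, seed):
    """Hypothetical function under test: draws uniform numbers in [0, 1)."""
    rng = random.Random(seed)  # private generator, global state is untouched
    return [rng.random() for _ in range(num_samples)]


def test_same_seed_gives_same_numbers():
    # strategy 1: a fixed seed makes the "random" sequence reproducible
    assert sample_uniform(10, seed=1) == sample_uniform(10, seed=1)


def test_samples_follow_expected_statistics():
    # strategy 2: compare sample statistics with the known distribution
    samples = sample_uniform(100_000, seed=1)
    assert all(0.0 <= x < 1.0 for x in samples)

    mean = sum(samples) / len(samples)
    variance = sum((x - mean) ** 2 for x in samples) / len(samples)

    # uniform distribution on [0, 1): mean 1/2, variance 1/12
    assert abs(mean - 0.5) < 0.01
    assert abs(variance - 1 / 12) < 0.01
```

Seeding the statistical test as well keeps it deterministic, so a failure points to a code change rather than to an unlucky draw.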

Design-7: Write two different types of tests for randomness

Consider the code below which simulates playing Yahtzee by using random numbers. How would you go about testing it?

Try to write two types of tests:

  • a unit test for the roll_dice function. Since it uses random numbers, you will need to set the random seed, pre-calculate what sequence of dice throws you get with that seed, and use that in your test.

  • a test of the yahtzee function which considers the statistical probability of obtaining a “Yahtzee” (5 dice with the same value after three throws), which is around 4.6%. This test will be an integration test since it tests multiple functions including the random number generator itself.

import random
from collections import Counter


def roll_dice(num_dice):
    return [random.choice([1, 2, 3, 4, 5, 6]) for _ in range(num_dice)]


def yahtzee():
    """
    Play yahtzee with 5 6-sided dice and 3 throws.
    Collect as many of the same dice side as possible.
    Returns the number of same sides.
    """

    # first throw
    result = roll_dice(5)
    most_common_side, how_often = Counter(result).most_common(1)[0]

    # we keep the most common side
    target_side = most_common_side
    num_same_sides = how_often
    if num_same_sides == 5:
        return 5

    # second and third throw
    for _ in [2, 3]:
        throw = roll_dice(5 - num_same_sides)
        num_same_sides += Counter(throw)[target_side]
        if num_same_sides == 5:
            return 5

    return num_same_sides


if __name__ == "__main__":
    num_games = 100

    winning_games = list(
        filter(
            lambda x: x == 5,
            [yahtzee() for _ in range(num_games)],
        )
    )

    print(f"out of the {num_games} games, {len(winning_games)} got a yahtzee!")
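
A sketch of both kinds of tests. The seed 2023, the number of games, and the acceptance window are arbitrary choices. The unit test below checks reproducibility instead of comparing against hard-coded dice values, because such values would have to be pre-calculated once by running the code with the chosen seed.

```python
import random
from collections import Counter


def roll_dice(num_dice):
    return [random.choice([1, 2, 3, 4, 5, 6]) for _ in range(num_dice)]


def yahtzee():
    """Same logic as in the exercise code above."""
    result = roll_dice(5)
    target_side, num_same_sides = Counter(result).most_common(1)[0]
    if num_same_sides == 5:
        return 5
    for _ in [2, 3]:
        throw = roll_dice(5 - num_same_sides)
        num_same_sides += Counter(throw)[target_side]
        if num_same_sides == 5:
            return 5
    return num_same_sides


def test_roll_dice_is_reproducible():
    random.seed(2023)
    first = roll_dice(5)
    random.seed(2023)
    second = roll_dice(5)
    # same seed, same throws; a stricter unit test would compare "first"
    # against a list pre-calculated once for this seed
    assert first == second
    assert len(first) == 5
    assert all(1 <= side <= 6 for side in first)


def test_yahtzee_probability():
    random.seed(2023)
    num_games = 10_000
    num_wins = sum(1 for _ in range(num_games) if yahtzee() == 5)
    # the expected rate is roughly 4.6%; the window is wide so that
    # statistical noise does not make the test fail
    assert 0.03 < num_wins / num_games < 0.06
```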

Designing an end-to-end test

In this exercise we will practice designing an end-to-end test. In an end-to-end test (or integration test), the unit is the entire program. We typically feed the program with some well defined input and verify that it still produces the expected output by comparing it to some reference.

Design-8: Design (but not write) an end-to-end test for the uniq program

To have a tangible example, let us consider the uniq command. This command can read a file or an input stream and remove consecutive repeated lines. The program behind uniq was written by somebody else and probably contains several functions, but we will not look inside it and will instead treat it as a “black box”.

If we have a file called repetitive-text.txt containing:

(all together now) all together now
(all together now) all together now
(all together now) all together now
(all together now) all together now
(all together now) all together now
another line
another line
another line
another line
intermission
more repetition
more repetition
more repetition
more repetition
more repetition
(all together now) all together now
(all together now) all together now

… then feeding this input file to uniq like this:

$ uniq < repetitive-text.txt

… will produce the following output with repetitions removed:

(all together now) all together now
another line
intermission
more repetition
(all together now) all together now

How would you write an end-to-end test for uniq?
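
After discussing the design, one possible realization is sketched below: a shortened version of the input is fed to uniq on standard input and the output is compared with a stored reference. It assumes that a uniq executable is on the PATH and skips the check otherwise. The function and constant names are illustrative.

```python
import shutil
import subprocess

REPETITIVE_TEXT = """\
(all together now) all together now
(all together now) all together now
another line
another line
intermission
more repetition
more repetition
(all together now) all together now
"""

EXPECTED_OUTPUT = """\
(all together now) all together now
another line
intermission
more repetition
(all together now) all together now
"""


def run_uniq(text):
    """Feeds text to uniq on standard input and returns its standard output."""
    completed = subprocess.run(
        ["uniq"], input=text, capture_output=True, text=True, check=True
    )
    return completed.stdout


def test_uniq_removes_consecutive_repetition():
    if shutil.which("uniq") is None:
        return  # uniq is not installed here, nothing to check
    assert run_uniq(REPETITIVE_TEXT) == EXPECTED_OUTPUT
```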

Design-9: More end-to-end testing

  • Now imagine a code which reads numbers and produces some (floating point) numbers. How would you test that?

  • How would you test a code end-to-end which produces images?
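
For floating point output, an exact comparison with the reference is usually too strict, so one common approach is to compare within a tolerance. The sketch below uses math.isclose; the function name outputs_match, the tolerances, and the numbers are made up for illustration. A similar idea could carry over to images by comparing pixel values within a tolerance.

```python
import math


def outputs_match(computed, reference, rel_tol=1e-9, abs_tol=1e-12):
    """Compares two sequences of floats element-wise within a tolerance."""
    if len(computed) != len(reference):
        return False
    return all(
        math.isclose(c, r, rel_tol=rel_tol, abs_tol=abs_tol)
        for c, r in zip(computed, reference)
    )


def test_floating_point_output_matches_reference():
    reference = [0.3, 1.0, 2.5]           # stored reference values (made up)
    computed = [0.1 + 0.2, 1.0, 5.0 / 2]  # stand-in for parsed program output

    # exact comparison fails because 0.1 + 0.2 is not exactly 0.3
    assert computed != reference

    # comparison within a tolerance succeeds
    assert outputs_match(computed, reference)

    # a real deviation is still detected
    assert not outputs_match([0.31, 1.0, 2.5], reference)
```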

Design-10: Create an actual end-to-end test

Often, you can include tests that run your whole workflow or program. For example, you might include sample data and check the output against what you expect. (Including sample data is a great idea anyway, so this helps a lot!)

We’ll use the word-count example repository https://github.com/coderefinery/word-count.

As a reminder, you can run the script like this to get some output, which prints to standard output (the terminal):

$ python3 statistics/count.py data/abyss.txt

Your goal is to make a test that can run this and let you know whether it was successful or not. You could use Python, or you could use shell scripting. You can test whether these two lines are in the output: “the 4044” and “and 2807”.

Python hint: subprocess.check_output will run a command and return its output (as bytes by default; pass text=True to get a string).

Bash hint: COMMAND | grep "PATTERN" (“pipe to grep”) will be true if the pattern is in the output of the command.
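
A sketch of the Python variant. The helper output_contains is a made-up name, and test_word_count_end_to_end assumes it is run from the root of the word-count repository so that the relative paths resolve.

```python
import subprocess
import sys


def output_contains(command, expected_fragments):
    """Runs command and checks that every expected fragment is in its output."""
    # text=True makes check_output return str instead of bytes;
    # a non-zero exit status raises subprocess.CalledProcessError
    output = subprocess.check_output(command, text=True)
    return all(fragment in output for fragment in expected_fragments)


def test_word_count_end_to_end():
    # assumption: the current working directory is the repository root
    command = [sys.executable, "statistics/count.py", "data/abyss.txt"]
    assert output_contains(command, ["the 4044", "and 2807"])
```

Using sys.executable instead of a hard-coded python3 runs the script with the same interpreter that runs the tests.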

Keypoints

  • Pure functions (functions without side effects) are the easiest to test. They are also the easiest to reuse in another project since they do not depend on any external state.

  • Classes can be tested too, but it is somewhat more elaborate because a test has to create an object and check how its state changes.