Test design
Questions
How can different types of functions and classes be tested?
How can the integrity of a complete program be monitored over time?
How can functions that involve random numbers be tested?
In this episode we will consider how functions and programs can be tested in programs developed in different programming languages.
Exercise instructions
For the instructor
First motivate and give a quick tour of all exercises below (10 minutes).
Emphasize that the focus of this episode is design. It is OK to only discuss in groups and not write code.
During exercise session
Choose the exercise which interests you most. There are many more exercises than we would have time for.
Discuss what testing framework can be used to implement the test.
Keep notes, questions, and answers in the collaborative document.
Once we return to stream
Discussion about experiences learned.
Language-specific instructions
The suggested solutions below use pytest. Further information can be found in the Quick Reference.
The suggested solutions below use Catch2. Further information can be found in the Quick Reference.
The suggested solutions below use testthat. Further information can be found in the Quick Reference.
The suggested solutions below use Test. Further information can be found in the Quick Reference.
The suggested solutions below use pFUnit. Further information on how to install pFUnit and set up tests can be found in the Quick Reference.
Pure and impure functions
Start by discussing how you would design tests for the following five functions, and then try to write the tests. Also discuss why some are easier to test than others.
Design-1: Design a test for a function that receives a number and returns a number
def factorial(n):
"""
Computes the factorial of n.
"""
if n < 0:
raise ValueError('received negative input')
result = 1
for i in range(1, n + 1):
result *= i
return result
/* Computes the factorial of n recursively. */
constexpr unsigned int factorial(unsigned int n) {
return (n <= 1) ? 1 : (n * factorial(n - 1));
}
#' Computes the factorial of n
#'
#' @param n The number to compute the factorial of.
#' @return The factorial of n
factorial <- function(n) {
if (n < 0)
stop('received negative input')
if (n == 0)
return(1)
result <- 1
for (i in 1:n)
result <- result * i
result
}
"""
factorial(n::Int)
Compute the factorial of n.
"""
function factorial(n::Int)
if n < 0
throw(DomainError("n must be non-negative"))
end
result = 1
for i in 1:n
result *= i
end
return result
end
module factorial_mod
contains
! computes the factorial of n
integer function factorial(n)
implicit none
integer, intent(in) :: n
integer r
integer i
if(n < 0) then
write(*,*) 'Received negative input'
stop
end if
r = 1
do i = 1,n
r = r*i
end do
factorial=r
end function factorial
end module factorial_mod
Discussion point: The factorial grows very rapidly. What happens if you pass a large number as argument to the function?
Solution
This is a pure function so is easy to test: inputs go to outputs. For example, start with the below, then think of some what extreme cases/boundary cases there might be. This example shows all of the tests as one function, but you might want to make each test function more fine-grained and test only one concept.
import pytest
def test_factorial():
assert factorial(0) == 1
assert factorial(1) == 1
assert factorial(2) == 2
#include <catch2/catch.hpp>
#include "factorial.hpp"
TEST_CASE("Compute the factorial", "[factorial]") {
REQUIRE(factorial(0) == 1);
REQUIRE(factorial(1) == 1);
REQUIRE(factorial(2) == 2);
REQUIRE(factorial(3) == 6);
}
test_that("Test factorial", {
expect_equal(factorial(0), 1)
expect_equal(factorial(1), 1)
expect_equal(factorial(2), 2)
expect_equal(factorial(3), 6)
# also try negatives (check that it raises an error), non-integers, etc.
# Raise an error if factorial does *not* raise an error:
expect_error(factorial(-1))
})
@testset "Test factorial function" begin
@test_throws DomainError factorial(-1)
@test factorial(3) == 6
end
@test
subroutine test_factorial()
use factorial_mod
use funit
@assertEqual(120, factorial(5), 'factorial(5)')
end subroutine test_factorial
Notes on the discussion point: Programming languages differ in the way they deal with integer
overflow. Python automatically converts to the necessary long
type, in Julia you would
observe a “wrap-around”, in C/C++ you get undefined behaviour for signed integers.
Testing for overflow likewise depends on the language.
Design-2: Design a test for a function that receives two strings and returns a number
def count_word_occurrence_in_string(text, word):
"""
Counts how often word appears in text.
Example: if text is "one two one two three four"
and word is "one", then this function returns 2
"""
words = text.split()
return words.count(word)
#include <string>
/* Counts how often word appears in text.
* Example: if text is "one two one two three four"
* and word is "one", then this function returns 2
*/
int count_word_occurrence_in_string(const std::string& text, const std::string& word) {
auto word_count = 0;
auto count = 0;
for (const auto ch : text) {
if (ch == word[word_count]) ++word_count;
if (word[word_count] == '\0') {
word_count = 0;
++count;
}
}
return count;
}
#' Counts how often a given word appears in text
#'
#' @param text The text to search in.
#' @param word The word to search for.
#' @return The number of times the word occurs in the text.
count_word_occurrence_in_string <- function(text, word) {
words <- strsplit(text, ' ')[[1]]
sum(words == word)
}
"""
count_word_occurrence_in_string(text::String, word::String)
Count how often word appears in text.
Example: if `text` is "one two one two three four"
and `word` is "one", then this function returns 2
"""
function count_word_occurrence_in_string(text::String, word::String)
return count(word, text)
end
! missing - please submit an example
Solution
This is again a pure function but uses strings. Use a similar strategy to the above.
def test_count_word_occurrence_in_string():
assert count_word_occurrence_in_string('AAA BBB', 'AAA') == 1
assert count_word_occurrence_in_string('AAA AAA', 'AAA') == 2
# What does this last test tell us?
assert count_word_occurrence_in_string('AAAAA', 'AAA') == 1
#include <catch2/catch.hpp>
#include "word_count.hpp"
TEST_CASE("Count occurrences of substring in string", "[count_word_occurrence_in_string]") {
REQUIRE(count_word_occurrence_in_string("AAA BBB", "AAA") == 1);
REQUIRE(count_word_occurrence_in_string("AAA AAA", "AAA") == 2);
REQUIRE(count_word_occurrence_in_string("AAAA", "AAA") == 0);
test_that("Test count word occurrence in string", {
expect_equal(count_word_occurrence_in_string("AAA BBB", "AAA"), 1)
expect_equal(count_word_occurrence_in_string("AAA AAA", "AAA"), 2)
expect_equal(count_word_occurrence_in_string("AAAAA", "AAA"), 0)
})
@testset "Test count word occurrence in string" begin
@test count_word_occurrence_in_string("AAA BBB", "AAA") == 1
@test count_word_occurrence_in_string("AAA AAA", "AAA") == 2
@test count_word_occurrence_in_string("AAAAA", "AAA") == 1
end
! missing - please submit an example
Design-3: Design a test for a function which reads a file and returns a number
def count_word_occurrence_in_file(file_name, word):
"""
Counts how often word appears in file file_name.
Example: if file contains "one two one two three four"
and word is "one", then this function returns 2
"""
count = 0
with open(file_name, 'r') as f:
for line in f:
words = line.split()
count += words.count(word)
return count
#include <fstream>
#include <streambuf>
#include <string>
/* Counts how often word appears in file fname.
* Example: if file contains "one two one two three four"
* and word is "one", then this function returns 2
*/
int count_word_occurrence_in_file(std::string fname, std::string word) {
std::ifstream fh(fname);
std::string text((std::istreambuf_iterator<char>(fh)),
std::istreambuf_iterator<char>());
auto word_count = 0lu; // will be used for indexing and therefore it has to be *long unsigned* int for the safe conversion to 'std::__cxx11::basic_string<char>::size_type'.
auto count = 0;
for (const auto ch : text) {
if (ch == word[word_count]) ++word_count;
if (word[word_count] == '\0') {
word_count = 0;
++count;
}
}
return count;
}
#' Counts how often a given word appears in a file.
#'
#' @param file_name The name of the file to search in.
#' @param word The word to search for in the file.
#' @return The number of times the word appeared in the file.
count_word_occurrence_in_file <- function(file_name, word) {
count <- 0
for (line in readLines(file_name)) {
words <- strsplit(line, ' ')[[1]]
count <- count + sum(words == word)
}
count
}
"""
count_word_occurrence_in_file(file_name::String, word::String)
Counts how often word appears in file file_name.
Example: if file contains "one two one two three four"
And word is "one", then this function returns 2
"""
function count_word_occurrence_in_file(file_name::String, word::String)
open(file_name, "r") do file
lines = readlines(file)
return count(word, join(lines))
end
end
! missing - please submit an example
Solution
In this example we test a function which is not pure, because the output depends on the value of a file. We can generate a temporary file for testing and remove it afterwards. Even better could be to split file reading from the calculation, so that testing the calculation part becomes easy (see above).
import tempfile
import os
def test_count_word_occurrence_in_file():
_, temporary_file_name = tempfile.mkstemp()
with open(temporary_file_name, 'w') as f:
f.write("one two one two three four")
count = count_word_occurrence_in_file(temporary_file_name, "one")
assert count == 2
os.remove(temporary_file_name)
#include <filesystem> // for temp_directory_path(), requires C++17.
#include <fstream>
#include <catch2/catch.hpp>
#include "word_count.hpp"
TEST_CASE("Count occurrences of substring in file", "[count_word_occurrence_in_file]") {
namespace fs = std::filesystem;
auto tmp_dir{ fs::temp_directory_path() };
auto fpath{ fs::temp_directory_path() / "temp_file" };
std::ofstream s(fpath, std::ios::out | std::ios::trunc);
s << "one two one two three four" << std::endl;
s.close();
REQUIRE(count_word_occurrence_in_file(fname, "one") == 2);
REQUIRE(count_word_occurrence_in_file(fname, "three") == 1);
REQUIRE(count_word_occurrence_in_file(fname, "six") == 0);
fs::remove( fpath );
}
test_that("Test count word occurrence in file", {
fname <- tempfile()
write("one two one two three four", fname)
expect_equal(count_word_occurrence_in_file(fname, "one"), 2)
expect_equal(count_word_occurrence_in_file(fname, "three"), 1)
expect_equal(count_word_occurrence_in_file(fname, "six"), 0)
unlink(fname)
})
@testset "Test count word occurrence in file" begin
msg = "one two one two three four"
(fname, testio) = mktemp()
println(testio, msg)
close(testio)
@test count_word_occurrence_in_file(fname, "one") == 2
@test count_word_occurrence_in_file(fname, "three") == 1
@test count_word_occurrence_in_file(fname, "six") == 0
end
! missing - please submit an example
Design-4: Design a test for a function with an external dependency
This one is not easy to test because the function has an external dependency.
def check_reactor_temperature(temperature_celsius): """
Checks whether temperature is above max_temperature
and returns a status.
"""
from reactor import max_temperature
if temperature_celsius > max_temperature:
status = 1
else:
status = 0
return status
#include "constants.hpp"
/* ^--- Defines the max_temperature constant as:
* namespace constants {
* constexpr double max_temperature = 100.0;
* }
*/
enum class ReactorState : int { FINE, CRITICAL };
/* Checks whether temperature is above max_temperature and returns a status. */
ReactorState check_reactor_temperature(double temperature_celsius) {
return temperature_celsius > constants::max_temperature
? ReactorState::CRITICAL
: ReactorState::FINE;
}
# reactor <- namespace::makeNamespace("reactor")
# assign("max_temperature", 100, env = reactor)
# namespaceExport(reactor, "max_temperature")
#' Checks whether the temperature is above max_temperature
#' and returns the status.
#'
#' @param temperature_celsius The temperature of the core
#' @return 1 if the temperature is in range, otherwise 0
check_reactor_temperature <- function(temperature_celsius) {
if (temperature_celsius > reactor::max_temperature)
status <- 1
else
status <- 0
status
}
# Importing modules inside functions is not allowed in Julia and must be done at top-level
using reactor: max_temperature
"""
check_reactor_temperature(temperature_celsius)
Checks whether temperature is above max_temperature
and returns a status.
"""
function check_reactor_temperature(temperature_celsius)
if temperature_celsius > max_temperature
status = 1
else
status = 0
end
end
! missing - please submit an example
Solution
This function depends on the value of
reactor.max_temperature
so the function is not pure, so testing gets
harder. You could use monkey patching to override the value of
max_temperature
, and test it with different values. Monkey
patching is the concept
of artificially changing some other value.
A better solution would probably be to rewrite the function.
def test_set_temp(monkeypatch):
monkeypatch.setattr(reactor, "max_temperature", 100)
assert check_reactor_temperature(99) == 0
assert check_reactor_temperature(100) == 0 # boundary cases easily go wrong
assert check_reactor_temperature(101) == 1
// Changing variables included from headers (monkey patching) is not possible in C++.
#include <catch2/catch.hpp>
#include "reactor.hpp"
TEST_CASE("Check reactor state", "[reactor_state]") {
REQUIRE(check_reactor_temperature(99) == ReactorState::FINE);
REQUIRE(check_reactor_temperature(100) == ReactorState::FINE);
REQUIRE(check_reactor_temperature(101) == ReactorState::CRITICAL);
# Changing variables imported from modules (monkey patching) is not possible in R.
# To be able to test this function properly it needs to be made pure:
check_reactor_temperature <- function(max_temperature, temperature_celsius) {
if (temperature_celsius > max_temperature)
status <- 1
else
status <- 0
status
}
test_that("Test reactor temperature", {
expect_equal(check_reactor_temperature(100, 99), 0)
expect_equal(check_reactor_temperature(100, 100), 0) # boundary cases easily go wrong
expect_equal(check_reactor_temperature(100, 101), 1)
})
# Changing variables imported from modules (monkey patching) is not possible in Julia.
# To be able to test this function properly it needs to be made pure:
function check_reactor_temperature(temperature_celsius, max_temperature)
if temperature_celsius > max_temperature
status = 1
else
status = 0
end
end
# normal invocation:
using reactor: max_temperature
check_reactor_temperature(99, max_temperature)
# tests
@testset "Test check_reactor_temperature function" begin
@test check_reactor_temperature(99, 100) == 0
@test check_reactor_temperature(100, 100) == 0 # boundary cases easily go wrong
@test check_reactor_temperature(101, 100) == 1
end
! also here, not easy to test and better rewrite the function
Design-5: Design a test for a method of a mutable class
class Pet:
def __init__(self, name):
self.name = name
self.hunger = 0
def go_for_a_walk(self): # <-- how would you test this function?
self.hunger += 1
#include <string>
class Pet {
private:
unsigned int hunger_{0};
std::string name_{};
public:
explicit Pet(std::string name) : name_(name) {}
void go_for_a_walk() { hunger_ += 1; }
// ^-- how would you test this function?
unsigned int hunger() const { return hunger_; }
};
Pet <- function(name) {
structure(
list(name = name, hunger = 0),
class = "Pet"
)
}
# How would you test this function?
take_for_a_walk <- function(pet) UseMethod("take_for_a_walk")
take_for_a_walk.Pet <- function(pet) {
pet$hunger <- pet$hunger + 1
pet
}
# the closest thing to a class in Julia is a `mutable struct`
mutable struct Pet
name::String
hunger::Int64
end
function go_for_a_walk!(pet::Pet)
pet.hunger += 1
end
! missing - please submit an example
Solution
def test_pet():
p = Pet('fido')
assert p.hunger == 0
p.go_for_a_walk()
assert p.hunger == 1
p.hunger = -1
p.go_for_a_walk()
assert p.hunger == 0
#include <catch2/catch.hpp>
#include "pet.hpp"
TEST_CASE("Check my pets", "[pets]") {
auto fido = Pet("fido");
REQUIRE(fido.hunger() == 0);
fido.go_for_a_walk();
REQUIRE(fido.hunger() == 1);
}
test_that("Test Pet class", {
p <- Pet(name = "fido")
expect_equal(p$hunger, 0)
p <- take_for_a_walk(p)
expect_equal(p$hunger, 1)
p$hunger <- -1
p <- take_for_a_walk(p)
expect_equal(p$hunger, 0)
})
# create the mutable struct and test it
@testset "Test mutable struct" begin
p = Pet("fido", 0)
@test p.hunger == 0
go_for_a_walk!(p)
@test p.hunger == 1
p.hunger = -1
go_for_a_walk!(p)
@test p.hunger == 0
end
! missing - please submit an example
Test-driven development
Design-6: Experience test-driven development
Write a test before writing the function! You can decide yourself what your unwritten function should do, but as a suggestion it can be based on FizzBuzz - i.e. a function that:
takes an integer argument
for arguments that are multiples of three, returns “Fizz”
for arguments that are multiples of five, returns “Buzz”
for arguments that are multiples of both three and five, returns “FizzBuzz”
fails in case of non-integer arguments or integer arguments 0 or negative
otherwise returns the integer itself
When writing the tests, consider the different ways that the function could and should fail.
After you have written the tests, implement the function and run the tests until they pass.
Solution
import pytest
def fizzbuzz(number):
if not isinstance(number, int):
raise TypeError
if number < 1:
raise ValueError
elif number % 15 == 0:
return "FizzBuzz"
elif number % 3 == 0:
return "Fizz"
elif number % 5 == 0:
return "Buzz"
else:
return number
def test_fizzbuzz():
expected_result = [1, 2, "Fizz", 4, "Buzz", "Fizz",
7, 8, "Fizz", "Buzz", 11, "Fizz",
13, 14, "FizzBuzz", 16, 17, "Fizz", 19, "Buzz"]
obtained_result = [fizzbuzz(i) for i in range(1, 21)]
assert obtained_result == expected_result
with pytest.raises(ValueError):
fizzbuzz(-5)
with pytest.raises(ValueError):
fizzbuzz(0)
with pytest.raises(TypeError):
fizzbuzz(1.5)
with pytest.raises(TypeError):
fizzbuzz("rabbit")
def main():
for i in range(1, 100):
print(fizzbuzz(i))
if __name__ == "__main__":
main()
// in the fizzbuzz.hpp header file
#include <string>
std::string fizzbuzz(unsigned int n) {
if (n % 15 == 0) {
return "FizzBuzz";
} else if (n % 3 == 0) {
return "Fizz";
} else if (n % 5 == 0) {
return "Buzz";
} else {
return std::to_string(n);
}
}
// in the source file for the main executable
#include <cstdlib>
#include <iostream>
int main() {
for (auto i = 0; i < 100; ++i) {
std::cout << fizzbuzz(i) << std::endl;
}
}
// in the source file for the test
#include <string>
#include <catch2/catch.hpp>
#include "fizzbuzz.hpp"
TEST_CASE("FizzBuzz", "[fizzbuzz]") {
auto expected = std::vector<std::string>{
"1", "2", "Fizz", "4", "Buzz", "Fizz", "7",
"8", "Fizz", "Buzz", "11", "Fizz", "13", "14",
"FizzBuzz", "16", "17", "Fizz", "19", "Buzz"};
for (auto i = 1; i <= 21; ++i) {
REQUIRE(fizzbuzz(i) == expected[i]);
}
}
# define the function
fizz_buzz <- function(number){
if(!number%%1==0 | number < 0) {
stop("non-integer or negative input not allowed!")
}
if(number%%3 == 0 & number%%5 == 0) {
return('FizzBuzz')
}
else if(number%%3 == 0) {
return('Fizz')
}
else if (number%%5 == 0){
return('Buzz')
}
else {
return(number)
}
}
# apply it to the numbers 1 to 50
for (number in 1:50) {
print(fizz_buzz(number))
}
library(testthat)
test_that("Test FizzBuzz", {
expect_equal(fizz_buzz(1), 1)
expect_equal(fizz_buzz(2), 2)
expect_equal(fizz_buzz(3), 'Fizz')
expect_equal(fizz_buzz(4), 4)
expect_equal(fizz_buzz(5), 'Buzz')
expect_equal(fizz_buzz(15), 'FizzBuzz')
expect_error(fizz_buzz(-1))
expect_error(fizz_buzz(1.5))
expect_error(fizz_buzz('rabbit'))
})
using Test
function fizzbuzz(number::Int)
if number < 1
throw(DomainError(number, "number needs to be 1 or higher"))
elseif number % 15 == 0
return "FizzBuzz"
elseif number % 3 == 0
return "Fizz"
elseif number % 5 == 0
return "Buzz"
else
return number
end
end
@testset begin
expected_result = [1, 2, "Fizz", 4, "Buzz", "Fizz",
7, 8, "Fizz", "Buzz", 11, "Fizz",
13, 14, "FizzBuzz", 16, 17, "Fizz", 19, "Buzz"]
obtained_result = [fizzbuzz(i) for i in 1:20]
@test obtained_result == expected_result
@test_throws MethodError fizzbuzz(1.5)
@test_throws DomainError fizzbuzz(0)
@test_throws DomainError fizzbuzz(-5)
end
for i in 1:20
println(fizzbuzz(i))
end
module fizzbuzz_mod
contains
! Evaluates the fizzbuzz of n
character(8) function fizzbuzz(n)
implicit none
integer, intent(in) :: n
character(8) str1
if( mod(n,15)==0 ) then
str1='FizzBuzz'
else if( mod(n,3)==0 ) then
str1='Fizz'
str1=str1
else if( mod(n,5)==0 ) then
str1='Buzz'
else if( n<0 ) then
str1='Err n<0'
else
str1=char(n)
end if
fizzbuzz=str1
end function fizzbuzz
end module fizzbuzz_mod
@test
subroutine test_fizzbuzz()
use fizzbuzz_mod
use funit
@assertEqual('Fizz', fizzbuzz(12), 'fizzbuzz(12)')
@assertEqual('Buzz', fizzbuzz(25), 'fizzbuzz(25)')
@assertEqual('FizzBuzz', fizzbuzz(60), 'fizzbuzz(60)')
end subroutine test_fizzbuzz
Testing randomness
How would you test functions which generate random numbers according to some distribution/statistics?
Functions and modules which contain randomness are more difficult to test than pure deterministic functions, but many strategies exist:
For unit tests we can use fixed random seeds.
Try to test whether your results follow the expected distribution/statistics.
When you verify your code “by eye”, what are you looking at? Now try to express that in a script.
Design-7: Write two different types of tests for randomness
Consider the code below which simulates playing Yahtzee by using random numbers. How would you go about testing it?
Try to write two types of tests:
a unit test for the
roll_dice
function. Since it uses random numbers, you will need to set the random seed, pre-calculate what sequence of dice throws you get with that seed, and use that in your test.a test of the
yahtzee
function which considers the statistical probability of obtaining a “Yahtzee” (5 dice with the same value after three throws), which is around 4.6%. This test will be an integration test since it tests multiple functions including the random number generator itself.
import random
from collections import Counter
def roll_dice(num_dice):
return [random.choice([1, 2, 3, 4, 5, 6]) for _ in range(num_dice)]
def yahtzee():
"""
Play yahtzee with 5 6-sided dice and 3 throws.
Collect as many of the same dice side as possible.
Returns the number of same sides.
"""
# first throw
result = roll_dice(5)
most_common_side, how_often = Counter(result).most_common(1)[0]
# we keep the most common side
target_side = most_common_side
num_same_sides = how_often
if num_same_sides == 5:
return 5
# second and third throw
for _ in [2, 3]:
throw = roll_dice(5 - num_same_sides)
num_same_sides += Counter(throw)[target_side]
if num_same_sides == 5:
return 5
return num_same_sides
if __name__ == "__main__":
num_games = 100
winning_games = list(
filter(
lambda x: x == 5,
[yahtzee() for _ in range(num_games)],
)
)
print(f"out of the {num_games} games, {len(winning_games)} got a yahtzee!")
#include <cstdlib>
#include <iostream>
#include <random>
#include <tuple>
#include <vector>
/* Roll a fair die n_dice times. The faces of the die can be set (default is 1 to 6).
* The PRNG engine is moved in the function such that changes in its state are propagated back to the caller.
*/
template <typename PRNGEngine = decltype(std::default_random_engine())>
std::vector<unsigned int> roll_dice(
unsigned int n_dice = 5,
std::vector<unsigned int> faces = {1, 2, 3, 4, 5, 6},
PRNGEngine&& gen = std::default_random_engine(std::random_device()())) {
// create a fair die
auto weights = std::vector<double>(faces.size(), 1.0);
auto fair_dice =
std::discrete_distribution<unsigned int>(weights.begin(), weights.end());
auto rolls = std::vector<unsigned int>(n_dice, 0);
for (auto i = 0u; i < n_dice; ++i) {
rolls[i] = faces[fair_dice(gen)];
}
return rolls;
}
/* count how many times each face comes up */
std::vector<unsigned int> count(const std::vector<unsigned int>& toss,
const std::vector<unsigned int>& faces = {
1, 2, 3, 4, 5, 6}) {
auto face_counts = std::vector<unsigned int>(faces.size(), 0);
for (auto i = 0; i < faces.size(); ++i) {
face_counts[i] = std::count(toss.cbegin(), toss.cend(), faces[i]);
}
return face_counts;
}
std::tuple<unsigned int, unsigned int> yahtzee() {
auto n_dice = 5;
auto faces = std::vector<unsigned int>{1, 2, 3, 4, 5, 6};
auto toss = [faces](unsigned int n_dice) { return roll_dice(n_dice, faces); };
// throw all dice
auto first_toss = toss(n_dice);
auto face_counts = count(first_toss);
auto it_max = std::max_element(face_counts.cbegin(), face_counts.cend());
// number of faces that showed the most
auto n_collected = *it_max;
// corresponding index in the array, will be used to get which face showed up
// the most
auto idx_max = std::distance(face_counts.cbegin(), it_max);
// all 5 dice showed the same face! YAHTZEE!
if (n_collected == 5) return std::make_tuple(faces[idx_max], n_collected);
// no yahtzee :(
// we throw (n_dice - n_collected) dice
auto second_toss = toss(n_dice - n_collected);
n_collected += count(second_toss, {faces[idx_max]})[0];
// YAHTZEE!
if (n_collected == 5) return std::make_tuple(faces[idx_max], n_collected);
// final chance
auto third_toss = toss(n_dice - n_collected);
n_collected += count(third_toss, {faces[idx_max]})[0];
return std::make_tuple(faces[idx_max], n_collected);
}
int main() {
for (auto i = 0; i < 100; ++i) {
unsigned int value = 0, times = 0;
std::tie(value, times) = yahtzee();
std::cout << "We got " << value << " " << times << " times in round " << i
<< std::endl;
if (times == 5) {
std::cout << "YAHTZEE in round " << i << std::endl;
}
}
return EXIT_SUCCESS;
}
#' Roll a fair die num_dice times.
#' Collect as many of the same dice side as possible.
#'
#' @return A vector of die throw results
roll_dice <- function(num_dice) {
ceiling(runif(num_dice, 0, 6))
}
#' Play yahtzee with 5 6-sided dice and 3 throws.
#' Collect as many of the same dice side as possible.
#'
#' @return The number of same sides
yahtzee <- function() {
res <- roll_dice(5)
most_common_side <- as.numeric(names(table(res)[which.max(table(res))]))[1]
how_often <- as.vector(table(res)[which.max(table(res))])[1]
# we keep the most common side
target_side <- most_common_side
num_same_sides <- how_often
if (num_same_sides == 5) {
return(5)
}
# second and third throw
for (i in 2:3) {
throw <- roll_dice(5 - num_same_sides)
num_same_sides <- num_same_sides + sum(throw == target_side)
if (num_same_sides == 5) {
return(5)
}
}
return(num_same_sides)
}
using Random
"""
roll_dice(n_dice::Int=5, sides=(1,2,3,4,5,6))
Returns array of n_dice random integers corresponding to the sides of dice
"""
function roll_dice(n_dice::Int=5, sides=(1,2,3,4,5,6))
return [rand(sides) for i in 1:n_dice]
end
"""
yahtzee()
Play Yahtzee with 5 6-sided dice and 3 throws.
Collect as many of the same dice side as possible.
Returns a tuple with the collected side (e.g. 4's) and
how many of that side (between 1 and 5).
"""
function yahtzee()
sides = (1,2,3,4,5,6)
n_dice = 5
# we first throw all dice
first_throw = roll_dice(n_dice, sides)
# count how many times each side comes up
side_counts = [count(x->x==i,first_throw) for i in sides]
# collected_side is the dice side we will start collecting
n_collected, collected_side = findmax(side_counts)
if n_collected == 5
return collected_side, n_collected
end
# now we throw n_dice-n_collected dice and hope to get more collected_side
second_throw = roll_dice(n_dice-n_collected, sides)
n_new_matches = count(x->x==collected_side,second_throw)
n_collected += n_new_matches
if n_collected == 5
return collected_side, n_collected
end
# final throw...
third_throw = roll_dice(n_dice-n_collected, sides)
n_new_matches = count(x->x==collected_side,third_throw)
n_collected += n_new_matches
return collected_side, n_collected
end
for i in 1:10
collected_side, n_collected = yahtzee()
println("We got $n_collected $collected_side's in this round")
if n_collected == 5
println("Yay, it's a Yahtzee!")
end
end
Solution
def test_roll_dice():
random.seed(0)
assert roll_dice(5) == [4, 4, 1, 3, 5]
assert roll_dice(5) == [4, 4, 3, 4, 3]
assert roll_dice(5) == [5, 2, 5, 2, 3]
import pytest
def test_yahtzee():
random.seed(1)
n_tests = 1_000_000
winning_games = list(
filter(
lambda x: x == 5,
[yahtzee() for _ in range(n_tests)],
)
)
assert len(winning_games) / n_tests == pytest.approx(0.046, abs=0.01)
#include <random>
#include <catch2/catch.hpp>
#include "yahtzee.hpp"
using namespace Catch::literals;
TEST_CASE("Tossing 5 dice", "[toss]") {
auto n_dice = 5;
auto faces = std::vector<unsigned int>{1, 2, 3, 4, 5, 6};
// we fix both the random device and the random engine.
// the latter is necessary since the default random engine is implementation-dependent
std::mt19937 prng;
prng.seed(1234);
auto expected_1 = std::vector<unsigned int>{3, 5, 4, 5, 6};
auto toss_1 = roll_dice(n_dice, faces, prng);
REQUIRE(toss_1 == expected_1);
auto expected_2 = std::vector<unsigned int>{1, 2, 5, 1, 1};
auto toss_2 = roll_dice(n_dice, faces, prng);
REQUIRE(toss_2 == expected_2);
auto expected_3 = std::vector<unsigned int>{1, 3, 2, 5, 1};
auto toss_3 = roll_dice(n_dice, faces, prng);
REQUIRE(toss_3 == expected_3);
}
TEST_CASE("Distribution of Yahtzee", "[yahtzee]") {
// try a million throws and see if we get close
// to the statistical average of 4.6%
// (https://en.wikipedia.org/wiki/Yahtzee#Probabilities)
auto n_yahtzees = 0;
auto n_trials = 1e6;
for (auto i = 0; i < n_trials; ++i) {
unsigned int value = 0, times = 0;
std::tie(value, times) = yahtzee();
if (times == 5) {
++n_yahtzees;
}
}
REQUIRE( n_yahtzees / n_tests == 0.046_a)
}
if (!require(testthat)) install.packages(testthat)
test_that("dice is working", {
set.seed(42)
expect_equal(roll_dice(5), c(6, 6, 2, 5, 4))
expect_equal(roll_dice(5), c(4, 5, 1, 4, 5))
expect_equal(roll_dice(5), c(3, 5, 6, 2, 3))
})
test_that("yahtzee is working", {
n_games <- 1e6 #could be very slow, when in doubt change to 1e4
n_winning <- sum(replicate(n_games, yahtzee()) == 5)
res <- (n_winning / n_games)
expected <- 0.046
expect_lt(abs(res - expected), 0.003)
})
using Test
using Random
include("yahtzee.jl")
@testset "test roll_dice" begin
# fix random seed
Random.seed!(1234);
# known sequence for given seed
exp_first_throw = [1, 3, 5, 1, 6]
exp_second_throw = [1, 6, 4, 5, 4]
exp_third_throw = [1, 2, 5, 3, 2]
n_dice = 5
sides=(1,2,3,4,5,6)
throw = roll_dice(n_dice, sides)
@test throw == exp_first_throw
throw = roll_dice(n_dice, sides)
@test throw == exp_second_throw
throw = roll_dice(n_dice, sides)
@test throw == exp_third_throw
end
@testset "testing yahtzee" begin
# try a million throws and see if we get close
# to the statistical average of 4.6%
# (https://en.wikipedia.org/wiki/Yahtzee#Probabilities)
n_yahtzees = 0
n_tests = 1_000_000
for i in 1:n_tests
collected_side, n_collected = yahtzee()
if n_collected == 5
n_yahtzees += 1
end
end
@test n_yahtzees/n_tests ≈ 0.046 atol=0.001
end
Designing an end-to-end test
In this exercise we will practice designing an end-to-end test. In an end-to-end test (or integration test), the unit is the entire program. We typically feed the program with some well defined input and verify that it still produces the expected output by comparing it to some reference.
Design-8: Design (but not write) an end-to-end test for the uniq program
To have a tangible example, let us consider the uniq
command. This command
can read a file or an input stream and remove consecutive repetition.
The program behind uniq
has been written by somebody else, it probably contains
some functions, but we will not look into it but regard it as “black box”.
If we have a file called repetitive-text.txt
containing:
(all together now) all together now
(all together now) all together now
(all together now) all together now
(all together now) all together now
(all together now) all together now
another line
another line
another line
another line
intermission
more repetition
more repetition
more repetition
more repetition
more repetition
(all together now) all together now
(all together now) all together now
… then feeding this input file to uniq
like this:
$ uniq < repetitive-text.txt
… will produce the following output with repetitions removed:
(all together now) all together now
another line
intermission
more repetition
(all together now) all together now
How would you write an end-to-end test for uniq
?
Design-9: More end-to-end testing
Now imagine a code which reads numbers and produces some (floating point) numbers. How would you test that?
How would you test a code end-to-end which produces images?
Design-10: Create an actual end-to-end test
Often, you can include tests that run your whole workflow or program. For example, you might include sample data and check the output against what you expect. (including sample data is a great idea anyway, so this helps a lot!)
We’ll use the word-count example repository https://github.com/coderefinery/word-count.
As a reminder, you can run the script like this to get some output, which prints to standard output (the terminal):
$ python3 code/count.py data/abyss.txt
Your goal is to make a test that can run this and let you know if it’s
successful or not. You could use Python, or you could use shell
scripting. You can test if these two lines are in the output: the 4044
and and 2807
.
Python hint: subprocess.check_output will run a command and return its output as a string.
Bash hint: COMMAND | grep "PATTERN"
(“pipe to grep”) will be true if
the pattern is in the command.
Solution
There are two solutions in the repository already, in the tests/
dierectory https://github.com/coderefinery/word-count, one in Python
and one in bash shell. Neither of these are a very advanced or
perfect solution, and you could integrate them with pytest or whatever
other test framework you use.
The shell one works with shell and prints a bit more output:
$ sh tests/end-to-end-shell.sh
the 4044
Success: 'the' found correct number of times
and 2807
Success: 'and' found correct number of times
Success
The Python one:
$ python3 tests/end-to-end-python.py
Success
Keypoints
Pure functions (these are functions without side-effects) are easiest to test. And also easiest to reuse in another code project since they don’t depend on any side-effects.
Classes can be tested but it’s somewhat more elaborate.