Testing strategies Examples
Testing
Clancy Rowley and Robert Lupton
2014-07-27
Testing vs. debugging
Testing
Orderly, systematic, exhaustive checks on whether a program
is working properly
test as you write the code
test systematically
automate testing
Testing process begins when program is designed
design an interface with testing in mind
continues as program evolves
never ends
Debugging
Efficiently figuring out what’s wrong
never ends either
Test first or last?
Test last
Write the code, then write tests to find the bugs
Test first (TDD: Test Driven Development)
Write tests before writing the code being tested
Advantages
Ensures tests are working (all should fail initially)
See progress as you implement more features
Can detect problems with the interface before spending time on
an implementation (and possibly coding the wrong thing)
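The test-first workflow can be sketched as follows. This is a hypothetical illustration (the `median` function and its tests are not from the slides): the tests are written first, against a `median()` that does not yet exist or is a stub, so both tests fail initially; the implementation is then filled in until they pass.

```python
import unittest

# Test-first sketch: these tests were (notionally) written before
# median() existed; running them against a stub confirms they can fail.
def median(values):
    s = sorted(values)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2.0

class TestMedian(unittest.TestCase):
    def test_odd_length(self):
        self.assertEqual(median([3, 1, 2]), 2)

    def test_even_length(self):
        self.assertEqual(median([4, 1, 3, 2]), 2.5)
```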
Test as you write the code
Many errors can be eliminated before they happen, by careful
coding
Check boundary conditions
Look for off-by-one errors
State and check pre-conditions
Add assertions for post-conditions and invariants
Defensive programming
Check error returns
Turn on all compiler checks
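Pre- and post-condition assertions might look like this; a minimal sketch (the `average` function is a hypothetical example, not from the slides):

```python
def average(values):
    # precondition: the caller must supply a non-empty sequence
    assert len(values) > 0, "average() requires a non-empty sequence"
    result = sum(values) / float(len(values))
    # postcondition: the mean must lie between the extremes
    assert min(values) <= result <= max(values)
    return result
```

The assertions document the contract and stop execution at the point of failure, instead of letting a bad value propagate.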
Systematic testing
Test incrementally
Test each new feature as it’s implemented
Test each new piece of code as it’s written
Test simple parts first
Basic features before fancy ones
Know what output to expect
Easiest if test inputs and outputs are in files and can be run
automatically
Verify conservation properties
Preserves number of items, sizes of data structures, order
relationships, etc
Compare independent implementations
Brute force, inefficient version with more complex, fast version
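Comparing a brute-force version against a fast one can be sketched as follows; `binary_search` here is one possible implementation, written out so the sketch is self-contained:

```python
def linear_search(x, item):
    # brute-force reference implementation: scan every element
    for i, value in enumerate(x):
        if value == item:
            return i
    return -1

def binary_search(x, item):
    # the fast version under test (one possible implementation)
    lo, hi = 0, len(x)
    while lo < hi:
        mid = (lo + hi) // 2
        if x[mid] < item:
            lo = mid + 1
        elif x[mid] > item:
            hi = mid
        else:
            return mid
    return -1

# the two implementations must agree on every input we try
for size in range(20):
    x = [10 * i for i in range(size)]
    for item in range(-5, 10 * size + 6):
        assert binary_search(x, item) == linear_search(x, item)
```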
Measure test coverage
Use tools such as tcov to generate coverage information
Add test cases to go through each branch of code
It’s hard to achieve high coverage
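tcov targets compiled C code; for Python, the standard-library trace module gives a crude line-execution count (coverage.py is the usual richer choice). A minimal sketch:

```python
import trace

def sign(x):
    if x > 0:
        return 1
    elif x < 0:
        return -1
    return 0

tracer = trace.Trace(count=1, trace=0)
for arg in (3, -3):        # exercises the positive and negative branches
    tracer.runfunc(sign, arg)

# counts maps (filename, lineno) -> execution count; the "return 0"
# branch never ran, so adding a test for sign(0) would raise coverage
executed = sorted(lineno for (_, lineno) in tracer.results().counts)
print("lines executed:", executed)
```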
Mechanization: let the computer do the work
Write code by programs
specialized languages: regular expressions, ...
Use executable specifications
avoid writing the same information multiple times
capture the varying part in specification files
Do clerical jobs by programs
build, test, system administration, ...
Test programs by programs
hand testing is too slow and prone to error
Main point
When the job is mechanical, it’s time to mechanize.
Example: testing binary search
while True:
    line = raw_input("Enter size, item: ")
    try:
        n, item = [int(i) for i in line.split()]
    except:
        break
    x = [10*a for a in range(n)]
    print " %d" % binary_search(x, item)
Usage:
Enter size, item: 5 0
0
Enter size, item: 5 10
1
Enter size, item: 5 40
4
Enter size, item: 5 5
-1
Enter size, item: 5 -5
-1
Enter size, item: 5 50
-1
Enter size, item: 5 45
-1
Test automation
Test scaffolds
shell scripts, programs, makefiles, etc, that run tests and
compare results automatically
Automated regression testing
Does the new version get the same answer as the old version?
And in about the same amount of time?
If it doesn’t, explain why
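A regression harness can be a small shell script that runs both versions on the same inputs and diffs the outputs. In this sketch, `old_version` and `new_version` are hypothetical stand-ins for two builds of the program under test; here both are simulated with `sort` so the script actually runs:

```shell
#!/bin/sh
# Sketch of a regression harness: run old and new builds on each test
# input and flag any difference in output.
old_version() { sort "$1"; }    # stand-in for the trusted build
new_version() { sort "$1"; }    # stand-in for the new build

printf 'b\na\nc\n' > case1.in   # a sample test input
status=0
for f in *.in; do
    old_version "$f" > "$f.old"
    new_version "$f" > "$f.new"
    if ! cmp -s "$f.old" "$f.new"; then
        echo "REGRESSION in $f"
        status=1
    fi
done
echo "regressions found: $status"
```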
Self-contained tests
Compare to known output or independently created outputs
Stress tests
large inputs
random inputs
perverse inputs
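A random stress test doesn't need to know the exact expected answer; it can check a property of the result instead. A sketch for binary search (the `binary_search` body is one possible implementation, included so the example is self-contained):

```python
import random

def binary_search(x, item):
    # one possible implementation, shown so the sketch runs on its own
    lo, hi = 0, len(x)
    while lo < hi:
        mid = (lo + hi) // 2
        if x[mid] < item:
            lo = mid + 1
        elif x[mid] > item:
            hi = mid
        else:
            return mid
    return -1

random.seed(0)                  # seeded, so failures are reproducible
for trial in range(1000):
    n = random.randint(0, 50)
    x = sorted(random.sample(range(1000), n))
    item = random.randint(-10, 1010)
    got = binary_search(x, item)
    # property check: a found index must hold the item; a miss must be real
    if item in x:
        assert x[got] == item
    else:
        assert got == -1
```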
Example: testing AWK
Brian Kernighan maintains the original version of AWK, and a
comprehensive test suite
Around 1000 tests, run with a single command
Regression tests
Does it work the same as the previous version?
Known output tests
Does it produce the same output as an independent mechanism?
Bug fix tests
A test for each bug and new feature
Stress tests
Does it handle perverse, huge, illegal, random cases?
Coverage tests
Are all statements executed by some test?
Performance tests
Does it run at the same speed or better?
Using AWK for testing regexp code
Regular expression tests are described in a very small
specialized language
^a.$    ~   ax
            aa
        !~  xa
            aaa
            axy
Each test is converted into a command that exercises the new
version:
$ echo 'ax' | awk '!/^a.$/ {print "bad"}'
Illustrates
little languages
programs that write programs
mechanization
A Python version appears later.
Automation of binary search test, version 1
tests = """\
0 0 -1
1 0 0
1 10 -1
1 -5 -1
5 0 0
5 10 1
5 40 4
5 5 -1
5 -5 -1
5 50 -1
5 45 -1"""
for line in tests.split('\n'):
    (size, item, expected) = [int(i) for i in line.split()]
    x = [10*i for i in range(size)]
    assert(binary_search(x, item) == expected)
Nice, compact language; easy to specify new tests
Still somewhat error prone: need to list each test by hand
Automation of binary search test, version 2
Generate an array of a given size, and try searching for all
elements:
for size in range(100):
    # test array of the form [0, 10, 20, 30, ...]
    x = [10*i for i in range(size)]
    for i in range(size):
        assert(binary_search(x, x[i]) == i)
        assert(binary_search(x, x[i] - 5) == -1)
    assert(binary_search(x, 10*size) == -1)
    assert(binary_search(x, 10*size - 5) == -1)
Test arrays with equal elements:
# test an array with equal elements: [10, 10, 10, ...]
x = size*[10]
if size == 0:
    assert(binary_search(x, 10) == -1)
else:
    assert(binary_search(x, 10) in range(size))
assert(binary_search(x, 5) == -1)
assert(binary_search(x, 15) == -1)
Compact, clear code.
Drawback: assert stops execution after first test fails.
Potentially need to run the test code n times to find n bugs.
Would be nice to keep a count of number of tests passed/failed.
Automation of binary search test, version 3
Keep a count of number of tests passed and failed.
Function to check a single element, and print an error message if
binary_search does not return the expected value:
num_tests_passed = 0
num_tests_failed = 0

def check(condition, msg):
    """Record test passed or failed if status True/False,
    and print msg if failed"""
    global num_tests_passed
    global num_tests_failed
    if condition:
        num_tests_passed += 1
    else:
        num_tests_failed += 1
        print "Test failed: %s" % msg

def check_item(x, item, expected):
    """Check that binary_search(x, item) == expected"""
    ind = binary_search(x, item)
    msg = "binary_search(x, %d) = %d, expected %d\n x = %s" \
        % (item, ind, expected, str(x))
    check(ind == expected, msg)
Automation of binary search test, version 3
Driver program:
for size in range(100):
    # test array of the form [0, 10, 20, 30, ...]
    x = [10*i for i in range(size)]
    for i in range(size):
        check_item(x, x[i], i)
        check_item(x, x[i] - 5, -1)
    check_item(x, 10*size, -1)
    check_item(x, 10*size - 5, -1)
    # test an array with equal elements: [10, 10, 10, ...]
    x = size*[10]
    if size == 0:
        check_item(x, 10, -1)
    else:
        ind = binary_search(x, 10)
        msg = "binary_search(x, 10) = %d, expected [0..%d)\n x = %s" \
            % (ind, size, str(x))
        check(ind in range(size), msg)
    check_item(x, 5, -1)
    check_item(x, 15, -1)
print "%d passed, %d failed" % (num_tests_passed, num_tests_failed)
It would be better to replace the global variables with attributes of a class.
Automation of binary search test, version 3
Sample output (buggy version):
Test failed: binary_search(x, 0) = -1, expected 0
 x = [0]
Test failed: binary_search(x, 10) = -1, expected [0..1)
 x = [10]
Test failed: binary_search(x, 10) = -1, expected 1
 x = [0, 10]
Test failed: binary_search(x, 0) = -1, expected 0
 x = [0, 10, 20]
Test failed: binary_search(x, 20) = -1, expected 2
 x = [0, 10, 20]
27 passed, 5 failed
Sample output (working version):
32 passed, 0 failed
Automation of binary search test, version 4 I
class Tester(object):
    _max_size = 100

    def __init__(self):
        self._num_passed = 0
        self._num_failed = 0

    def _check(self, condition, msg):
        """Record test passed or failed, and print msg if failed"""
        if condition:
            self._num_passed += 1
        else:
            self._num_failed += 1
            print "Test failed: %s" % msg

    def _check_item(self, x, item, expected):
        """Check that binary_search(x, item) == expected"""
        ind = binary_search(x, item)
        msg = "binary_search(x, %d) = %d, expected %d\n x = %s" \
            % (item, ind, expected, str(x))
        self._check(ind == expected, msg)

    def test_array(self):
        """Check all possible values in an array [0, 10, 20,...]"""
        for size in range(self._max_size):
            x = [10*i for i in range(size)]
            for i in range(size):
                self._check_item(x, x[i], i)
                self._check_item(x, x[i] - 5, -1)
Automation of binary search test, version 4 II
            self._check_item(x, 10*size, -1)
            self._check_item(x, 10*size - 5, -1)

    def test_equal_elems(self):
        """Check values in an array [10, 10, 10,...]"""
        for size in range(self._max_size):
            x = size*[10]
            if size == 0:
                self._check_item(x, 10, -1)
            else:
                ind = binary_search(x, 10)
                msg = "binary_search(x, 10) = %d, expected [0,%d)\n x = %s" \
                    % (ind, size, str(x))
                self._check(ind in range(size), msg)
            self._check_item(x, 5, -1)
            self._check_item(x, 15, -1)

    def print_status(self):
        print "%d passed, %d failed" % (self._num_passed, self._num_failed)

    def run_tests(self):
        self.test_array()
        self.test_equal_elems()
        self.print_status()
Driver program:
Tester().run_tests()
JUnit family of test frameworks
Surely someone else must have had the same ideas?
Java has a de facto standard framework for automating tests: JUnit
It has been copied in several other languages:
Google C++ testing framework:
https://github.com/google/googletest
python unittest (part of standard library)
Features
Choose different output formats for test results
Can write set-up and tear-down functions that are called before
and after each test
Object-oriented (use inheritance to avoid duplicate test code)
Fairly standardized
Automation of binary search test, version 5 I
import unittest

class TestBinarySearch(unittest.TestCase):
    def setUp(self):
        self.sizes = range(100)

    def check_item(self, x, item, expected):
        result = binary_search(x, item)
        msg = "binary_search(x, %d) = %d, expected %d\nx = %s" % \
            (item, result, expected, str(x))
        self.assertEqual(result, expected, msg)

    def test_all(self):
        """Search for all items in an array"""
        for size in self.sizes:
            x = [10*i for i in range(size)]
            for i in range(size):
                self.check_item(x, x[i], i)
                self.check_item(x, x[i] - 5, -1)
            self.check_item(x, 10*size, -1)
            self.check_item(x, 10*size - 5, -1)

    def test_equal_elems(self):
        """Search an array whose elements are all equal"""
        for size in self.sizes:
            x = size*[10]
            if size == 0:
                self.check_item(x, 10, -1)
            else:
Automation of binary search test, version 5 II
                result = binary_search(x, 10)
                msg = "binary_search(x, %d) = %d, expected [0,%d)\nx = %s" % \
                    (10, result, size, x)
                self.assert_(result in range(0, size), msg)
            self.check_item(x, 5, -1)
            self.check_item(x, 15, -1)
Driver program:
unittest.main()
Output (working version):
..
----------------------------------------------------------------------
Ran 2 tests in 0.029s

OK
Automation of binary search test, version 5
Output (buggy version):
FF
======================================================================
FAIL: test_all (__main__.TestBinarySearch)
Search for all items in an array
----------------------------------------------------------------------
Traceback (most recent call last):
  File "src/binsearch.py", line 238, in test_all
    self.check_item(x, x[i], i)
  File "src/binsearch.py", line 231, in check_item
    self.assertEqual(result, expected, msg)
AssertionError: binary_search(x, 0) = -1, expected 0
x = [0]
======================================================================
FAIL: test_equal_elems (__main__.TestBinarySearch)
Search an array whose elements are all equal
----------------------------------------------------------------------
Traceback (most recent call last):
  File "src/binsearch.py", line 253, in test_equal_elems
    self.assert_(result in range(0,size), msg)
AssertionError: binary_search(x, 10) = -1, expected [0,1)
x = [10]
----------------------------------------------------------------------
Ran 2 tests in 0.001s

FAILED (failures=2)
Skipping Tests
It’s often convenient to skip some tests; the best way to do this is
using one of unittest’s decorators:
@unittest.skip("demonstrating skipping")
def test_nothing(self):
    self.fail("shouldn't happen")
There are also @unittest.skipIf and @unittest.skipUnless;
they can also be used to skip entire classes of tests.
Another useful decorator is @unittest.expectedFailure; it’s
especially useful because it’s all too easy to keep skipping tests even
after you’ve made them pass.
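The decorators can be combined in one test case; a minimal sketch (the test methods here are hypothetical examples):

```python
import unittest

class DecoratorExamples(unittest.TestCase):
    @unittest.skip("demonstrating skipping")
    def test_nothing(self):
        self.fail("shouldn't happen")

    @unittest.skipUnless(hasattr(list, "sort"), "needs list.sort")
    def test_sorting(self):
        self.assertEqual(sorted([3, 1, 2]), [1, 2, 3])

    @unittest.expectedFailure
    def test_known_float_surprise(self):
        # exact float comparison fails, but this is recorded as an
        # "expected failure" rather than breaking the whole run
        self.assertEqual(0.1 + 0.2, 0.3)
```

A skipped test is reported but not run; an expected failure keeps the run green while reminding you the bug is still there.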
Python version of AWK regexp test
Recall the compact tests for regular expressions in AWK (previous slide)
Can do similar things with unittest in Python:
#!/usr/bin/env python
import re, unittest

class RegexpTestCase(unittest.TestCase):
    """A test fixture for regexp products"""
    def testRe(self):
        tests = {"^a.$" : (["ax", "aa"], ["xa", "aaa", "axy"]),
                 }
        for p in tests.keys():
            good, bad = tests[p]
            for s in good:
                self.assertTrue(re.search(p, s))
            for s in bad:
                self.assertFalse(re.search(p, s))

if __name__ == "__main__":
    unittest.main()
Unit tests v. Integration tests v. Regression tests
There are a number of types of testing that you need to worry
about:
Unit testing. Do your pieces of code (classes, functions) work
correctly?
Integration testing. Do the units work properly together?
Regression testing: Do things still work?
Acceptance tests: Will you be paid/graduate/get tenure?
Because unit tests are supposed to test code locally, you may want
to mock other parts of the system (e.g. if you’re testing an object
finder that works on smoothed images, you don’t want to test the I/O
and convolution parts of the code). unittest.mock makes this
(relatively) easy; it’s in the Python 3 standard library, and available
as the mock backport for Python 2.7.
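A minimal mocking sketch, loosely based on the object-finder example above (the `find_brightest` function and its `load_image` parameter are hypothetical): the mock stands in for the I/O step, so the unit test exercises only the finding logic.

```python
from unittest import mock   # Python 2.7: pip install mock, then "import mock"

# Hypothetical object finder: load_image would normally read an image
# from disk; here it is injected so a test can replace it.
def find_brightest(load_image, filename):
    image = load_image(filename)
    return max(range(len(image)), key=image.__getitem__)

# The Mock returns canned pixel data, so no file I/O happens,
# and it records how it was called for later verification.
fake_loader = mock.Mock(return_value=[1, 5, 3])
assert find_brightest(fake_loader, "ignored.fits") == 1
fake_loader.assert_called_once_with("ignored.fits")
```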
Issues with testing scientific codes
These are more about integration than unit tests.
Floating point doesn’t give exact answers
Use assertAlmostEqual() or equivalent (EXPECT_DOUBLE_EQ()
for the Google C++ testing framework)
Always compare against a known answer
Not always easy for scientific codes: even if we know an exact
solution for a particular problem, our numerical methods are
often only approximations. What should our error tolerance be?
If you know a numerical method is exact for a certain type of
input (for instance, an interpolation algorithm on a collinear set of
points), use that as a test problem.
Comparing against a previous, trusted version can be useful here
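The floating-point point above in a minimal sketch:

```python
import unittest

class TestFloats(unittest.TestCase):
    def test_sum(self):
        # exact comparison fails: 0.1 + 0.2 is 0.30000000000000004
        self.assertNotEqual(0.1 + 0.2, 0.3)
        # assertAlmostEqual rounds to 7 decimal places by default...
        self.assertAlmostEqual(0.1 + 0.2, 0.3)
        # ...or takes an explicit tolerance for looser comparisons
        self.assertAlmostEqual(2.0, 2.1, delta=0.2)
```

Choosing `delta` (or `places`) is exactly the error-tolerance question raised above; it should come from the numerical method, not from whatever makes the test pass.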
Continuous Integration (CI)
Rather than run your tests by hand, it’s convenient to use some
automated framework. Unit tests are easily run as part of your
regular build (e.g. via a make test target in a Makefile), but in
larger systems you probably want to run larger scale tests
automatically too.
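A `make test` target might look like this; a hypothetical sketch (the `tests` directory layout and the `build` recipe are assumptions, not from the slides):

```make
# Hypothetical Makefile fragment: "make test" runs the unit tests
# after the build step, so tests run as part of every build.
test: build
	python -m unittest discover -s tests

build:
	python -m compileall src
```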
Popular frameworks are buildbot, jenkins, and travis-CI.
LSST uses jenkins to periodically check the state of master (and
some of the repos use travis as a pre-commit check)
Summary
Test your code
Automate, automate, automate
Write programs that write programs: leverage
Design clean interfaces that make your code easy to test
Consider writing tests before you write your code
Whenever you find a bug, write a test that exposes it
Whenever you add a feature, write a test that exercises it
Look at available frameworks for organizing tests
Think about using a CI framework too