Release then Test

2007-02-13 • C++, Python, Self

A comment on this blog by David P Thomas got me thinking about testing released software. Traditionally, software developers concentrate on testing software back in the lab, arguing that once it’s released into the big wide world it’s a little bit late to start finding problems. In this article, I beg to differ.

C++ Assertions

A classic example would be the use of assert in C++. The standard assert is a macro which the preprocessor expands in different ways depending on whether the code is compiled in “release” or “debug” mode. In debug mode an assertion which fails causes the program to abort. In release mode (i.e. when the compiler flag -DNDEBUG is set), the assertion is simply eliminated by the preprocessor and can therefore have no effect.
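To make the difference concrete, here's a minimal sketch. The function `store` and its parameters are my own invention for illustration; only `assert` and `NDEBUG` come from the standard library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: writes value into buf[index], checking the index first.
void store(int * buf, std::size_t size, std::size_t index, int value)
{
    // Debug build: a failed check prints the expression, file and line,
    // then aborts the program.
    // Release build (compiled with -DNDEBUG): this line expands to a
    // no-op, so an out-of-range index goes straight through to the write.
    assert(index < size);
    buf[index] = value;
}
```

Compiled both ways, the same source gives two different programs: one which stops at the bad index, and one which quietly writes past the end of the array.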

What use is this evanescent assert to the end user of our release? None whatsoever. What use is it to the software developer once the code has been compiled into a releasable binary? None whatsoever.

The whole business of maintaining two versions of the software (one for use by developers, another for users) is problematic, and especially so in the case of assertions. Let’s say, for example, the assertion checks that an array index is in range before we write to that array. During development the assertion never fires, and we may convince ourselves there’s no logical way it can fire and that therefore the check can safely be removed for production.[1] Under extreme circumstances things may not be quite so definite. If a resource runs out, if something gets corrupted, if an aberrant set of inputs causes an untested and unforeseen code path to be exercised; if any of these things happens then it may well be that the assertion would fire, causing the program to abort before any further damage could be done to the user’s data or indeed itself.

Such extreme behaviour is far from ideal for the end user, and you could argue our test regime failed us. Nonetheless, it seems silly to strip such potentially helpful code before we ship. If the product does fail in this way and ends up being returned, at least we should have diagnostic information about where things went wrong.
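One way to keep such a check in the shipped binary is an assertion macro which ignores NDEBUG. What follows is a sketch rather than a recommendation: the names `RELEASE_CHECK`, `check_failed` and `describe_failure` are hypothetical, and I'm assuming a C++11 compiler.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: builds the diagnostic message for a failed check.
inline std::string describe_failure(char const * expr, char const * file, int line)
{
    return std::string(file) + ":" + std::to_string(line)
        + ": check failed: " + expr;
}

// Hypothetical handler: reports where things went wrong, then stops the
// program before it can do any further damage.
[[noreturn]] inline void check_failed(char const * expr, char const * file, int line)
{
    std::fprintf(stderr, "%s\n", describe_failure(expr, file, line).c_str());
    std::abort();
}

// Unlike the standard assert, this check survives -DNDEBUG.
#define RELEASE_CHECK(cond) \
    ((cond) ? static_cast<void>(0) : check_failed(#cond, __FILE__, __LINE__))
```

A real product would probably write the message to a log file rather than stderr, so that the diagnostic information comes back with the returned unit.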

Here’s an interesting paper in which Miro Samek expands on this argument, likening assertions to fuses designed to protect the system.

Sanity Checks

Let’s suppose our product is a server-based piece of software. We assemble this server ourselves and during assembly we run a series of sanity checks to confirm that the hardware functions correctly. Once we’re satisfied everything is in working order, we remove all traces of these experiments, finish the installation, and ship the server.

Surely it would be better to leave the tests in place so they can be run when the server is unpacked at the other end of its journey? There may have been some damage in transit. There may well be damage at some later point when the unit is moved to a new server room. In either case, we can add value by including the test code with no extra cost to ourselves.

Regression Tests

Similarly, if our product is a desktop application, we probably have a set of system tests we run it through regularly to check there hasn’t been a regression somewhere. Why not include these tests as part of the standard product installation? It might help diagnose a problem if the product is installed on a platform which differs in some crucial way from the ones we used during development.

Unit Tests

If our product is a library delivered either as source code, or perhaps as a pre-compiled library, we also need to ship documentation showing how to use this code. One part of this documentation should be in the form of test code; typically a straightforward set of recipes demonstrating the correct use of the library. As well as documenting normal use, the tests will also stress the library and confirm it handles boundary conditions and malformed inputs. With a little extra effort, we can supply a wrapper program which runs through all these tests and summarises their results.
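A wrapper of this kind needn't be elaborate. Here's a rough sketch, assuming C++11 and a made-up library function, `clamp_to_byte`; the `TestCase`, `Summary`, `run_tests` and `library_tests` names are likewise hypothetical, and a real project would more likely lean on an existing test framework.

```cpp
#include <cstdio>
#include <functional>
#include <string>
#include <vector>

// Hypothetical library function under test.
inline int clamp_to_byte(int value)
{
    return value < 0 ? 0 : (value > 255 ? 255 : value);
}

struct TestCase
{
    std::string name;
    std::function<bool()> run;   // returns true if the test passes
};

struct Summary
{
    int passed = 0;
    int failed = 0;
};

// Runs every test, reports each result, and returns the totals.
inline Summary run_tests(std::vector<TestCase> const & tests)
{
    Summary summary;
    for (TestCase const & test : tests)
    {
        bool const ok = test.run();
        std::printf("%-30s %s\n", test.name.c_str(), ok ? "ok" : "FAILED");
        if (ok)
            ++summary.passed;
        else
            ++summary.failed;
    }
    std::printf("%d passed, %d failed\n", summary.passed, summary.failed);
    return summary;
}

// The tests double as usage recipes and as boundary checks.
inline std::vector<TestCase> library_tests()
{
    return {
        {"in-range value is unchanged", [] { return clamp_to_byte(100) == 100; }},
        {"negative value clamps to 0", [] { return clamp_to_byte(-1) == 0; }},
        {"large value clamps to 255", [] { return clamp_to_byte(256) == 255; }},
    };
}
```

The shipped wrapper's `main` would then call `run_tests(library_tests())` and return a non-zero exit status if anything failed.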

In this particular case, the benefits are obvious: the library and tests combine into a single package which serves both the original developers and the happy new owners equally well.

This is hardly an original idea but perhaps it’s one which the open source movement has helped promote. As an example: if I’m trying to figure out how to use a particular Python library, one of the first places I look is in the test directory. Here, for example, are the tests for the socket module.

If, for some reason, I suspect my Python installation is broken, I can test it as follows:

$ python Lib/test/regrtest.py

[1] (Note, though, that we shy away from removing the assertion itself from the code, arguing that “it helps document preconditions”, or “you never know if something in the calling code might change”.)

Word Aligned

sweating the small stuff