Reflection and Introspection

Validating Python Documentation

Take a look at the following Python function which on my machine lives in <PYTHON_ROOT>/Lib/pickle.py:


def encode_long(x):
   r"""Encode a long to a two's complement little-endian binary string.
   Note that 0L is a special case, returning an empty string, to save a
   byte in the LONG1 pickling context.
   >>> encode_long(0L)
   ''
   >>> encode_long(255L)
   '\xff\x00'
   >>> encode_long(32767L)
   '\xff\x7f'
   >>> encode_long(-256L)
   '\x00\xff'
   >>> encode_long(-32768L)
   '\x00\x80'
   >>> encode_long(-128L)
   '\x80'
   >>> encode_long(127L)
   '\x7f'
   >>>
   """
   ....

The triple-quoted string which follows the function declaration is the function's docstring (and the r which prefixes the string makes this a raw string, ensuring that the backslashes which follow are not treated as escape characters). This particular docstring provides a concise description of what the function does, fleshed out with some examples of the function in action. These examples exercise special cases and boundary cases, much as a unit test might.
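A small sketch may make the point about raw strings concrete. The two functions below are made up for illustration (they are not part of pickle), and the code assumes a current Python 3 interpreter rather than the Python 2 of the listing above. It shows that a docstring can be read back through the function's __doc__ attribute, and that only the r prefix keeps \xff as four literal characters.

```python
def raw():
    r"""Example output: '\xff'"""


def cooked():
    """Example output: '\xff'"""


# The docstring is an ordinary attribute of the function object.
print(raw.__doc__)

# In the raw docstring \xff stays as four characters (backslash, x, f, f);
# in the ordinary one it collapses to a single character.
print(len(raw.__doc__) - len(cooked.__doc__))  # 3
```

Without the r prefix, the expected output recorded in the docstring would no longer match the text the interpreter prints for the result.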

Python's doctest module enables a user to test that these examples work correctly. Here's how to doctest pickle in an interactive Python session:


>>> import pickle
>>> import doctest
>>> doctest.testmod(pickle)
(0, 14)

The test result, (0, 14), indicates 14 tests have run with 0 failures. For more details try doctest.testmod(pickle, verbose=True). In case anyone is confused, 7 of the tests apply to encode_long – and unsurprisingly the other 7 apply to decode_long.
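For readers on a current Python 3 interpreter, the same check can be run from a script. This is a sketch under that assumption: there the return value is a named tuple with failed and attempted fields rather than a plain pair, and the number of examples attempted may differ from the 14 reported above.

```python
import doctest
import pickle

# verbose=False keeps the run quiet unless an example fails.
results = doctest.testmod(pickle, verbose=False)

print("failed:", results.failed)
print("attempted:", results.attempted)
```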

Incidentally, if pickle.py is executed (rather than imported as a library) it runs these tests directly.

The doctest module is a metaprogram – an example of Python being used to both read and execute Python. To see how it works I suggest taking a look at its implementation. The code runs to about 1500 lines of which the majority are documentation and many of the rest are to do with providing flexibility for more advanced use.

In essence, docstrings are not comments: they are formal object attributes. Python allows you to list and categorise objects at runtime, so we can collect up the docstrings for functions, classes, class methods and for the module itself. Once we have all these docstrings we can search them, using Python's text parsing capabilities, for anything which looks like the output of an interactive session. The remaining twist is Python's ability to dynamically compile and execute source code using the built-in compile function and the exec statement. So, we can replay the documentation examples, capturing and checking the output.
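The steps just described can be sketched in a few dozen lines. This is a toy, not how doctest itself is written: it assumes Python 3, the names find_examples, check and double are invented for the illustration, and it ignores continuation lines, exceptions, methods and everything else that makes the real module robust.

```python
import contextlib
import inspect
import io


def find_examples(docstring):
    """Return [source, expected_lines] pairs found in a docstring."""
    examples = []
    current = None
    for line in inspect.cleandoc(docstring).splitlines():
        if line.startswith(">>>"):
            source = line[3:].strip()
            current = [source, []] if source else None
            if current is not None:
                examples.append(current)
        elif not line.strip():
            current = None  # a blank line ends the expected output
        elif current is not None:
            current[1].append(line)
    return examples


def check(namespace):
    """Replay every example in the namespace; return (failures, attempted)."""
    failures = attempted = 0
    for name, obj in namespace.items():
        # Introspection: keep only functions and classes with a docstring.
        if not (inspect.isfunction(obj) or inspect.isclass(obj)) or not obj.__doc__:
            continue
        globs = dict(namespace)  # examples in one docstring share state
        for source, want in find_examples(obj.__doc__):
            attempted += 1
            # "single" mode echoes expression values, like the interactive prompt.
            code = compile(source + "\n", "<docstring of %s>" % name, "single")
            captured = io.StringIO()
            with contextlib.redirect_stdout(captured):
                exec(code, globs)
            if captured.getvalue().splitlines() != want:
                failures += 1
    return failures, attempted


def double(x):
    """Return twice x.

    >>> double(2)
    4
    >>> double("ab")
    'abab'
    """
    return 2 * x


print(check({"double": double}))  # (0, 2)
```

Compiling in "single" mode is what lets the sketch capture the value of a bare expression; with the more usual "exec" mode the result of double(2) would be silently discarded.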

The doctest module provides no more than an introduction to metaprogramming in Python. Given a Python object it is possible to get at the object's class, which is itself an object which can be dynamically queried and even modified at run-time. This isn't the sort of trick which is often required: I haven't tried it myself so I'd better keep quiet and refer you to the experts. See for example van Rossum or Raymond.
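For the curious, the basic mechanism can be shown in a few lines, though the references above remain the place to look for when and why to use it. This is a minimal sketch assuming Python 3; the class Greeter and the function shout are invented for the illustration.

```python
class Greeter:
    def greet(self):
        return "hello"


g = Greeter()

# The object's class is itself an object which can be queried.
cls = type(g)
print(cls.__name__)  # Greeter
print([name for name in vars(cls) if not name.startswith("_")])  # ['greet']


def shout(self):
    return self.greet().upper()


# The class can also be modified at runtime; existing instances
# pick up the new method immediately.
cls.shout = shout
print(g.shout())  # HELLO
```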

Copyright © 2005 Thomas Guest
