Tracing function calls using Python decorators

2008-03-10, , Comments

Introduction

Let’s suppose you want to trace calls to a function. That is, every time the function gets called, you want to print out its name and the values of any arguments passed by the caller. To provide a concrete example, here’s a function which calculates the dot product of two vectors.

def dot(v, w):
    """Calculate the dot product of two vectors.

    Example:
    >>> dot((1, 2, 3), (3, 2, 1))
    10
    """
    return sum(vv * ww for vv, ww in zip(v, w))

To trace calls to the function you could just edit it and insert a print statement.

def dot(v, w):
    print "dot(v=%r, w=%r)" % (v, w)
    return sum(vv * ww for vv, ww in zip(v, w))

When you no longer want calls traced you can remove the print statement or even comment it out. This approach works well enough for a while but you soon discover there are more functions you want to trace; and you’ll eventually end up with lots of functions being traced and lots of commented-out tracing code. You might even end up with broken commented-out code:

broken comment
def dot(vec1, vec2):
    # print "dot(v=%r, w=%r)" % (v, w)
    return sum(v1 * v2 for v1, v2 in zip(vec1, vec2))

At this point, you realise that calling a function and tracing these calls are orthogonal operations. Isn’t there a less invasive way to do this?

A less invasive way

Rather than change the original function you can simply wrap it with the code which prints out the inputs.

Echo a function, Take 1
def echo_both_args(fn):
    "Returns a traced version of the input 2-argument function."
    def wrapped(arg1, arg2):
        name = fn.__name__
        print "%s(%r, %r)" % (name, arg1, arg2)
        return fn(arg1, arg2)
    return wrapped

def dot(vec1, vec2):
    ....

dot = echo_both_args(dot)

The echo_both_args function accepts a single function as a parameter and returns a new function which wraps this original function by adding the desired trace functionality. By the way, I’ve introduced the term “echo” for this action rather than the more conventional “trace” since Python already has a trace module which does something rather different. The idea is that when you call a function, you hear your call repeated — like an echo.

C/C++ programmers will have noticed the inner function object, wrapped, which echo_both_args returns. Returning inner functions is a common Python idiom, and is the way language implements the closures found in other high-level languages such as Scheme or Ruby. The inner function doesn’t go out of scope — in Python, objects persist as long as needed.

The final line of the code snippet simply rebinds dot to the echoed version of itself. We don’t need to pass the name of the function to be traced (dot in this example) into echo_both_args since in Python a function object carries its name around with it in a __name__ attribute. If we now call dot in an interpreted session, we’ll see the call echoed:

>>> dot((1, 2, 3), (3, 2, 1))
dot((1, 2, 3), (3, 2, 1))
10

The inner function, wrapped, should be capable of wrapping any function — so long as that function accepts just two parameters, that is. (Note the use of the %r print formatter which converts any Python object into its canonical string representation). If we want a more general purpose echo which can wrap any function with any signature, we might try something like the following:

Echo a function, Take 2
def echo(fn):
    "Returns a traced version of the input function."
    from itertools import chain
    def wrapped(*v, **k):
        name = fn.__name__
        print "%s(%s)" % (
            name, ", ".join(map(repr, chain(v, k.values()))))
        return fn(*v, **k)
    return wrapped

Here, the inner function, wrapped, has parameters (*v, **k), which contain the positional and keyword arguments passed to this function respectively. It doesn’t really matter how many arguments the function being wrapped has, or indeed if this function itself accepts arbitrary positional and keyword arguments: it just works — have a look in the Python reference manual for details. We’ve use chain, one of the handy iterator tools from the itertools module, to iterate through all the positional and keyword argument values; then a combination of string.join and the map and repr builtins produce a string representation of these values.

We can now define, wrap and call some functions with more varied signatures.

def f(x): pass
def g(x, y): pass
def h(x=1, y=2): pass
def i(x, y, *v): pass
def j(x, y, *v, **k): pass

f, g, h, i, j = map(echo, (f, g, h, i, j))

f(10)
g("spam", 42)
g(y="spam", x=42)
h()
i("spam", 42, "extra", "args", 1, 2, 3)
j(("green", "eggs"), y="spam", z=42)

Running this code produces output:

f(10)
g('spam', 42)
g('spam', 42)
h()
i('spam', 42, 'extra', 'args', 1, 2, 3)
j(('green', 'eggs'), 'spam', 42)

This does something like what we want, but note a couple of problems:

  • the second call to g uses keyword argument syntax, and the output g('spam', 42) is exactly what we saw echoed by the previous call to g even though the arguments have been swapped around. We’d better try and echo argument names as well as their values (just as we did when we first modified dot to echo calls).
  • the output doesn’t show the (defaulted) arguments passed to h 1.

There’s another more subtle problem too.

>>> def f(x):
...     " I'm a function called f. "
...     pass
...
>>> f.__name__
'f'
>>> f.func_doc
" I'm a function called f. "
>>> f = echo(f)
>>> f.__name__
'wrapped'
>>> f.func_doc
None

That is, by wrapping f, we’ve tinkered with its identity: its name has changed and its documentation has disappeared. This is bad news in a dynamic language, since any code introspecting f will get the wrong idea about it. We should really have modified wrapper to copy various attributes from f.

Decorators

At Python 2.3, the @ character was pulled out of the bag to provide a special syntax for decorating functions in this way. Rather than retrospectively rebinding a function f as shown in the examples so far, we can explicitly decorate f up front with the echo wrapper like this:

using decorator syntax
>>> def echo(fn):
...     from itertools import chain
...     def wrapped(*v, **k):
...         name = fn.__name__
...         print "%s(%s)" % (name, ", ".join(map(repr, chain(v, k.values()))))
...         return fn(*v, **k)
...     return wrapped
...
>>> @echo
>>> def f(x): pass
...
>>> f('does it work?')
f('does it work?')

Of course decoration isn’t reserved for the particular use case shown in this article: we might decorate functions to time their execution, to protect them against simultaneous access from multiple threads, to check the validity of input arguments, to dynamically analyse code coverage, and so on.

Note also that there’s a special decorator in the functools module which does the job of making a wrapper look like its wrappee.

>>> import functools
>>> def echo(fn):
...     @functools.wraps(fn)
...     def wrapped(*v, **k):
...         ....
...    return wrapped
...
>>> @echo
>>> def f(x):
...    " I'm f, don't mess with me! "
...    pass
>>> f.__name__
'f'
>>> f.func_doc
" I'm f, don't mess with me! "
>>> f(('spam', 'spam', 'spam!'))
f(('spam', 'spam', 'spam!'))

Improving echo

Here’s a version of echo which fixes the remaining problems: that is, it prints out argument names as well as argument values, and it also prints out any arguments defaulted in the call to the wrapped function. This version is a little longer and considerably more fiddly 2, since we need to dig deeper into the function’s code, but it follows the same basic pattern as its predecessors.

While we’re improving things, note that we’ve also parameterised the write function, rather than naively hijacking sys.stdout to print out function calls.

Echo a function, Take 3
import sys

def format_arg_value(arg_val):
    """ Return a string representing a (name, value) pair.

    >>> format_arg_value(('x', (1, 2, 3)))
    'x=(1, 2, 3)'
    """
    arg, val = arg_val
    return "%s=%r" % (arg, val)

def echo(fn, write=sys.stdout.write):
    """ Echo calls to a function.

    Returns a decorated version of the input function which "echoes" calls
    made to it by writing out the function's name and the arguments it was
    called with.
    """
    import functools
    # Unpack function's arg count, arg names, arg defaults
    code = fn.func_code
    argcount = code.co_argcount
    argnames = code.co_varnames[:argcount]
    fn_defaults = fn.func_defaults or list()
    argdefs = dict(zip(argnames[-len(fn_defaults):], fn_defaults))

    @functools.wraps(fn)
    def wrapped(*v, **k):
        # Collect function arguments by chaining together positional,
        # defaulted, extra positional and keyword arguments.
        positional = map(format_arg_value, zip(argnames, v))
        defaulted = [format_arg_value((a, argdefs[a]))
                     for a in argnames[len(v):] if a not in k]
        nameless = map(repr, v[argcount:])
        keyword = map(format_arg_value, k.items())
        args = positional + defaulted + nameless + keyword
        write("%s(%s)\n" % (name(fn), ", ".join(args)))
        return fn(*v, **k)
    return wrapped

Let’s give it a try.

@echo
def f(x): pass
@echo
def g(x, y): pass
@echo
def h(x=1, y=2): pass
@echo
def i(x, y, *v): pass
@echo
def j(x, y, *v, **k): pass
class X(object):
    @echo
    def m(self, x): pass
    @classmethod
    @echo
    def cm(klass, x): pass

def reversed_write(s): sys.write(''.join(reversed(s)))
def k(**kw): pass
k = echo(k, write=reversed_write)

f(10)
g("spam", 42)
g(y="spam", x=42)
h()
i("spam", 42, "extra", "args", 1, 2, 3)
j(("green", "eggs"), y="spam", z=42)
X().m("method call")
X.cm("classmethod call")
k(born="Mon", christened="Tues", married="Weds")

Running the preceding code produces:

f(x=10)
g(x='spam', y=42)
g(y='spam', x=42)
h(x=1, y=2)
i(x='spam', y=42, 'extra', 'args', 1, 2, 3)
j(x=('green', 'eggs'), y='spam', z=42)
m(self=<__main__.X instance at 0x7837d8>, x='method call')
cm(klass=<class __main__.X at 0x785840>, x='classmethod call')
)'seuT'=denetsirhc ,'sdeW'=deirram ,'noM'=nrob(k

As you can see, it works on methods and classmethods too.

By the way, if you’re not familiar with classmethods, they’re created using Python’s built-in classmethod function, which transforms functions in the required way; and, as shown in the example above, the decorator syntax is the recommended way of applying this transformation.

Wider application

Using this final version of echo to decorate functions is simple enough. All you have to do is precede the function(s) you want to echo with the echo decorator. What, though, if these functions are in a module you don’t want to modify — one of the standard library modules, for example? What if you want to echo an entire class by echoing all of its methods without editing the class itself; or all the classes and functions in a module, again without editing the module itself?

A little introspection works the required magic. All we have to do is work our way through all the functions and classes in a module, and all the methods in these classes, rebinding these functions to their echoed versions. Extra attention is needed for class-, static- and private- methods, but otherwise this code contains few surprises.

echo, echo, echo!
import inspect
import sys

def name(item):
    " Return an item's name. "
    return item.__name__

def is_classmethod(instancemethod):
    " Determine if an instancemethod is a classmethod. "
    return instancemethod.im_self is not None

def is_class_private_name(name):
    " Determine if a name is a class private name. "
    # Exclude system defined names such as __init__, __add__ etc
    return name.startswith("__") and not name.endswith("__")

def method_name(method):
    """ Return a method's name.

    This function returns the name the method is accessed by from
    outside the class (i.e. it prefixes "private" methods appropriately).
    """
    mname = name(method)
    if is_class_private_name(mname):
        mname = "_%s%s" % (name(method.im_class), mname)
    return mname

def format_args(args):
    ...

def echo(fn, write=sys.stdout.write):
    ...

def echo_instancemethod(klass, method, write=sys.stdout.write):
    """ Change an instancemethod so that calls to it are echoed.

    Replacing a classmethod is a little more tricky.
    See: http://www.python.org/doc/current/ref/types.html
    """
    mname = method_name(method)
    never_echo = "__str__", "__repr__", # Avoid recursion printing method calls
    if mname in never_echo:
        pass
    elif is_classmethod(method):
        setattr(klass, mname, classmethod(echo(method.im_func, write)))
    else:
        setattr(klass, mname, echo(method, write))

def echo_class(klass, write=sys.stdout.write):
    """ Echo calls to class methods and static functions
    """
    for _, method in inspect.getmembers(klass, inspect.ismethod):
        echo_instancemethod(klass, method, write)
    for _, fn in inspect.getmembers(klass, inspect.isfunction):
        setattr(klass, name(fn), staticmethod(echo(fn, write)))

def echo_module(mod, write=sys.stdout.write):
    """ Echo calls to functions and methods in a module.
    """
    for fname, fn in inspect.getmembers(mod, inspect.isfunction):
        setattr(mod, fname, echo(fn, write))
    for _, klass in inspect.getmembers(mod, inspect.isclass):
        echo_class(klass, write)

Work in progress

I’ve used the echo library successfully on a number of occasions. Usually, I simply want to capture all function calls made to a module. Echoing the module generates a log file which I can then examine offline using the standard Unix shell tools.

I’ve resisted the urge to add options to fine-tune exactly which functions get echoed, or to control whether or not private methods get echoed — I prefer to generate a surplus of information then filter it as needed. (Though note that the code, as it stands, doesn’t echo nested functions and classes.)

I discovered that if a class customises the special __repr__ method, then trying to echo calls to this method leads to recursion and a runtime error. As a consequence, echo avoids echoing __repr__ and, for good measure, __str__ as well. I suspect that echo would lead to similar problems if applied to another introspective module. Echoing doctest wouldn’t be very clever, and echoing echo itself is surely doomed.

Source Code

The code for this article is available using anonymous SVN access at http://svn.wordaligned.org/svn/etc/echo (browse the code online).

Credits

My thanks to Dan Tallis, Kevlin Henney, Marius Gedminas and all at CVu for their help with this article.


@ end

Other languages also lay claim the to the term “decorator”, and indeed use of the @ symbol for this purpose. For example, a statically-typed language such as C++ can utilise the Decorator pattern to adapt objects at runtime: the classic example would be a window in a graphical user interface, which may be dynamically decorated with (combinations of) borders, scroll-bars, and so on.

Java makes special use of the @ symbol in its annotations, a language feature related to Python’s decorators. You’ll also find the @ symbol used in Java comments, where it augments the code without changing its runtime-behaviour, enabling the Javadoc tool to generate better documentation. I’ve seen this idea stretched further by the XDoclet family of tools which perform tag-based tricks on Javadoc-style comments in order to generate boiler-plate code — though in this case my response is closer to bewilderment than wonder.

Elevating the concept of decoration further brings us to aspect-oriented programming, a paradigm which provides language/framework support for the separation of concerns — and if that sounds like a collection of buzzwords, it’s because I don’t know anything about it!

For some readers, Python’s ability to tamper with classes and functions in ways unforeseen by their original implementer may seem dangerous, and I regard this a fair reaction. I’ve heard it argued that languages like Java and C++ are better suited for large projects, primarily because such languages simply don’t allow you to write code such as change_everything(world) — which is really what echo_module(module) does.

Meta-programming techniques like these need taste and restraint. As we’ve shown, in Python, the @ symbol is pure syntactic sugar; sweet as it may seem, we should use it to improve, not impair.


1 Strictly speaking, the default arguments aren’t “passed” to the function; they’re stored by the function object when the function code is parsed.

2 I suspect that inside this rather contorted function a simpler version is begging to get out. I’d be pleased to accept any suggestions.