Find the average of a collection of tuples or dicts using Python
You’ve been running some tests, each of which returns a 3-tuple of numerical results — (real, user, sys)
times, maybe — and you’d like to combine these into a single 3-tuple, the average result.
Easy!
def average(times):
    N = float(len(times))
    return (sum(t[0] for t in times)/N,
            sum(t[1] for t in times)/N,
            sum(t[2] for t in times)/N)
If you want a more generic solution, one which works when the tuples might have any number of elements, you could do this:
def average(xs):
    N = float(len(xs))
    R = len(xs[0])
    return tuple(sum(x[i] for x in xs)/N for i in range(R))
or this:
def average(xs):
    N = float(len(xs))
    return tuple(sum(col)/N for col in zip(*xs))
The second generic variant uses zip to transpose its inputs.
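To see what the transpose does, here is a minimal sketch. The sample timings and the name `runs` are made up for illustration and are not from a real test run. `zip(*xs)` regroups the per-run tuples into one tuple per column, and each column is then averaged.

```python
def average(xs):
    """Average a non-empty sequence of equal-length tuples, column by column."""
    n = float(len(xs))
    # zip(*xs) transposes rows into columns: one tuple per position.
    return tuple(sum(col) / n for col in zip(*xs))


# Hypothetical (real, user, sys) timings, chosen so the sums are exact.
runs = [(1.0, 2.0, 3.0), (3.0, 4.0, 5.0)]

columns = list(zip(*runs))
print(columns)        # [(1.0, 3.0), (2.0, 4.0), (3.0, 5.0)]
print(average(runs))  # (2.0, 3.0, 4.0)
```

One thing to keep in mind: zip stops at the shortest input, so tuples of unequal length are silently truncated to the shortest one rather than raising an error.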
Now suppose we have keyed collections of results which we want to average:
>>> times = [{'real': 34.4, 'user': 26.2, 'sys': 7.3},
...          {'real': 28.7, 'user': 21.5, 'sys': 6.4},
...          {'real': 29.3, 'user': 22.0, 'sys': 6.9}]
If, as in the example above, each result has the same set of keys, the average result could be calculated like this:
>>> N = float(len(times))
>>> { k : sum(t[k] for t in times)/N for k in times[0] }
{'real': 30.8, 'sys': 6.9, 'user': 23.2}
(The values shown are rounded to one decimal place.)
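The same idea can be wrapped in a function that checks the "same set of keys" assumption up front. This is only a sketch: the name `average_dicts`, the choice to raise `ValueError`, and the sample values are all assumptions made for illustration.

```python
def average_dicts(dicts):
    """Average a non-empty list of dicts that all share the same keys."""
    keys = dicts[0].keys()
    # Fail loudly rather than raise a KeyError halfway through, or
    # silently ignore keys that only appear in later dicts.
    if any(d.keys() != keys for d in dicts):
        raise ValueError("all dicts must have the same keys")
    n = float(len(dicts))
    return {k: sum(d[k] for d in dicts) / n for k in keys}


# Hypothetical sample values, chosen so the averages are exact.
samples = [{'real': 1.0, 'user': 2.0}, {'real': 3.0, 'user': 6.0}]
print(average_dicts(samples))  # {'real': 2.0, 'user': 4.0}
```

Without the check, a key missing from a later dict raises KeyError, while a key missing from the first dict is quietly left out of the result.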
What if the inputs don’t have the same keys? Consider the contents of four fridges.
>>> fridges = [
...     { 'egg': 5, 'milk': 1.700, 'sausage': 6 },
...     { 'beer': 6, 'milk': 0.568, 'egg': 1 },
...     { 'egg': 3, 'sausage': 4, 'milk': 0.125, 'lettuce': 1 },
...     { 'carrot': 4 }]
A Counter can collect and calculate the average fridge contents.
>>> from collections import Counter
>>> total = sum(map(Counter, fridges), Counter())
>>> N = float(len(fridges))
>>> { k: v/N for k, v in total.items() }
{'sausage': 2.5, 'lettuce': 0.25, 'beer': 1.5, 'carrot': 1.0, 'egg': 2.25, 'milk': 0.59825}
Note that although Counters were primarily designed to work with positive integers to represent counts, there’s nothing stopping us from using floating point numbers (amount of milk in our example) in the values field.
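One caveat: adding Counters with + keeps only entries whose resulting value is positive, so the approach above suits quantities that never go negative. If the values can be zero or negative, a plain defaultdict accumulates them without discarding anything. The sketch below assumes a missing key counts as zero; the name `average_sparse` and the sample data are hypothetical.

```python
from collections import defaultdict


def average_sparse(dicts):
    """Average dicts with differing keys; a missing key counts as zero."""
    totals = defaultdict(float)
    for d in dicts:
        for key, value in d.items():
            totals[key] += value
    n = float(len(dicts))
    return {key: total / n for key, total in totals.items()}


# Hypothetical balances: 'b' sums to -2.0, which Counter addition would drop.
balances = [{'a': 1.0, 'b': 1.0}, {'b': -3.0}]
print(average_sparse(balances))  # {'a': 0.5, 'b': -1.0}
```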