Maximum of an empty sequence?

2009-03-03, Comments

So it happened: as of Python 3.0, reduce() is no longer a built-in function. In the “What’s New?” Guido van Rossum can’t resist firing a parting shot.

Removed reduce(). Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable.

Take that!

As I’ve noted before, reduce can be side-lined in this way without causing pain because other built-ins cover the common reductions:

  • sum
  • all, any
  • max, min

and the join method concatenates built-in string and bytes types.

These standard functions are flexible enough to work on any iterable, be it an in-memory sequence like a list, or a stream generated one element at a time. Beware the boundary case! What if an iterable generates no elements? We can’t determine its length up front, and we don’t want to pull it all into memory at once just to find out if it’s empty.

Let’s fire up a Python interpreter to experiment. Happily lambda survived the version 3.0 transition, and we can use it to build a mini factory function for empty streams.

>>> zs = lambda: iter(set())
>>> zs()
<set_iterator object at 0x6f49f8>

(Incidentally, have you discovered Python 3.0’s new set literal syntax? For example, {True, False} is the set of boolean values. Sadly {} creates an empty dict, just like it always did, and not an empty set, ∅.)

Can we reduce these empty iterables? No problem!

>>> all(zs())
>>> any(zs())
>>> '!?'.join(zs())
>>> sum(zs())

Hang on though!

>>> max(zs())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: max() arg is an empty sequence

So what exactly did we expect the maximum value of an empty sequence to be? The only plausible answer is to bounce the question back to clients and allow them to supply a default. Since the built-in max function doesn’t allow this, we’d need to write something like:

>>> def maximum(iterable, default):
...     '''Like max(), but returns a default value if xs is empty.'''
...     try:
...         return max(iterable)
...     except ValueError:
...         return default
>>> maximum(zs(), -1)

Alternatively, the recently demoted reduce does admit an initial value.

>>> from functools import reduce, partial
>>> maximum = partial(reduce, max)
>>> maximum(range(42))
>>> maximum(zs(), -1)

I guess I should point out that the final version of maximum() repeatedly calls the two argument flavour of max(), and may prove suboptimal for large sequences.

This may all seem trivial, but it’s an issue I really did encounter recently — stay tuned for details. I’m not convinced Python gets things right, so I had a quick look at the support built into other languages. Some avoid the problem, only offering a two argument version of max(). Algorithms in the standard C++ library typically deal with half-open iterator ranges, and the range end forms a natural sentinel which std::max_element() can return given an empty range. Perl also returns a sentinel value if max is called on an empty list.