Python’s lesser known loop control

2013-01-14

I’ll break out of a loop if I have to, but I generally prefer to recast code so no break is needed. It’s not about avoiding the keyword, but rather that the loop control expression should tell readers when and why the loop exits.

In C and C++ such recasting is rarely a problem. Python separates statements and expressions, which makes things more difficult. You can’t put an assignment statement in a loop control expression, for example. Consider a function which processes a file one chunk at a time, until the file is exhausted.

while True:
    data = fp.read(4096)
    if not data:
        break
    ...

The control expression, while True, suggests an infinite loop. That isn’t what actually happens, so readers must poke around in the loop body to find the real termination condition.

As already mentioned, an assignment statement isn’t an expression, so we can’t write this:

# Syntax error!
while data = fp.read(4096):
    ...

You could implement a file reader generator function which yields chunks of data, allowing clients to write:

for data in chunked_file_reader(fp):
    ...

This at least localises the problem to chunked_file_reader().
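The post doesn’t spell out chunked_file_reader(), so here is a minimal sketch of how such a generator might look. The chunk_size parameter and its default of 4096 are assumptions for illustration, and an in-memory io.BytesIO stands in for a real file.

```python
import io


def chunked_file_reader(fp, chunk_size=4096):
    """Yield successive chunks from fp until read() returns an empty result."""
    while True:
        data = fp.read(chunk_size)
        if not data:
            # An empty str or bytes object signals end of file.
            return
        yield data


# Usage: an in-memory file stands in for a real one (assumption for the demo).
fp = io.BytesIO(b"x" * 10000)
sizes = [len(data) for data in chunked_file_reader(fp)]
print(sizes)
```

The while True and break-style exit are still there, but they now live in one small function rather than in every loop that reads a file.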

Another solution is to use the two-argument flavour of iter, iter(object, sentinel). Here, object is a callable and sentinel is a terminal value. iter calls object with no arguments each time round the loop and stops when the returned value equals sentinel. To fit that shape, use functools.partial to fix the chunk size passed to file.read; the sentinel is the empty string, which read returns at end of file.

import functools

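# Bind the chunk size so the reader can be called with no arguments.
chunked_file_reader = functools.partial(fp.read, 4096)

# '' matches a text-mode file; use b'' if fp was opened in binary mode.
for data in iter(chunked_file_reader, ''):
    ...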
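To see the sentinel form run end to end, here is a small self-contained sketch. It assumes an in-memory binary file (io.BytesIO) and a tiny chunk size of 4 so the result is easy to read; the name read_chunk is made up for the demo. Because the file yields bytes, the sentinel has to be b'' rather than '', otherwise the loop would never see a matching value and would not stop.

```python
import functools
import io

# In-memory binary file standing in for a real one (assumption for the demo).
fp = io.BytesIO(b"abcdefghij")

# read_chunk() takes no arguments and returns up to 4 bytes per call.
read_chunk = functools.partial(fp.read, 4)

# The sentinel is b"" because fp yields bytes; a text-mode file would need "".
chunks = list(iter(read_chunk, b""))
print(chunks)
```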