Sledgehammers vs Nut Crackers

2016-02-23, Comments


I get awk and can read awk programs but find the language of little use. Its focus is narrow and its syntax can be surprising. Python matches it at home and smashes it away. Nonetheless, a number of the advent of code puzzles fit the awk processing model — line based instructions, the interpretation of which builds state contributing to the final result — and I enjoyed writing awk solutions. There’s satisfaction in using a tool which is up to the job, no more and no less: in using a nutcracker, rather than a sledgehammer, to crack a nut.


For example, the puzzle set on day 6 consists of a list of instructions for switching and toggling a 1000 x 1000 grid of lights. The instructions read:

turn on 296,50 through 729,664
turn on 212,957 through 490,987
toggle 171,31 through 688,88
turn off 991,989 through 994,998

and the question is, after following these instructions, how many lights are lit?

Each instruction is a single line; the actions — turn on, turn off, toggle — can be matched by patterns; and to follow these actions requires no more than an array and arithmetic: awk fits nicely.



Here, the BEGIN action sets the field separator FS to the regular expression [ ,] which picks out the textual and numeric fields from each instruction. Awk is highly dynamic: use a variable as a number and it becomes a number, in this case the coordinates of a lighting grid; and similarly, the fields “on”, “off” and “toggle” are matched and treated as strings.

The grid of lights appears to be represented as a two dimensional array, accessed lights[x,y] rather than lights[x][y]. In fact, the array — like all arrays in awk — is an associative container, which maps from strings — not numbers — to values. The key x,y becomes a string which joins the values of x and y with a separator defaulted to "\034".


The escape character at the end of line 5 is needed to continue the long line. I’d prefer to use parentheses to wrap the expression over more than one line, as I would in Python, but this trick doesn’t seem to work. I was somewhat surprised there was no built in sum() function to count up the number of lights turned on by the END. It would have been cute to pass on(), off() and toggle() as functions into switch(), separating traversal from action, but I couldn’t find a way to do this in awk.

My awk script solved the puzzle in 45 seconds. A Python solution took 17 seconds. I didn’t try optimising either.

Don’t use a sledgehammer to crack a nut!

This advice, commonly given to programmers, demands explanation. If it’s intended to imply a sledgehammer is more likely to pulverise the nut than open it, then fine, that’s true — but the analogy fails in this case: a solution written in Python would have been equally correct.

Alternatively, if we mean you shouldn’t use a powerful language when a less powerful one would do, then the question becomes: why not? Python is a general purpose programming language. It can crack nuts, peel bananas, serve web pages and so much more. If you know Python why bother with Awk?

At the outset of this post I admitted I don’t generally bother with awk. Sometimes, though, I encounter the language and need to read and possibly adapt an existing script. So that’s one reason to bother. Another reason is that it’s elegant and compact. Studying its operation and motivation may help us compose and factor our own programs — programs far more substantial than the scripts presented here, and in which there will surely be places for mini-languages of our own.