Syntactic Sugar

2008-09-04

The SGI STL documentation notes that std::map::operator[] is redundant:

Strictly speaking, this member function is unnecessary: it exists only for convenience.

This sentence pretty much nails what’s meant by “syntactic sugar”, defined more formally in Wikipedia as:

… a term coined by Peter J. Landin for additions to the syntax of a computer language that do not affect its functionality but make it “sweeter” for humans to use.

Cancer of the semicolon

For example, Python’s @decorator syntax sweetens the practice of wrapping functions. In C++, operator overloading adds nothing that couldn’t be achieved using standard function call syntax, but it opens the door to some inventive and expressive techniques. Haskell has a nice syntax for custom infix operators. And so on.
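To make the C++ case concrete, here is a minimal sketch. Vec2 and sugar_matches_call are names I've made up for illustration; the only point is that a + b and operator+(a, b) are two spellings of the same function call.

```cpp
#include <cassert>

// Hypothetical two-component vector, used only for illustration.
struct Vec2
{
    int x;
    int y;
};

// An overloaded operator is an ordinary function with an unusual name.
Vec2 operator+(Vec2 const & a, Vec2 const & b)
{
    Vec2 sum;
    sum.x = a.x + b.x;
    sum.y = a.y + b.y;
    return sum;
}

// Returns true if the infix form and the explicit call agree.
bool sugar_matches_call()
{
    Vec2 a = {1, 2};
    Vec2 b = {3, 4};
    Vec2 const sweet = a + b;            // infix syntax
    Vec2 const plain = operator+(a, b);  // the same call, spelled out
    return sweet.x == plain.x && sweet.y == plain.y
        && sweet.x == 4 && sweet.y == 6;
}
```

A caller could check this with assert(sugar_matches_call()).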

Lisp values its spare syntax and certainly won’t allow infix operators, but even a minimalist dialect like Scheme allows lists to be represented like this (a b c d e) rather than (a . (b . (c . (d . (e . ()))))).

Perl comes laced and frosted with syntactic sugar. Larry Wall explains why he doesn’t heed Alan Perlis’s famous warning about cancer of the semicolon.

To me, one of the most agonizing aspects of language design is coming up with a useful system of operators. To other language designers, this may seem like a silly thing to agonize over. After all, you can view all operators as mere syntactic sugar — operators are just funny looking function calls. Some languages make a feature of leveling all function calls into one syntax. As a result, the so-called functional languages tend to wear out your parenthesis keys, while OO languages tend to wear out your dot key. — Larry Wall, Apocalypse 3, Operators

(Perl, of course, tends to wear out your shift key …)

There’s not much in C++ you couldn’t manage using C, given enough effort. I both use and bitch about C++ and sometimes it’s good to get back to the plain and wholesome taste of C. Not for long though: I start to miss the standard containers and algorithms, operator overloading, exceptions etc.

I mention this because I’ve been thinking about what makes good software documentation. If I’m working with the C standard library, I tend to refer to the man pages for help, and very fine they are too. Apparently man pages for the C++ standard library exist but they don’t come as standard on my platform and I haven’t installed them. Instead I refer to the SGI STL Programmer’s Guide, which I’ve downloaded locally for instant and offline access. Now, it’s clear this guide hasn’t been actively maintained (the “What’s New” page lists nothing more recent than June 2000): it includes non-standard extensions and in some places I’ve found it’s no longer correct[1]. Nonetheless, its clear exposition and clean layout make it my first point of reference.

C++’s standard map container bears a superficial resemblance to Python’s dict, associating keys with values (under the surface, a map is a balanced tree and a dict is a hash table[2]). As in Python, the [] operator syntax can be used for container access. Unlike in Python, accessing the value associated with a key not present in the map succeeds: the key gets added and the value is default constructed[3] (zero, for a built-in type such as int). Here’s how the SGI guide documents this overloading of operator[] (emphasis mine).
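A minimal sketch of that insert-on-lookup behaviour follows. The helper names lookup_inserts and contains are mine, not the standard library's. The sketch contrasts operator[], which may grow the map, with find(), which never does.

```cpp
#include <cassert>
#include <map>
#include <string>

// Illustrative alias, not part of any library.
typedef std::map<std::string, int> Counts;

// Returns true if merely reading m[key] grew the map by one entry
// and the freshly created value is int(), which is zero.
bool lookup_inserts(Counts & m, std::string const & key)
{
    Counts::size_type const before = m.size();
    int const value = m[key];  // inserts (key, 0) if key is absent
    return m.size() == before + 1 && value == 0;
}

// A read-only lookup: find() works on a const map and never inserts.
bool contains(Counts const & m, std::string const & key)
{
    return m.find(key) != m.end();
}
```

Calling lookup_inserts twice with the same key should return true the first time and false the second, since by then the key is already present.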

Since operator[] might insert a new element into the map, it can’t possibly be a const member function. Note that the definition of operator[] is extremely simple: m[k] is equivalent to (*((m.insert(value_type(k, data_type()))).first)).second. Strictly speaking, this member function is unnecessary: it exists only for convenience.

Should reference documentation entertain? This paragraph may not have had me rolling on the floor with laughter, but I did snort in my coffee.

Here’s some compact C++ code designed to showcase the convenience of the operator overload, in this case calculating word frequencies in an istream:

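#include <istream>
#include <map>
#include <string>

typedef std::string Word;
typedef std::map<Word, int> WordCounts;

// Reads whitespace-separated words from text, adding to the tally
// held in word_counts.
void count_words(std::istream & text, WordCounts & word_counts)
{
    Word word;
    while (text >> word)
    {
        // A word seen for the first time is inserted with a count of 0.
        ++word_counts[word];
    }
}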

Using the supplied definition of operator[], the loop reads:

    while (text >> word)
    {
        ++(*((word_counts.insert
              (WordCounts::value_type(word, WordCounts::mapped_type()))
              .first))).second;
    }

Now, I realise you have to be clever to use C++, and I’m sure there’s a better way of laying out this expression, but can anyone really find it extremely simple?
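For what it's worth, here is one possible layout, a sketch rather than a recommendation. It splits the longhand expression into named steps. The names count_words_longhand and count_words_in are my own inventions, and count_words_in exists only so the function can be tried on a plain string.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <utility>

typedef std::string Word;
typedef std::map<Word, int> WordCounts;

// Same effect as ++word_counts[word], with the insert spelled out.
void count_words_longhand(std::istream & text, WordCounts & word_counts)
{
    Word word;
    while (text >> word)
    {
        // insert() leaves an existing entry alone; either way it
        // returns an iterator to the entry for this word.
        std::pair<WordCounts::iterator, bool> const result =
            word_counts.insert(
                WordCounts::value_type(word, WordCounts::mapped_type()));
        WordCounts::iterator const entry = result.first;
        ++entry->second;
    }
}

// Convenience wrapper (hypothetical) for trying the function on a string.
WordCounts count_words_in(std::string const & s)
{
    std::istringstream text(s);
    WordCounts word_counts;
    count_words_longhand(text, word_counts);
    return word_counts;
}
```

Three extra lines and two extra names per increment: clearer than the one-liner, perhaps, but hardly sweeter than the subscript.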

Syntactic sugar. Pour it on!


[1] For example, in the quoted documentation for operator[], data_type() should now read mapped_type().

[2] TR1 at last brings a standard hash table to C++, which goes under the unlikely name of std::tr1::unordered_map.

[3] Python provides collections.defaultdict if you want this behaviour.