Brackets Off!

2004-05-06

The mathematical formula:

v = u + at

calculates the speed, v, of an object, with initial speed u and constant acceleration a, after time t. Placing the a next to the t is a convenient shorthand for “multiply a by t”, which also makes it apparent that the multiplication must be done before the addition.

When the same formula is written in C, the multiplication operator needs explicit representation:

v = u + a * t;

The layout of this expression no longer makes it clear that the multiplication should be done before the addition, so a programmer might choose to parenthesise:

v = u + (a * t)

Are these parentheses required to guarantee correct evaluation of v? If not, should they be included anyway, to help convey the meaning of the expression? How can coding standards help with such choices?

This article aims to answer these questions. It first presents some examples of the operator precedence and associativity rules in action, then offers some guidelines on when to parenthesise expressions, and finally argues that these guidelines should be replaced by a single rule.

Examples

v = u + a * t,
x = 8 - 4 - 2,
r = h << 4 + 1,
str += ((errors == 0) ? "succeeded" : "failed"),
*utf++ = 0x80 | ucs >> 6 & 0x6f;

Our first example, v = u + a * t, contains three operators: assignment, addition and multiplication. These operators, indeed all operators, follow a strict precedence which defines how the expression is grouped for evaluation. Since multiplication has higher precedence than addition, which in turn has higher precedence than assignment, the expression is equivalent to:

v = (u + (a * t))

This means the compiler can be trusted with the expression as first presented. No parentheses are required. Good, the language does what we expect.

In the second example, subtraction binds more tightly than (i.e. has higher precedence than) assignment, so the subtractions are performed first. Since all the arithmetic operators associate left to right, the expression is equivalent to:

x = ((8 - 4) - 2)
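The first two examples can be checked directly. What follows is only a minimal sketch: the function names (final_speed, subtract_chain, check_arithmetic_grouping) and the sample values are my own assumptions, chosen so that the two possible groupings give different answers.

```c
#include <assert.h>

/* Hypothetical helper: speed after time t, from initial speed u and
   constant acceleration a. No parentheses are used. */
double final_speed(double u, double a, double t)
{
    return u + a * t;               /* groups as u + (a * t) */
}

/* Hypothetical helper: the second example, written without parentheses. */
int subtract_chain(void)
{
    return 8 - 4 - 2;               /* groups as (8 - 4) - 2 */
}

/* Each assertion compares the bare expression with an explicit grouping. */
void check_arithmetic_grouping(void)
{
    assert(final_speed(2.0, 3.0, 4.0) == 2.0 + (3.0 * 4.0));
    assert(final_speed(2.0, 3.0, 4.0) != (2.0 + 3.0) * 4.0);
    assert(subtract_chain() == (8 - 4) - 2);
    assert(subtract_chain() != 8 - (4 - 2));
}
```

Calling check_arithmetic_grouping() from main should pass silently, provided assertions are enabled (that is, NDEBUG is not defined).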

In r = h << 4 + 1 the arithmetic addition operator binds more tightly than the shift operator, so the expression is equivalent to:

r = (h << (4 + 1))

Why did the programmer not write r = h << 5? Probably because the intention was:

r = (h << 4) + 1

but bit shifting (like, say, finding the address of something, or subscripting an array) somehow seems closer to the machine and feels as if it ought to be of higher precedence than addition, so the crucial parentheses were missed [1].
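The difference between the two readings is easy to demonstrate. In this sketch the function names and the test value 3 are assumptions of mine, and the operand is taken to be unsigned to keep the shifts well defined. Some compilers will warn about the unparenthesised version, which is rather the point.

```c
#include <assert.h>

/* The expression as written: addition binds more tightly than shift. */
unsigned shift_as_written(unsigned h)
{
    return h << 4 + 1;              /* groups as h << (4 + 1) */
}

/* The expression as (probably) intended. */
unsigned shift_as_intended(unsigned h)
{
    return (h << 4) + 1;
}

void check_shift_grouping(void)
{
    assert(shift_as_written(3u) == 3u << 5);            /* 96 */
    assert(shift_as_intended(3u) == 49u);               /* 48 + 1 */
    assert(shift_as_written(3u) != shift_as_intended(3u));
}
```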

The parentheses are unnecessary in our fourth example. We could rewrite:

str += ((errors == 0) ? "succeeded" : "failed")

as:

str += errors == 0 ? "succeeded" : "failed"

since the comparison operators bind more tightly than the conditional operator, which in turn binds more tightly than the assignment operators. Do the parentheses help you understand the meaning of this expression? Would you have left them out — and if so, would one of your team-mates have complained?
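To see that the two forms agree, here is a small C sketch. The str += in the example presumably appends to a C++ string, so as an assumption I have swapped the compound assignment for a return statement. The function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Without parentheses: == binds more tightly than ?: */
const char *status_bare(int errors)
{
    return errors == 0 ? "succeeded" : "failed";
}

/* Fully parenthesised, as in the fourth example. */
const char *status_parenthesised(int errors)
{
    return ((errors == 0) ? "succeeded" : "failed");
}

void check_conditional_grouping(void)
{
    assert(strcmp(status_bare(0), status_parenthesised(0)) == 0);
    assert(strcmp(status_bare(3), status_parenthesised(3)) == 0);
    assert(strcmp(status_bare(0), "succeeded") == 0);
    assert(strcmp(status_bare(3), "failed") == 0);
}
```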

How should the fifth example be parenthesised, to make its meaning clear? It is equivalent to:

(*(utf++)) = (0x80 | ((ucs >> 6) & 0x6f))

which shows how complicated an expression looks when parentheses are added indiscriminately.
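Again, the equivalence can be tested rather than taken on trust. This sketch wraps each form in a function; the names, the unsigned types and the sample value 0x20AC are my assumptions, while the constants 0x80 and 0x6f are copied unchanged from the example.

```c
#include <assert.h>

/* The fifth example as written. Returns the advanced pointer. */
unsigned char *put_bare(unsigned char *utf, unsigned ucs)
{
    *utf++ = 0x80 | ucs >> 6 & 0x6f;
    return utf;
}

/* The same statement with every grouping made explicit. */
unsigned char *put_parenthesised(unsigned char *utf, unsigned ucs)
{
    (*(utf++)) = (0x80 | ((ucs >> 6) & 0x6f));
    return utf;
}

void check_fifth_example(void)
{
    unsigned char a[1] = { 0 };
    unsigned char b[1] = { 0 };

    /* Both forms store one byte and advance the pointer by one. */
    assert(put_bare(a, 0x20ACu) == a + 1);
    assert(put_parenthesised(b, 0x20ACu) == b + 1);
    assert(a[0] == b[0]);
}
```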

Coding Standards

In general — at least, in my experience — coding standards do not provide rules on how to parenthesise expressions. I suspect this is for two reasons.

Firstly, because although all programmers use parentheses to clarify the meaning of expressions, they may well disagree on what makes an expression clear. Clarity seems a matter of taste. While programmers in a team may agree (to differ) on whether tabs or spaces are to be used for indentation, their coding standard leaves them free to rewrite Example 4 as:

str += errors == 0 ? "succeeded" : "failed"

And secondly, if a coding standard were to rule on how to parenthesise, it would be difficult to find a middle ground. This leaves as candidate rules the two extremes:

  1. parenthesise everything
  2. never parenthesise

The first quickly leads to unreadable code. The second seems overly proscriptive.

In the absence of a hard rule, here are some guidelines which I hope are non-contentious and which may help us reach a conclusion:

  • have the operator precedence tables to hand and understand how to interpret expressions using them
  • understand the logic behind the operator precedence tables, but be aware of the traps and pitfalls
  • remember, parentheses are not the only way to make order of evaluation clear. For example, our fifth example could be rewritten:
*utf++ = 0x80 | 
              ucs >> 6 & 
              0x6f

    or even:
*utf = ucs >> 6; 
*utf &= 0x6f; 
*utf |= 0x80; 
++utf;

  • if an expression is hard to understand, break it down into simpler steps, or extract it out as a function with a meaningful name
  • trust the compiler: it might not implement partial template specialisation correctly, but it will get operator precedence right every time
  • never use parentheses simply because you aren’t sure of how an expression will be evaluated without them: treat doubt as an opportunity to learn
  • all macro arguments must be parenthesised.
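The last guideline deserves a short illustration, since macros are expanded textually before the precedence rules are applied. The macro and function names below are hypothetical, and the sketch assumes nothing beyond standard C.

```c
#include <assert.h>

/* Hypothetical macros: one careless, one with argument and body parenthesised. */
#define DOUBLE_UNSAFE(x) x * 2
#define DOUBLE_SAFE(x)   ((x) * 2)

int double_unsafe_demo(void)
{
    return DOUBLE_UNSAFE(1 + 2);    /* expands to 1 + 2 * 2 */
}

int double_safe_demo(void)
{
    return DOUBLE_SAFE(1 + 2);      /* expands to ((1 + 2) * 2) */
}

void check_macro_grouping(void)
{
    assert(double_unsafe_demo() == 5);
    assert(double_safe_demo() == 6);
}
```

Here the parentheses are not noise: the macro author cannot know what expression will be passed in, so they are the only way to guarantee the grouping.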

Concluding Thoughts

Any effort put into becoming familiar with precedence tables is likely to pay off across a range of languages. For example, although C++ introduces several new operators over C, there are no surprises. The precedence rules remain in force even if the operators have been overloaded (but that’s the subject of another article). Java operator precedence is almost a subset of C’s. Similarly, scripting languages are generally compatible with C, even where C’s precedence rules are slightly peculiar [2]. So, while Perl introduces lower precedence versions of the logical operators not, and, and or, it ensures that not binds more tightly than and, which in turn binds more tightly than or. Interestingly, in Python, where whitespace is syntactically significant, parentheses can be used not just to indicate order of evaluation, but also to wrap lengthy expressions over several lines.

The more experienced I become as a programmer, the fewer parentheses I use. Coming from a mathematical background, it was several months into my first job before I dared use the conditional operator, and when I finally did start using it, I parenthesised all the sub-expressions for safety. Later on in my career, when I first found myself working with the bitwise operators, again, I enclosed sub-expressions with brackets. As my confidence has increased, the brackets have peeled away.

This, though, is simply evolution. Familiarity with the languages you use makes it easier to read expressions without the unnecessary noise of parentheses. Evolving in this way, however, leaves a programmer vulnerable when working on code written by a more experienced team-mate, unless the experienced programmer writes to a lowest common denominator.

Surely it would be better for everyone to raise their game. The operator precedence tables are a fundamental part of the language. The rules for using them are simple. Although there are many precedence levels, the operators do group logically. Update your coding standards. Prohibit unnecessary parentheses. Brackets off!


Notes

[1] I’ve taken this example directly from Andrew Koenig’s “C Traps and Pitfalls”. This is a nice little book which expands on the ideas presented in a paper of the same name [PDF].

[2] According to Koenig, some of C’s peculiarities can be blamed on its heritage:

The precedence of the C logical operators comes about for historical reasons. B, the predecessor of C, had logical operators that corresponded roughly to C’s & and | operators. Although they were defined to act on bits, the compiler would treat them as && and || if they were in a conditional context. When the two usages were split apart in C, it was deemed too dangerous to change the precedence much.

This article first appeared in C Vu 15.6, and I am grateful to all at C Vu for their help.