Fixing Compiler Warnings the Hard Way

2008-05-20 • C, Build, ACCU • Comments

GCC makes a suggestion

The build server CC’d me on an ASBO email. Good old GCC, grumbling about operator precedence again. But Hey! — at least it had a positive suggestion to make.

From: buildmaster@example.com
To: lem.e.tweakit@example.com
Cc: developers@example.com
Subject: Broken build
------------------------------------
Anti Social Build Order
Version: svn://svnserver/trunk@999
Platform: Linux, GCC 4.0.1
Build Log: 
....
Warning: suggest parentheses around arithmetic
in operand of ^

I looked at the code. Here’s a much simplified version which generates the same warning.

void
unpack(unsigned char const * bits, int n_bits,
       unsigned char * buf)
{
    unsigned char bit, byte, pos;
    int b;

    for (b = 0; b != n_bits; ++b)
    {
        byte = bits[b / 8];
        pos = 7 - (b % 8);
        bit = byte & 2^pos;
        buf[b] = bit == 0 ? 0 : 255;
    }
}

When compiling this function GCC warns about the line in bold type:

$ gcc -Wall -c unpack_bits.c 
unpack_bits.c: In function `unpack':
unpack_bits.c:12: warning: suggest parentheses around 
             arithmetic in operand of ^

Setting a precedent

Needless to say, the actual offending code was buried in a longer function, indented more deeply, and with a few more indirections — so it was indeed tempting to take GCC’s advice and whack in a couple of brackets. Clearly the author meant to write:

bit = byte & (2^pos);

Why else omit spaces around the ^?

Fortunately I live by my own rule, to avoid unnecessary parentheses, so I wasn’t about to add any here without asking why. Worse than my stubborn principles, ^, the exclusive or operator, has lower precedence than bitwise and, &, so to keep GCC happy and retain the original behaviour we’d have to write:

bit = (byte & 2) ^ pos;

This last expression looks very bizarre. Had it ever been exercised?

GCC was right, the code was wrong, but its diagnostic showed the wrong way to right things. On this occasion GCC should have been proscriptive, not prescriptive, and left the fix in the hands of the programmer 1.

Don’t mix bits and arithmetic

0000 0000 5589 e58b 5508 89d0 d1e8 2555
5555 5529 c289 d0c1 e802 2533 3333 3381
e233 3333 3301 d089 c2c1 ea04 01d0 250f
0f0f 0f89 c2c1 ea08 01d0 25ff 00ff 0089
c2c1 ea10 01d0 25ff ff00 005d c300 0000
5555 5555 3333 3333 0f0f 0f0f ff00 ff00

My personal rule of thumb is to avoid mixing bitwise and arithmetic operations. Although integral types support both kinds of operation, it generally feels like a type-mismatch to combine them in a single expression. An array of bits isn’t the same as a number, and vice-versa.

Of course there are some treasured bit-twiddling tricks which exploit the mapping between binary arithmetic and machine level register operations. So we can, for example, calculate 2 raised to the power of 19 with a simple left-shift, 1 << 19, or test if v is a power of 2 with !(v & (v - 1)) && v. I’m not suggesting we blacklist these ingenious hacks — in fact, anyone off to an interview for a programming job with an embedded systems company might do well to study them — but I would say their use requires thorough documentation.

On occasion, then, bitwise operations may legitimately be used for fast arithmetic; but using arithmetic to pack bits is rarely necessary. This line of code is probably wrong 2:

r = h << 4 + 1;

The programmer probably intended the (bitwise) shift to happen before the (arithmetic) addition, like this.

r = (h << 4) + 1;

If we stick to bitwise operations, things become clear. I’ve written the 1 in hexadecimal as a hint it’s being used as a bit pattern — sadly there’s no way of writing a binary literal directly in C.

r = h << 4 | 0x1;

Anyway, the problem line in unpack() adheres to my rule of thumb: & and ^ are indeed both bitwise operations. But after some puzzling I realised the author of the code intended 2^pos to mean 2 to the power of pos, not for its arithmetic value, but for its bit pattern — which, as every programmer knows, is a 1 followed by pos 0s. That is, a 1 left shifted pos times.

Here’s what I thought the fix should be. Note, incidentally, that I’ve used ~0 rather than 255, because it clearly says “set every bit”. I’m also using unsigned integers throughout — always a good idea when working with bits. Some programmers might prefer to parenthesise the expression byte & 1 << pos, but I prefer the form shown: it’s easy enough to remember that & groups with the logical operators and << with the (higher precedence) arithmetic ones.

void
unpack(unsigned char const * bits, unsigned n_bits,
       unsigned char * buf)
{
    unsigned char bit, byte;
    unsigned b, pos;

    for (b = 0; b != n_bits; ++b)
    {
        byte = bits[b / 8];
        pos = 7 - b % 8;
        bit = byte & 1 << pos;
        buf[b] = bit == 0 ? 0 : ~0;
    }
}

Despite the absence of documentation, this is now at least a coherent function. It’s a “biterator” which steps through a collection of bits (packed into bytes, the smallest memory units C offers). Each time it encounters a set/clear bit, it sets/clears all the bits in the next byte in the output buffer. That is, it expands each bit value to fill a whole byte.

This is exactly the kind of function which is surprisingly fiddly to write but simple to unit test. As already mentioned, though, the function didn’t actually exist in the form shown, and the tests were all at the module level. The responsible way for me to proceed was to create a module test which exposed the defect, then make my candidate fix, confirm it did indeed fix the defect, then check the change in.

Unit Test

Here’s how simple a unit test for unpack() could be. It may be longer than the function it’s testing, but it’s less complex. And with just a couple of test cases, it manages to cover several interesting corners of the functionality. Better still, it passes^[3]!

void
test_unpack()
{
    // Start with a varied bit-pattern.
    // Ensure each byte differs from its reversed self.
    unsigned char const bits[2] = 
    {
        1 << 7 | 1 << 5 | 1 << 4 | 1 << 0, // 10110001 binary
        1 << 6 | 1 << 5 | 1 << 3 | 1 << 0, // 01101001 binary
    };
    // The expected output expands bits to bytes (0 => 0, 1 => ~0)
    unsigned char expected[2 * 8] = 
    {
        ~0, 0, ~0, ~0, 0, 0, 0, ~0,
        0, ~0, ~0, 0, ~0, 0, 0, ~0
    };
    unsigned char buf[3 * 8] = { 0 };
    unsigned char buf_copy[3 * 8] = { 0 };

    size_t const buf_size = sizeof(buf);

    // Fill the buffer with a pattern of 1s and 0s.
    // Unpack nothing and check nothing changes.
    memset(buf, 0xa5, buf_size);
    memcpy(buf_copy, buf, buf_size);
    unpack(bits, 0, buf);
    assert(memcmp(buf, buf_copy, buf_size) == 0);

    // Unpack some of the bits and check the results.
    // Also check the remainder of the buffer is undamaged.
    unpack(bits, 13, buf);
    assert(memcmp(buf, expected, 13) == 0);
    assert(memcmp(buf + 13, buf_copy + 13, buf_size - 13) == 0);
}

This is white-box testing: the test knows enough about the implementation of unpack() to expose potential problems. In this case, there’s something unusual about the way the pos counter goes down as the bit counter b goes up, so we make sure that the bits we’re unpacking form asymmetric patterns.

Refactoring

Should we extract this tested unpack() function from its surrounding, larger, more complex function? Is it safe to do so? Have we time to spend making changes with no externally visible results? Should we tweak unpack() for efficiency (after all, it doesn’t need to use the division and modulus operators each time round the loop)?

Working Effectively with Legacy Code cover

These are important questions. eXtreme Programmers refactor mercilessly, confident their extensive test frameworks will provide a safety net. Java programmers select the code block in their IDE then click the “extract method” button. C and C++ programmers have less advanced tools, but Michael Feathers’ “Working Effectively with Legacy Code” offers practical advice on how to transform code safely — that is, how to put it under test.

In the real world, we judge each case on merit. A nag email from the build server shouldn’t trigger mass refactoring, even if the test infrastructure is in place. I think Feathers is right though, that poorly tested code is on its way to becoming legacy code: hard to adapt, unpleasant to work with, and a drag on continuing development.

Lessons

This new story repeats the same old lessons, and I think they bear repeating:

Set up a build server. Listen to it.
Compile on multiple platforms.
Think! Compilers are concerned with syntax, not semantics. A C compiler reads your code in order to rewrite it for the machine’s benefit; understanding the code is your job.
Write small functions. Unit test them.
Integers and bit arrays are different.
Take care when using bitwise operations as arithmetic shortcuts.
Avoid using arithmetic for bit packing.

Oh, and in C, don’t mistake ^ for exponentiation!

This article was originally published in CVu, a print journal for ACCU members, and I would like to thank all at CVu for their help with it.

1 I’m not complaining about GCC which did an outstanding job of flagging a genuine problem in perfectly well-defined and valid code. The other compiler frequently used on this project, MSVC V8.0, compiles this cleanly, at the same time warning standard C string functions are unsafe and deprecated!

2 I’ve taken this example directly from Andrew Koenig’s “C Traps and Pitfalls”. This is a nice little book which expands on the ideas presented in a paper of the same name [PDF].

[3] One thing I recommend, though, is to temporarily reverse the logic in the assertions and check they then fail. Unit test frameworks often provide hooks to do this reversed-result test, which confirms the test cases are actually being run.

Word Aligned

professionalism in programming