Tricky Preprocessors
You learn something every day. A quick quiz from a mailing list I’m on: what does this C program print?
#include <stdio.h>
#define MINUS(x) -x
int main()
{
    int n = 5;
    printf("%d\n", MINUS(-n));
}
I like this one: there are three possible ways of coming up with an answer,
depending upon how deeply you look into it. The obvious approach is to say
“well, MINUS() returns the negative value of whatever it’s passed, and we’re
passing -5, so the answer is 5”. And you’d be right, but for the wrong
reason.
Slightly more thought, and you might come to the conclusion that the macro expands with two leading minus signs, like this:
printf("%d\n", --n);
And --n is obviously the prefix decrement operator applied to n, so the
printed value is the value of n after that decrement, or 4 (and due to a bug
we’ll get to in a second, this happens to be what you get if you compile with
Microsoft’s Visual C).
But that’s not the whole story. If you compile the program using gcc, you’ll
see that the answer printed (and the correct answer) is 5. So what’s going
on? Well, the C/C++ standards say that the output of the preprocessing stage
is a stream of tokens, not characters, so the macro expands to
‘operator-negation operator-negation n’. Each minus sign was already a
separate token before the expansion, and adjacent tokens are never re-lexed
into a single -- token. Visual C doesn’t happen to follow this part of the
standard, so it takes the character string --n from the output of the
preprocessor and interprets it as the prefix decrement operator.
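To make the two readings concrete, here is a small sketch that spells each one
out as an ordinary function. The function names (negate_twice,
decrement_first, through_macro) are made up for this illustration, and the
comment on through_macro assumes a compiler that follows the token-stream rule
described above.

```c
/* The macro from the quiz, left unparenthesized on purpose. */
#define MINUS(x) -x

/* The token-stream reading: two separate unary minus operators. */
int negate_twice(int n)
{
    return - -n;
}

/* The character-stream reading: one prefix decrement operator. */
int decrement_first(int n)
{
    return --n;
}

/* The macro itself. A compiler that treats the expansion as tokens
   should behave like negate_twice(), not decrement_first(). */
int through_macro(int n)
{
    return MINUS(-n);
}
```

Calling these with 5 from a main() of your own shows which reading your
compiler takes: negate_twice(5) is 5 and decrement_first(5) is 4, so whichever
of those through_macro(5) matches tells you how the expansion was interpreted.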
If you’ve used gcc’s preprocessor directly, you’ll know that it also outputs a
character stream. So how does gcc work? If you take a look at the
preprocessed source (gcc -E -x c), you’ll see that the output for the
relevant line is actually:
printf("%d\n", - -n);
What’s going on here is that gcc’s preprocessor inserts just enough extra
whitespace to ensure that the two negation operators can’t be mistaken for the
prefix decrement operator. In other words, while the preprocessor outputs a
stream of characters, it manipulates the output so that (from the point of view
of a consumer) it behaves ‘as if’ it had output a stream of tokens. (And if
you replace the definition of MINUS with the more typical -(x), making the
expansion -(-n), you’ll see that gcc no longer adds in that extra whitespace;
it’s only added when leaving it out would change the meaning.)
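As a side note on why the parenthesized form is the more typical one, here is
a minimal sketch comparing the two definitions. The names MINUS_BARE and
MINUS_PAREN and the three helper functions are hypothetical, chosen only for
this example. The outer pair of parentheses in MINUS_PAREN is an extra
precaution that goes slightly beyond the -(x) form mentioned above.

```c
/* The quiz's definition: the argument is pasted in with no protection. */
#define MINUS_BARE(x) -x

/* The conventional form: parenthesize the argument and the whole result. */
#define MINUS_PAREN(x) (-(x))

/* Expands to -a - b, which negates only a. */
int bare_of_difference(int a, int b)
{
    return MINUS_BARE(a - b);
}

/* Expands to (-(a - b)), which negates the whole difference. */
int paren_of_difference(int a, int b)
{
    return MINUS_PAREN(a - b);
}

/* Expands to (-(-n)); the parenthesis keeps the two minus signs apart,
   so no reading of the text can turn them into a decrement. */
int paren_of_negative(int n)
{
    return MINUS_PAREN(-n);
}
```

With a = 3 and b = 1, the bare version gives -4 while the parenthesized one
gives -2, so the parentheses do more than sidestep the -- question: they also
stop the argument’s own operators from interacting with the macro body.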