The sizeof operator in C tells us how much space the
result of an expression would use if we were to evaluate it, but
doesn’t actually evaluate the expression. For example,
sizeof i++ yields the size of the variable i,
but doesn’t increment i. Many implementations of C
provide a typeof operator too, which provides the
type of an expression without evaluating it.
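To make that concrete, here is a minimal sketch. The function name value_after_sizeof is made up for illustration, and some compilers will warn about the unevaluated side effect, which is rather the point.

```c
#include <stddef.h>

/* Hypothetical helper: applies sizeof to i++ and reports what i is afterwards. */
int value_after_sizeof(void) {
    int i = 5;
    size_t n = sizeof i++;  /* operand has type int, not an array: never evaluated */
    (void)n;                /* n equals sizeof (int); unused otherwise */
    return i;               /* i was not incremented */
}
```

Calling value_after_sizeof() should give back 5, since the increment only exists as far as the type checker is concerned.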
That lack of side-effects is useful, for example in macro hygiene. Preprocessor macros have surprising behaviour when invoked with arguments that have side-effects:
#define BAD_MACRO(arg) do { \
/* Aiee, what if someone uses BAD_MACRO(i++)? */ \
printf("%i squared is %i\n", arg, arg * arg); \
} while (0)
#define GOOD_MACRO(arg) do { \
typeof (arg) local; /* 'arg' not evaluated, no side-effects */ \
local = arg; /* 'arg' evaluated exactly once, here */ \
printf("%i squared is %i\n", local, local * local); \
} while (0)
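A sketch of the same trick in a form that is easy to check: the macro SQUARE_ONCE and the function square_and_advance are hypothetical names, and I'm assuming the __typeof__ spelling, which GCC and Clang accept even in strict standard modes.

```c
/* Hypothetical macro: stores arg squared in 'out', evaluating arg exactly once. */
#define SQUARE_ONCE(out, arg) do { \
    __typeof__(arg) local_ = (arg); /* typeof: no evaluation; initializer: one evaluation */ \
    (out) = local_ * local_; \
} while (0)

/* Squares the current counter value and advances the counter by one. */
int square_and_advance(int *counter) {
    int result;
    SQUARE_ONCE(result, (*counter)++);
    return result;
}
```

Starting from a counter of 3, square_and_advance should return 9 and leave the counter at 4, whereas a naive arg * arg expansion would touch the counter twice.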
Now that’s a pretty well-known pitfall, and it’s common practice to use typeof and sizeof to avoid side-effects this way. But C has one more surprise lying in wait for the unwary: sometimes
sizeof and typeof do have side-effects!
C99 and later support declaring arrays whose length is determined at run time
(variable-length arrays, an optional feature since C11), e.g.:
void do_something(int i) {
char working_array[i];
/* ... */
}
The length of the working array depends on the value of i. And of course
we can ask the compiler to tell us the size of a dynamic array, too:
sizeof (char[i]) is equivalent to i * sizeof (char),
and will be evaluated at run time.
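As a rough illustration, assuming a compiler with variable-length array support and a positive length, the hypothetical helpers below return sizes that can only be known at run time:

```c
#include <stddef.h>

/* Hypothetical helper: size in bytes of a run-time-sized char array (n must be > 0). */
size_t vla_size(int n) {
    char working_array[n];
    return sizeof working_array;  /* computed at run time from n */
}

/* Same idea applied to a type name rather than an object. */
size_t int_vla_type_size(int n) {
    return sizeof (int[n]);       /* n * sizeof (int), also computed at run time */
}
```

So vla_size(7) should be 7, and int_vla_type_size(3) should be three times sizeof (int), whatever that is on your platform.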
We can use any integer expression to specify the length of the array, even one with side-effects. If we do that, the compiler has no (reliable) way to know the size without evaluating the whole expression, including its side-effects. So that’s what it does:
int i = 0;
printf("%zu\n", sizeof (char[++i]));
increments i and prints “1”. More generally, sizeof
will evaluate any array-size expression in its argument if the final size
depends on the value of the expression. Worse than that, according to the
spec, sizeof may evaluate such expressions even if it doesn’t have to:
Where a size expression is part of the operand of a sizeof operator and changing the value of the size expression would not affect the result of the operator, it is unspecified whether or not the size expression is evaluated.
Here, “unspecified” means the compiler may choose to do it or not and doesn’t even have to document which. That is to say:
/* Not a size expression: won't increment i. */
sizeof (++i);
/* Size expression that affects the final size: will increment i. */
sizeof (char[++i]);
/* Size expression that doesn't affect the final size (a pointer to
 * an array is the same size regardless of the size of the array).
 * May or may not increment i! */
sizeof (char(*)[++i]);
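If you want to see what your own compiler does with those three cases, something like the following sketch should do. The struct sizeof_effects and the function probe_sizeof are names invented here, and the third field is deliberately left without an expected value because the standard doesn't promise one.

```c
#include <stddef.h>

/* Hypothetical record of i's value after each of the three sizeof forms. */
struct sizeof_effects {
    int after_expr;     /* after sizeof (++i) */
    int after_vla;      /* after sizeof (char[++i]) */
    int after_ptr_vla;  /* after sizeof (char (*)[++i]): unspecified, 0 or 1 */
    size_t vla_size;    /* result of sizeof (char[++i]) */
};

struct sizeof_effects probe_sizeof(void) {
    struct sizeof_effects r;
    int i;

    i = 0;
    (void)sizeof (++i);               /* plain expression: not evaluated */
    r.after_expr = i;

    i = 0;
    r.vla_size = sizeof (char[++i]);  /* variable-length array type: evaluated */
    r.after_vla = i;

    i = 0;
    (void)sizeof (char (*)[++i]);     /* pointer to VLA: compiler's choice */
    r.after_ptr_vla = i;

    return r;
}
```

The first two fields should come out as 0 and 1 respectively, with vla_size equal to 1. The third is the one worth printing on each compiler you care about.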
typeof wasn’t in the standard before C23 but behaves similarly on
compilers I’ve looked at. C is reaching PHP-like levels of confusion
here, but luckily for us this combination (passing a type or cast to
sizeof/typeof, using arrays with run-time sizes,
and using expressions with side-effects) is pretty hard to hit. I’ve
never seen it in “real” code, though I can imagine a set of clever helper
macros might combine to cause it somehow.
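For the typeof side of this, here is a small sketch, again using the __typeof__ spelling as an assumption about your compiler. GCC documents that the operand of typeof is evaluated for its side effects when it has a variably modified type, so on GCC (and, as far as I can tell, Clang) the increment below does happen. The function name value_after_typeof_vla is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: declares a buffer via typeof on a run-time-sized
 * array type, then reports the buffer's size and the value of i. */
int value_after_typeof_vla(size_t *size_out) {
    int i = 0;
    __typeof__(char[++i]) buf;  /* variably modified type: ++i is evaluated */
    *size_out = sizeof buf;     /* size fixed when buf was declared */
    return i;
}
```

On those compilers I'd expect the function to return 1 and report a size of 1, which is exactly the kind of hidden increment a helper macro built on typeof could trip over.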