1. C is not your friend: sizeof and side-effects

    The sizeof operator in C tells us how much space the result of an expression would use if we were to evaluate it, but doesn’t actually evaluate the expression. For example, sizeof i++ yields the size of the variable i, but doesn’t increment i. Many implementations of C provide a typeof operator too, which provides the type of an expression without evaluating it.
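
    As a quick check of that claim, here is a minimal sketch. The function names are made up for illustration, and `__typeof__` is assumed to be available (it is the GCC/Clang spelling of typeof that works regardless of the selected language standard).

```c
#include <assert.h>
#include <stddef.h>

/* Returns the value of i after applying sizeof to i++. */
int value_after_sizeof(void) {
    int i = 5;
    size_t n = sizeof i++;        /* operand is not evaluated */
    assert(n == sizeof (int));    /* we only learn the size of an int */
    return i;                     /* still 5 */
}

/* Returns the value of i after using i++ as a typeof operand. */
int value_after_typeof(void) {
    int i = 5;
    __typeof__ (i++) copy = 0;    /* declares 'int copy'; i++ is not evaluated */
    (void) copy;                  /* silence unused-variable warnings */
    return i;                     /* still 5 */
}
```

    Both functions return 5. Some compilers warn about a side-effect in an unevaluated operand, which is a fair hint that the increment never happens.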

    That lack of side-effects is useful, for example in macro hygiene. Preprocessor macros have surprising behaviour when invoked with arguments that have side-effects:

      #define BAD_MACRO(arg) do {                          \
          /* Aiee, what if someone uses BAD_MACRO(i++)? */ \
          printf("%i squared is %i\n", arg, arg * arg);    \
      } while (0)
    
      #define GOOD_MACRO(arg) do {                                      \
         typeof (arg) local; /* 'arg' not evaluated, no side-effects */ \
         local = arg;        /* 'arg' evaluated exactly once, here */   \
         printf("%i squared is %i\n", local, local * local);            \
      } while (0)
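
    To see the difference without relying on i++ (which would be modified more than once in a single call, and so be undefined behaviour in the bad macro), here is a sketch that counts evaluations instead. The helper next_value() and the counter are hypothetical, and `__typeof__` is again assumed to be the GCC/Clang extension.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: each call bumps a counter (the side-effect). */
static int calls = 0;

static int next_value(void) {
    calls++;
    return 3;
}

#define BAD_MACRO(arg) do {                                  \
    printf("%i squared is %i\n", arg, arg * arg);            \
} while (0)

#define GOOD_MACRO(arg) do {                                 \
    __typeof__ (arg) local = (arg); /* evaluated once */     \
    printf("%i squared is %i\n", local, local * local);      \
} while (0)

/* How many times does each macro evaluate its argument? */
int evaluations_in_bad_macro(void) {
    calls = 0;
    BAD_MACRO(next_value());   /* 'arg' appears three times in the body */
    return calls;
}

int evaluations_in_good_macro(void) {
    calls = 0;
    GOOD_MACRO(next_value());  /* typeof operand is not evaluated */
    assert(calls == 1);
    return calls;
}
```

    The bad macro calls next_value() three times and the good one calls it once.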
    

    Now that’s a pretty well known pitfall, and it’s common practice to use typeof and sizeof to avoid side-effects this way. But C has one more surprise lying in wait for the unwary: sometimes sizeof and typeof do have side-effects! C supports declaring arrays whose length is determined at run time, e.g.:

      void do_something(int i) {
          char working_array[i];
          /* ... */
      }
    

    The length of the working array depends on the value of i. And of course we can ask the compiler to tell us the size of a dynamic array, too: sizeof (char[i]) is equivalent to i * sizeof (char), and will be evaluated at run time.
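
    A small sketch of that run-time behaviour, assuming a compiler that supports variable-length arrays (they are optional in C11 but available in GCC and Clang). The function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Size of a run-time-sized array object; i must be positive. */
size_t working_array_size(int i) {
    assert(i > 0);
    char working_array[i];
    return sizeof working_array;   /* computed at run time */
}

/* The same question asked of the type rather than an object. */
size_t working_type_size(int i) {
    assert(i > 0);
    return sizeof (char[i]);       /* i * sizeof (char), i.e. i */
}
```

    Calling either function with 7 returns 7, a value the compiler could not have known when it compiled the function.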

    We can use any integer expression to specify the length of the array, even one with side-effects. If we do that, the compiler has no (reliable) way to know the size without evaluating the whole expression, including its side-effects. So that’s what it does:

      int i = 0;
      printf("%zu\n", sizeof (char[++i]));
    

    increments i and prints “1”. More generally, sizeof will evaluate any array-size expression in its argument if the final size depends on the value of the expression. Worse than that, according to the spec, sizeof may evaluate such expressions even if it doesn’t have to:

      Where a size expression is part of the operand of a sizeof
      operator and changing the value of the size expression would not
      affect the result of the operator, it is unspecified whether or
      not the size expression is evaluated.
    

    Here, “unspecified” means the compiler may choose to do it or not and doesn’t even have to document which. That is to say:

      /* Not a size expression: won't increment i. */
      sizeof (++i);
    
      /* Size expression that affects the final size: will increment i. */
      sizeof (char[++i]);
    
      /* Size expression that doesn't affect the final size (a pointer to
       * an array is the same size regardless of the size of the array).
       * May or may not increment i! */
      sizeof (char(*)[++i]);
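
    The three cases above can be wrapped in functions that report the value of i afterwards. This is only a sketch with made-up function names, assuming a compiler with variable-length array support. The third result is deliberately left open, since the spec lets the compiler go either way.

```c
#include <assert.h>
#include <stddef.h>

/* Not a size expression: i is left alone. */
int after_plain_expression(void) {
    int i = 0;
    size_t n = sizeof (++i);
    assert(n == sizeof (int));
    return i;                      /* 0 */
}

/* Size expression that determines the result: i is incremented. */
int after_array_type(void) {
    int i = 0;
    size_t n = sizeof (char[++i]);
    assert(n == 1);
    return i;                      /* 1 */
}

/* Size expression that cannot change the result: unspecified. */
int after_pointer_to_array_type(void) {
    int i = 0;
    size_t n = sizeof (char(*)[++i]);
    assert(n == sizeof (char(*)[1]));
    return i;                      /* 0 or 1, compiler's choice */
}
```

    Printing the third function's return value on a few different compilers or optimisation levels is a cheap way to find out what yours does, bearing in mind that it is not obliged to stay consistent.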
    

    typeof only became part of the standard in C23, but it behaves similarly on the compilers I’ve looked at. C is reaching PHP-like levels of confusion here, but luckily for us this combination (passing a type or cast to sizeof/typeof, using arrays with run-time sizes, and using expressions with side-effects) is pretty hard to hit. I’ve never seen it in “real” code, though I can imagine a set of clever helper macros combining to cause it somehow.

     