Do you have any horror stories to tell? The GCC Manual recently added a warning regarding -fstrict-aliasing and casting a pointer through a union:
[...] Taking the address, casting the resulting pointer and dereferencing the result has undefined behavior [emphasis added], even if the cast uses a union type, e.g.:
union a_union {
int i;
double d;
};
int f() {
double d = 3.0;
return ((union a_union *)&d)->i;
}
Does anyone have an example to illustrate this undefined behavior?
Note this question is not about what the C99 standard says, or does not say. It is about the actual functioning of gcc, and other existing compilers, today.
I am only guessing, but one potential problem may lie in the setting of d
to 3.0. Because d
is a temporary variable which is never directly read, and which is never read via a 'somewhat-compatible' pointer, the compiler may not bother to set it. And then f() will return some garbage from the stack.
My simple, naive, attempt fails. For example:
#include <stdio.h>
union a_union {
int i;
double d;
};
int f1(void) {
union a_union t;
t.d = 3333333.0;
return t.i; // gcc manual: 'type-punning is allowed, provided...' (C90 6.3.2.3)
}
int f2(void) {
double d = 3333333.0;
return ((union a_union *)&d)->i; // gcc manual: 'undefined behavior'
}
int main(void) {
printf("%d\n", f1());
printf("%d\n", f2());
return 0;
}
works fine, giving on CYGWIN:
-2147483648
-2147483648
Looking at the assembler, we see that gcc completely optimizes t
away: f1()
simply stores the pre-calculated answer:
movl $-2147483648, %eax
while f2()
pushes 3333333.0 onto the floating-point stack, and then extracts the return value:
flds LC0 # LC0: 1246458708 (= 3333333.0) (--> 80 bits)
fstpl -8(%ebp) # save in d (64 bits)
movl -8(%ebp), %eax # return value (32 bits)
And the functions are also inlined (which seems to be the cause of some subtle strict-aliasing bugs) but that is not relevant here. (And this assembler is not that relevant, but it adds corroborative detail.)
Also note that taking addresses is obviously wrong (or right, if you are trying to illustrate undefined behavior). For example, just as we know this is wrong:
extern void foo(int *, double *);
union a_union t;
t.d = 3.0;
foo(&t.i, &t.d); // undefined behavior
we likewise know this is wrong:
extern void foo(int *, double *);
double d = 3.0;
foo(&((union a_union *)&d)->i, &d); // undefined behavior
For background discussion about this, see for example:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1422.pdf
http://gcc.gnu.org/ml/gcc/2010-01/msg00013.html
http://davmac.wordpress.com/2010/02/26/c99-revisited/
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html
http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule
http://stackoverflow.com/questions/2771023/c99-strict-aliasing-rules-in-c-gcc/2771041#2771041
In the first link, draft minutes of an ISO meeting seven months ago, one participant notes in section 4.16:
Is there anybody that thinks the rules are clear enough? No one is really able to interpret them.
Other notes: My test was with gcc 4.3.4, with -O2; options -O2 and -O3 imply -fstrict-aliasing. The example from the GCC Manual assumes sizeof(double) >= sizeof(int); it doesn't matter if they are unequal.
Also, as noted by Mike Acton in the cellperformace link, -Wstrict-aliasing=2
, but not =3
, produces warning: dereferencing type-punned pointer might break strict-aliasing rules
for the example here.