While answering another question, I thought of the following example:

void *p;
unsigned x = 17;

assert(sizeof(void*) >= sizeof(unsigned));
*(unsigned*)&p = 17;        // (1)
memcpy(&p, &x, sizeof(x));  // (2)

Line (1) breaks the aliasing rules. Line (2), however, is fine with respect to them. The question is: why? Does the compiler have special built-in knowledge about functions such as memcpy, or are there other rules that make memcpy OK? Is there a way to implement memcpy-like functions in standard C without breaking the aliasing rules?

+6  A: 

The memcpy function takes void* arguments, meaning that no assumptions are made about what is being pointed to; no aliasing has occurred here. In contrast, *(unsigned*)&p interprets a pointer to void* as a pointer to unsigned, which is aliasing.

Marcelo Cantos
How is direct aliasing, like above, different from `void *z = &p; unsigned *px = z; *px = 17`, and how is this different from memcpy?
zvrba
Casting through `void *` lets you manually sidestep the aliasing rules. You can even do it in a single expression: `*(unsigned *)(void *)`.
Marcelo Cantos
+4  A: 

The C Standard is quite clear on it. The effective type of the object named by p is void*, because it has a declared type, see 6.5/6. The aliasing rules in C99 apply to reads and writes, and the write to void* through an unsigned lvalue in (1) is undefined behavior according to 6.5/7.

In contrast, the memcpy of (2) is fine, because an lvalue of type unsigned char can access any object (6.5/7). The Standard specifies memcpy in 7.21.1/3 and 7.21.2.1/2 as

For all functions in this subclause, each character shall be interpreted as if it had the type unsigned char (and therefore every possible object representation is valid and has a different value).

The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.

However, if p is used afterwards, that use might cause undefined behavior, depending on the bit pattern that was stored. If no such use happens, the code is fine in C.
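To illustrate the byte-wise copying that the quoted text describes, here is a minimal sketch of a memcpy-like function in standard C. The names `byte_copy` and `roundtrip_through_pointer` are hypothetical and not from the Standard or from this thread, and the sketch assumes `sizeof(void*) >= sizeof(unsigned)`, as the question does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies n bytes from src to dst through
   unsigned char lvalues, which 6.5/7 allows to access any object.
   As with memcpy, the two regions must not overlap. */
static void *byte_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n-- > 0)
        *d++ = *s++;
    return dst;
}

/* Stores the bytes of x in the storage of a void* object and reads
   them back byte by byte. p is never read as a void* afterwards. */
static unsigned roundtrip_through_pointer(unsigned x)
{
    void *p = NULL;
    unsigned y;

    assert(sizeof p >= sizeof x);  /* assumption taken from the question */
    byte_copy(&p, &x, sizeof x);   /* like (2): character-typed writes */
    byte_copy(&y, &p, sizeof y);   /* read the same bytes back */
    return y;
}
```

Calling `roundtrip_through_pointer(17)` should give back 17, since only the bytes that were written are read again and p itself is never interpreted as a pointer value.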


As for the C++ Standard, which in my opinion is far from clear on the issue, I think the following holds. Please don't take this interpretation as the only possible one; the vague and incomplete specification leaves a lot of room for speculation.

Line (1) is problematic because the alignment of &p might not be ok for the unsigned type. It changes the type of the object stored in p to be unsigned int. As long as you don't access that object later on through p, aliasing rules are not broken, but alignment requirements might still be.

Line (2), however, has no alignment problems and is thus valid, as long as you don't access p afterwards as a void*, which might cause undefined behavior depending on how the void* type interprets the stored bit pattern. I don't think the memcpy changes the type of the object.

There is a long GCC bug report that also discusses the implications of a write through a pointer obtained from such a cast, and how that differs from placement new (people on that list don't agree on what the difference is).

Johannes Schaub - litb
Please see the question in the comment to Marcelo's answer. Any comments on that?
zvrba
@zvrba, oh, there is nothing different. You just cast with `void*` in between, which is equivalent to casting directly. If you want to "emulate" memcpy, you have to do it like `unsigned char *pc = (unsigned char*)&p; *pc = *x; *pc++ = *x++; ...`, which is what it does.
Johannes Schaub - litb