Does it violate strict aliasing rules to move items of any type around as uint32_t values, then read them back through a pointer to their original type? If so, does it also violate strict aliasing rules to memcpy from an array of uint32_t to an array of any type, then read the elements back?

The following code sample demonstrates both cases:

#include <assert.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void) {
    const char *strings[5] = {
        "zero", "one", "two", "three", "four"
    };
    uint32_t buffer[5];
    int i;

    assert(sizeof(const char*) == sizeof(uint32_t));

    memcpy(buffer, strings, sizeof(buffer));

    //twiddle with the buffer a bit
    buffer[0] = buffer[3];
    buffer[2] = buffer[4];
    buffer[3] = buffer[1];

    //Does this violate strict aliasing?
    const char **buffer_cc = (const char**)buffer;
    printf("Test 1:\n");
    for (i=0; i<5; i++)
        printf("\t%s ", buffer_cc[i]);
    printf("\n");

    //How about this?
    memcpy(strings, buffer, sizeof(strings));
    printf("Test 2:\n");
    for (i=0; i<5; i++)
        printf("\t%s ", strings[i]);
    printf("\n");

    return 0;
}

Please disregard my assumption of a 32-bit platform. Also, if the elements aren't the same size as uint32_t, I know to pad them and copy the correct number of uint32_t's. My question focuses on whether or not doing so violates strict aliasing.

+1  A: 

buffer_cc[0] and strings[3] (for example) are pointers that reference the same memory location but are of the same type, thus don't violate strict aliasing. buffer[0] is not a pointer, thus doesn't violate strict aliasing. Aliasing optimizations arise when dereferencing pointers, so I wouldn't expect this to cause problems.

As you allude to in the code and the last paragraph in your question, the real problem in the sample code arises when pointers and uint32_t are of different sizes.

Also, a pointer to a character type (char *, unsigned char *) may always be used to access the bytes of an object of another type without violating strict aliasing, though not vice versa.
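To illustrate the character-type exception mentioned above, here is a minimal sketch. The helper name byte_sum is hypothetical and not part of the question's code; it only shows that reading any object byte by byte through unsigned char * is permitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the question): sums the bytes of any
   object. Accessing an object through a pointer to a character type
   is always allowed by the aliasing rules. */
static unsigned byte_sum(const void *obj, size_t n) {
    const unsigned char *p = obj;   /* character-type view of the object */
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}
```

For example, with 8-bit bytes, calling byte_sum(&value, sizeof value) on a uint32_t value of 0x01020304 gives 10 on either byte order. The reverse direction, reading a char array through a uint32_t *, gets no such exemption.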

outis
The lvalue expression `buffer_cc[0]` of type `const char *` aliases the object `buffer[0]` of type `uint32_t`. So I think caf is right.
Jason Orendorff
+2  A: 

The first loop does technically violate strict aliasing - it accesses uint32_t objects through an lvalue of type const char *. It's hard to see how any optimiser would cause you a problem in this specific case, though. If you altered it a little so you were doing something like:

printf("\t%s ", buffer_cc[0]);
buffer[0] = buffer[3];
printf("\t%s ", buffer_cc[0]);

You might see the same string printed twice - since the optimiser would be within its rights to only load buffer_cc[0] into a register once, because the second line is only modifying an object of type uint32_t.

The second loop, that memcpys them back, is fine.
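As a rough sketch of the memcpy-only pattern (the names WORDS_PER_PTR, store_string and load_string are hypothetical and not from the question), each pointer can be copied into and out of the uint32_t storage byte-wise, so no uint32_t object is ever read through a const char * lvalue. It pads each element to a whole number of uint32_t words, as the question suggests, rather than assuming a 32-bit platform.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical: number of uint32_t words needed to hold one pointer,
   rounded up, so the sketch does not assume 32-bit pointers. */
#define WORDS_PER_PTR \
    ((sizeof(const char *) + sizeof(uint32_t) - 1) / sizeof(uint32_t))

/* Copy the bytes of the pointer s into the slot starting at `slot`. */
static void store_string(uint32_t *slot, const char *s) {
    memcpy(slot, &s, sizeof s);
}

/* Copy the bytes back out into a real const char * object.
   The uint32_t storage is only read by memcpy, never through a
   const char * lvalue, so strict aliasing is not involved. */
static const char *load_string(const uint32_t *slot) {
    const char *s;
    memcpy(&s, slot, sizeof s);
    return s;
}
```

Between store_string and load_string the slots can be shuffled with plain uint32_t assignments, as in the question, since those accesses use the buffer's declared type.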

caf
"It's hard to see" - for instance if the "//How about this?" paragraph were removed, then I think the compiler is allowed to decide (through data flow analysis) that all that `buffer[0] = buffer[3];` stuff is dead code, since buffer is never used again in the program, and eliminate it. Certainly it can reorder it after the printfs, if it thinks that will be more efficient.
Steve Jessop
Actually, I think I'm wrong. Each `buffer_cc[i]` is a `char*`, so it can legally point to part of buffer, because a `char*` can alias anything. So as far as strict aliasing is concerned, the modifications to buffer can't be eliminated, and have to happen before `buffer_cc[i]` is passed to printf. They don't have to happen before `buffer_cc[0]` is loaded from memory, but unless that loop is unrolled, it would be very odd indeed for the compiler to leave them until that point. If the loop didn't printf, but used the pointers without dereferencing them, it'd be different.
Steve Jessop