
I'm writing a cross-platform C library, and I've run into failures in my unit tests, but only on Windows machines. I tracked the problem down to the alignment of structures (I'm using arrays of structures to hold data for multiple similar objects). The problem is that memset(sizeof(struct)) and setting the structure members one by one produce different byte-for-byte results, so memcmp() reports "not equal".

Here is the code for illustration:

#include <stdio.h>
#include <string.h>

typedef struct {
    long long      a;
    int            b;
} S1;

typedef struct {
    long           a;
    int            b;
} S2;

S1 s1, s2;

int main()
{
    /* sizeof yields size_t, so cast before printing with %d */
    printf("%d %d\n", (int)sizeof(S1), (int)sizeof(S2));

    memset(&s1, 0xFF, sizeof(S1));
    memset(&s2, 0x00, sizeof(S1));

    s1.a = 0LL; s1.b = 0;

    if (0 == memcmp(&s1, &s2, sizeof(S1)))
        printf("Equal\n");
    else
        printf("Not equal\n");

    return 0;
}

With MSVC 2003 on Windows, this code produces the following output:

16 8
Not equal

But the same code with GCC 3.3.6 @ Linux works as expected:

12 8
Equal

This makes my unit testing very hard.

Do I understand correctly that MSVC uses the size of the biggest native type (long long) to determine the alignment of the structure?

Can somebody advise me how to change my code to make it more robust against this strange alignment problem? In my real code I work with arrays of structures via generic pointers to execute memset/memcmp, and I usually don't know the exact type; I only have the sizeof(struct) value.

+1  A: 

What we have done is use #pragma pack to specify how the members of the structures should be packed:

#pragma pack(push, 2)

typedef struct {
    long long      a;
    int            b;
} S1;

typedef struct {
    long           a;
    int            b;
} S2;

#pragma pack(pop)

If you do this, the structures will be the same size on both platforms.

Lyndsey Ferguson
The size of each type such as long, int, etc, should be the same, but the structure size may differ as each compiler is aligning the structures optimally and "padding" is added to allow this alignment to work.
Lyndsey Ferguson
+1 just what I was going to post
Eric Petroelje
Why do you suggest alignment on 2 bytes (pack(push,2))?
bialix
Actually, I don't. We use this alignment for historic reasons: the files we wrote to disk with an old Mac compiler (CodeWarrior) used 2-byte alignment. So we have to use this alignment to read in old files...
Lyndsey Ferguson
+1  A: 

You can either do something like

#ifndef _MSC_VER
#pragma pack(push)
#pragma pack(16)
#endif
/* your struct defs */

#ifndef _MSC_VER
#pragma pack(pop)
#endif

to give a compiler directive forcing alignment

Or go into the project options and change the default struct alignment [under Code Generation]

Nicholas Mancuso
I think you actually mean #ifdef _MSC_VER
bialix
+2  A: 

GCC Manual:

Note that the alignment of any given struct or union type is required by the ISO C standard to be at least a perfect multiple of the lowest common multiple of the alignments of all of the members of the struct or union in question.

Also, this typically introduces an element of padding (i.e. filler bytes to have the structure aligned). You can use #pragma pack (or, with GCC, the packed attribute) to suppress it. Note, #pragmas are NOT a portable way of working. Unfortunately, this is also about the only way of working in your case.

References: GCC manual on structure alignment; MSDN on structure alignment.

dirkgently
+1  A: 

Note that this is not a 'strange' alignment problem. MSVC has chosen to ensure that the struct is aligned on a 64-bit boundary since it has a 64-bit member so it adds some padding at the end of the struct to ensure that arrays of those objects will have each element properly aligned. I'm actually surprised that GCC doesn't do the same.

I'm curious what your unit testing does that hits a snag with this. Most of the time, controlling the alignment of structure members isn't necessary unless you need to match a binary file format or a wire protocol, or you really need to reduce the memory used by a structure (especially in embedded systems). Without knowing what you're trying to do in your tests, I don't think a good suggestion can be given. Packing the structure might be a solution, but it comes at some cost: performance (especially on non-Intel platforms) and portability (how struct packing is set up can differ from compiler to compiler). These may not matter to you, but there might be a better way to solve the problem in your case.

Michael Burr
A: 

Structure padding for 64-bit values is different on different compilers. I've seen differences even between gcc targets.

Note that explicitly padding to 64-bit alignment will only hide the problem. It will come back if you begin naively nesting structures, because the compilers will still disagree on the natural alignment of the inner structures.

HUAGHAGUAH
+1  A: 

Your unit test's expectation is wrong. It (or the code it tests) should not scan the structure's buffer byte by byte. For byte-precise data, the code should create a byte buffer explicitly, on the stack or on the heap, and fill it with bytes extracted from each member. The bytes can be extracted in a CPU-endianness-independent way by right-shifting the integer values and casting the result to a byte type such as (unsigned char).

BTW, your snippet writes past s2. You could fix that by changing this

memset(&s2, 0x00, sizeof(S1));

s1.a = 0LL; s1.b = 0;

if (0 == memcmp(&s1, &s2, sizeof(S1)))

to this,

memset(&s2, 0x00, sizeof(S2));

s1.a = 0LL; s1.b = 0;

if (0 == memcmp(&s1, &s2, sizeof(S2)))

but the result is technically "undefined" because the alignment of members in the structures is compiler-specific.

eel ghEEz
this is a simple example that illustrates the problem. my real code is more complex than that. so don't try to tell me there is something wrong. you don't see full picture.
bialix
+1 for the correct answer among a sea of bad answers.
R..
A: 

Viva64 blog: Change of type alignment and the consequences

Andrey Karpov