I have a C# background and am new to low-level languages like C.

In C#, a struct's memory layout is decided by the compiler by default. The compiler may reorder fields or insert padding between them implicitly, so I have to apply a special attribute to override this behavior and get an exact layout.

As far as I know, C does not reorder or align the members of a struct by default, but I have heard there are a few exceptions that are very hard to find.

How does C lay out a struct in memory? What can be reordered or aligned, and what cannot?

+3  A: 

You can start by reading the Wikipedia article on data structure alignment to get a better understanding of data alignment.

From the Wikipedia article:

Data alignment means putting the data at a memory offset equal to some multiple of the word size, which increases the system's performance due to the way the CPU handles memory. To align the data, it may be necessary to insert some meaningless bytes between the end of the last data structure and the start of the next, which is data structure padding.

From 6.54.8 Structure-Packing Pragmas of the GCC documentation:

For compatibility with Microsoft Windows compilers, GCC supports a set of #pragma directives which change the maximum alignment of members of structures (other than zero-width bitfields), unions, and classes subsequently defined. The n value below always is required to be a small power of two and specifies the new alignment in bytes.

  1. #pragma pack(n) simply sets the new alignment.
  2. #pragma pack() sets the alignment to the one that was in effect when compilation started (see also command line option -fpack-struct[=] see Code Gen Options).
  3. #pragma pack(push[,n]) pushes the current alignment setting on an internal stack and then optionally sets the new alignment.
  4. #pragma pack(pop) restores the alignment setting to the one saved at the top of the internal stack (and removes that stack entry). Note that #pragma pack([n]) does not influence this internal stack; thus it is possible to have #pragma pack(push) followed by multiple #pragma pack(n) instances and finalized by a single #pragma pack(pop).

Some targets, e.g. i386 and powerpc, support the ms_struct #pragma which lays out a structure as the documented __attribute__ ((ms_struct)).

  1. #pragma ms_struct on turns on the layout for structures declared.
  2. #pragma ms_struct off turns off the layout for structures declared.
  3. #pragma ms_struct reset goes back to the default layout.
jschmier
Thanks for your help. I modified the question as you suggested.
Eonil
+7  A: 

In C, the compiler is allowed to dictate some alignment for every primitive type. Typically the alignment is the size of the type. But it's entirely implementation-specific.

Padding bytes are introduced so every object is properly aligned. Reordering is not allowed.

Possibly every remotely modern compiler implements #pragma pack which allows control over padding and leaves it to the programmer to comply with the ABI. (It is strictly nonstandard, though.)

From C99 §6.7.2.1:

12 Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.

13 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.

Potatoswatter
Some compilers (e.g. GCC) implement the same effect as `#pragma pack` but with more fine-grained control over the semantics.
Chris Lutz
I'm surprised to see a downvote. Can anyone point out the error?
Potatoswatter
Thanks for your help. I updated the question as you suggested.
Eonil
A: 

In C, structures are laid out almost exactly as you specify in the code, similar to C#'s StructLayout attribute with LayoutKind.Sequential.

The only difference is in member alignment. This never reorders the data members in a structure, but can change the size of the structure by inserting "pad" bytes in the middle of the structure. The reason for this is to make sure that each of the members starts on a boundary (usually 4 or 8 bytes).

For example:

struct mystruct {
   int a;
   short int b;
   char c;
};

The size of this structure will usually be 12 bytes (4 for each member). This is because most compilers will, by default, make each member the same size as the largest in the structure. So the char will take up 4 bytes instead of one. But it is very important to note that sizeof(mystruct::c) will still be 1, while sizeof(mystruct) will be 12.

It can be hard to predict how the structure will be padded/aligned by the compiler. Most will default as I have explained above, some will default to no padding/alignment (also sometimes called "packed").

The method for altering this behavior is very compiler-dependent; nothing in the language specifies how it is to be handled. In MSVC you would use #pragma pack(1) to turn off alignment (the 1 says to align everything on 1-byte boundaries). In GCC you would use __attribute__((packed)) in the structure definition. Consult the documentation for your compiler to see what it does by default and how to change that behavior.

SoapBox
Uh, `sizeof(struct mystruct)` prints 8 on my system. C doesn't align all members to the alignment of the biggest member, it aligns all members to their alignment, and then aligns the struct to the alignment of the biggest member.
Chris Lutz
Uh, as I said, it depends on the compiler.
SoapBox
Soapbox: Not if no compiler does it that way.
Potatoswatter
So you've investigated every compiler and know that none of them do it that way?
SoapBox
No, but if you claim that one exists, you should have an example. It sounds like a totally unreasonable way to do things, and the word "usually" would imply that such compilers are common. I do know that none of the popular ones work like that.
Potatoswatter
Thanks for your help. I updated my question as you suggested.
Eonil
+1  A: 

It's implementation-specific, but in practice the rule (in the absence of #pragma pack or the like) is:

  • Struct members are stored in the order they are declared. (This is required by the C99 standard, as mentioned here earlier.)
  • If necessary, padding is added before each struct member, to ensure correct alignment.
  • Each primitive type T requires an alignment of sizeof(T) bytes.

So, given the struct:

struct ST
{
   char ch1;
   short s;
   char ch2;
   long long ll;
   int i;
};
  • ch1 is at offset 0
  • a padding byte is inserted to align...
  • s at offset 2
  • ch2 is at offset 4, immediately after s
  • 3 padding bytes are inserted to align...
  • ll at offset 8
  • i is at offset 16, right after ll
  • 4 padding bytes are added at the end so that the overall struct is a multiple of 8 bytes. I checked this on a 64-bit system: 32-bit systems may allow structs to have 4-byte alignment.

So sizeof(ST) is 24.

It can be reduced to 16 bytes by rearranging the members to avoid padding:

struct ST
{
   long long ll; // @ 0
   int i;        // @ 8
   short s;      // @ 12
   char ch1;     // @ 14
   char ch2;     // @ 15
};
dan04