Why isn't the size of a struct equal to the sum of the sizes of its individual member types?

+3 Q:

Why isn't the size of a struct equal to the sum of the sizes of its individual member types?

+6 A:

It is because the compiler inserts padding so that each member sits at the alignment required by the architecture you are compiling for.

It can be for one of several reasons, but usually:

Because some CPUs simply cannot read a multi-byte value when it isn't at an address that is a multiple of its own size.
Because CPUs that can read misaligned data may do it much more slowly than aligned data.

You can often turn padding off with a compiler-specific directive or pragma.

When you do this, the compiler will generate relatively inefficient code to access the off-aligned data using multiple read/write operations.

Amardeep 2010-07-29 20:18:45

`CPU's simply cannot read a multi-byte value`. Can you give a few examples of such CPUs? Why can't they read a multi-byte value unless it is at an address that is a multiple of its own size? I mean, what exactly is happening underneath? `word alignment that is specific to the architecture` So, each architecture specifies its own word alignment value? In which section of an architecture's instruction manual can I usually find it?

claws 2010-07-29 20:46:36

One example is the ARM architecture. Another is MIPS. Their behavior is somewhat different. If you try to read a longword (32 bits) from address 0x00000005 on an ARM, it will actually read it from address 0x00000004 and give you the wrong value. On the MIPS you might get a memory error/exception. The specific requirements are very well described in their respective Instruction Set Architecture manuals, but I can't quote the exact section from memory. As for why, they are low-gate-count RISC CPUs and simply don't have the hardware to shuffle bus lanes around to give you the result.

Amardeep 2010-07-29 21:34:53

The compiler can insert padding between members or at the end of the struct. Padding between members is typically done to keep the members aligned to maximize access speed. Padding at the end is done for roughly the same reason, in case you decide to create an array of the structs.

To prevent this, you can use something like `#pragma pack(1)`.

Jerry Coffin 2010-07-29 20:20:46

+3 A:

This is called padding: the compiler adds extra bytes in order to align the structure on addresses that are divisible by some specific number, usually 2, 4, or 8. A compiler can also place padding between members to align the fields themselves on those boundaries.

This is a performance optimization: access to aligned memory addresses is faster, and some architectures don't even support accessing unaligned addresses.

For VC++, you can use the pack pragma to control padding between fields. However, note that different compilers have different ways of handling this, so if, for example, you also want to support GCC, you'll have to use a different declaration for that.

Michael Madsen 2010-07-29 20:24:41

+1 Thanks for explaining that controlling this is compiler-specific. But I really don't like it when people leave out the last 2% of the information instead of making the answer complete. Please add what to use for GCC to make it complete.

claws 2010-07-29 20:29:21

@claws: As far as I'm aware, placing `__attribute__((__packed__));` after the } that ends the struct should do this - but I've never tried this myself with GCC (or VS, for that matter), and for all I know, there could be other ways as well. Of course, that's *still* not complete, because there are other compilers as well, such as Borland/CodeGear/Embarcadero C++ (and I have even *less* of an idea how to do it there).

Michael Madsen 2010-07-30 00:06:28
