The C++ standard dictates that member variables inside a single access section must be laid out in memory in the same order in which they were declared. At the same time, compilers are free to choose the mutual ordering of the access sections themselves. This freedom makes it impossible in theory to link binaries created by different compilers. So what are the remaining reasons for the strict in-section ordering? And does the upcoming C++09 standard provide a way to fully determine object layouts "by hand"?
This freedom makes it impossible in theory to link binaries created by different compilers.
It's impossible for a number of reasons, and structure layout is the most minor: vtables, implementations of operator new and delete, data type sizes...
So what are the remaining reasons for the strict in-section ordering?
C compatibility, I would have thought, so that a struct defined in C packs the same way it does in C++ for a given compiler set.
And does the upcoming C++09 standard provide a way to fully determine object layouts "by hand"?
No, no more than the current standard does.
For a class or struct with no vtable and entirely private (or public) fields, though, it's already possible if you use the [u]int[8|16|32|64]_t types. What use case do you have for more than this?
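To illustrate the point about fixed-width types, here is a minimal sketch. The struct name PacketHeader and its fields are made up for the example, and the asserted offsets assume a typical ABI where each fixed-width type is aligned to its own size.

```cpp
#include <cstddef>      // offsetof
#include <cstdint>      // std::uint8_t, std::uint16_t, std::uint32_t
#include <type_traits>  // std::is_standard_layout (C++11)

// Hypothetical header type; the name and fields are illustrative only.
struct PacketHeader {
    std::uint32_t sequence;  // 4 bytes
    std::uint16_t length;    // 2 bytes
    std::uint8_t  kind;      // 1 byte
    std::uint8_t  flags;     // 1 byte
};

// Fixed-width types pin down each member's size. Ordering the members from
// widest to narrowest leaves no gap for padding on typical ABIs (assumption).
static_assert(std::is_standard_layout<PacketHeader>::value,
              "offsetof is only guaranteed for standard-layout types");
static_assert(offsetof(PacketHeader, sequence) == 0, "unexpected offset");
static_assert(offsetof(PacketHeader, length) == 4, "unexpected offset");
static_assert(offsetof(PacketHeader, kind) == 6, "unexpected offset");
static_assert(offsetof(PacketHeader, flags) == 7, "unexpected offset");
static_assert(sizeof(PacketHeader) == 8, "unexpected padding");
```

If any of the assertions fails on a given compiler, the build stops, which is usually what you want when a layout is assumed rather than guaranteed.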
EDIT: I'm afraid I misunderstood your question.
I think it's an optimization for memory access. For example, if we got this structure:
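#include <cstdint>  // fixed-width integer types

struct example
{
    std::int16_t shortData;  // 16 bits
    std::uint8_t byteData1;  // 8 bits
    std::int32_t intData1;   // 32 bits
    std::uint8_t byteData2;  // 8 bits
    std::int32_t intData2;   // 32 bits
};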
Let's suppose a word on this platform is 32 bits. Then you will need 4 complete words to hold all the data in the struct:
int16 + byte = 24 bits (the next field doesn't fit here)
int32 = 32 bits (the next field doesn't fit here)
byte = 8 bits (the next field doesn't fit here)
int32 = 32 bits
But if you rearrange the fields to:
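#include <cstdint>  // fixed-width integer types

struct example
{
    std::int16_t shortData;  // 16 bits
    std::uint8_t byteData1;  // 8 bits
    std::uint8_t byteData2;  // 8 bits, fills the rest of the first word
    std::int32_t intData1;   // 32 bits
    std::int32_t intData2;   // 32 bits
};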
then the data fits in three words and you can save one memory access.
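The effect of the two orderings can be checked with sizeof. This is only a sketch: the struct and member names are invented here, and the sizes mentioned afterwards assume a common ABI where int32_t is 4-byte aligned.

```cpp
#include <cstddef>  // offsetof
#include <cstdint>  // std::int16_t, std::uint8_t, std::int32_t

// Single-byte fields interleaved with 32-bit fields.
struct Interleaved {
    std::int16_t a;  // 2 bytes
    std::uint8_t b;  // 1 byte, typically followed by 1 byte of padding
    std::int32_t c;  // 4 bytes
    std::uint8_t d;  // 1 byte, typically followed by 3 bytes of padding
    std::int32_t e;  // 4 bytes
};

// Same members, with the two single-byte fields grouped together.
struct Grouped {
    std::int16_t a;  // 2 bytes
    std::uint8_t b;  // 1 byte
    std::uint8_t d;  // 1 byte, completes the first 32-bit word
    std::int32_t c;  // 4 bytes
    std::int32_t e;  // 4 bytes
};

// Grouping the small fields should never make the struct larger.
static_assert(sizeof(Grouped) <= sizeof(Interleaved),
              "grouping small members should not increase the size");
```

With 4-byte alignment for int32_t, Interleaved comes out at 16 bytes and Grouped at 12. The compiler is not allowed to make this rearrangement for you inside one access section, so it has to be done by hand.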
[edit] I learnt something new today! I found the following quote from the standard:
Nonstatic data members of a (non-union) class declared without an intervening access-specifier are allocated so that later members have higher addresses within a class object. The order of allocation of nonstatic data members separated by an access-specifier is unspecified (11.1). Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions (10.3) and virtual base classes (10.1).
Interesting - I have no idea why this degree of freedom is given. Continuing with the rest of my previous reply...
As mentioned, the reason for preserving the ordering is C compatibility, and back then I guess no one thought of the benefits of reordering members, while memory layout was typically done by hand anyway. Also, what would now be considered "ugly tricks" (like zeroing selected members with memset, or having two structs with the same layout) were quite common.
The standard does not give you a way to enforce a given layout, but most compilers provide measures to control padding, e.g. #pragma pack on MSVC compilers.
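As a sketch of what controlling padding looks like: the push/pop form of the pragma below is accepted by MSVC, and GCC and Clang accept it as well. The struct names are made up for the example.

```cpp
#include <cstddef>  // offsetof
#include <cstdint>  // std::uint8_t, std::uint32_t

// Default layout: the compiler may insert padding before `value`.
struct Padded {
    std::uint8_t  tag;
    std::uint32_t value;
};

#pragma pack(push, 1)  // request 1-byte packing for the definitions that follow
struct Packed {
    std::uint8_t  tag;
    std::uint32_t value;  // may now sit at a misaligned address
};
#pragma pack(pop)      // restore the previous packing setting

static_assert(sizeof(Packed) == 5, "packed struct should have no padding");
static_assert(offsetof(Packed, value) == 1, "value should follow tag directly");
static_assert(sizeof(Padded) >= sizeof(Packed), "packing should not grow the struct");
```

Note that this is a compiler extension, not something the standard gives you, and access to misaligned members can be slower or need special handling on some architectures.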
The reason for automatic padding is platform portability: different architectures have different alignment requirements, e.g. some architectures fault on misaligned ints (and these were the simple cases back then).
You are never supposed to link objects created by different compilers. Even if what you talk about were changed, you would still have far more issues preventing you from linking against another compiler's generated files (alignment, name mangling, and calling conventions, to name only a few).
One reason the compiler is free to reorder access sections might be so that it can establish its own order for them: for example, members with lower addresses could be more protected than members with higher addresses.
You would not gain anything if that reordering wasn't allowed: only PODs provide C compatibility, give you byte offsets of members inside a class/struct (using the macro offsetof), and allow you to memcpy them. A type becomes non-POD if you define a custom constructor, a copy constructor, a private member, or some other things. In particular, deriving from a class currently breaks PODness.
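A small sketch of the difference, assuming a compiler with the C++11 type traits (the thread predates them; they are used here only to make the check explicit). The type names are invented for the example.

```cpp
#include <cstddef>      // offsetof
#include <cstring>      // std::memcpy
#include <type_traits>  // std::is_standard_layout, std::is_trivially_copyable

// Hypothetical plain struct: all members public, no user-defined constructors.
struct PlainPoint {
    int x;
    int y;
};

// Hypothetical class mixing access sections: not standard-layout, so
// offsetof on it is not guaranteed to be supported.
class MixedAccess {
public:
    int visible;
    int get_hidden() const { return hidden; }
private:
    int hidden;
};

static_assert(std::is_standard_layout<PlainPoint>::value, "PlainPoint has a C-compatible layout");
static_assert(std::is_trivially_copyable<PlainPoint>::value, "PlainPoint may be copied with memcpy");
static_assert(!std::is_standard_layout<MixedAccess>::value, "mixed access control breaks standard layout");
static_assert(offsetof(PlainPoint, x) == 0, "first member sits at offset 0");

// Byte-wise copy; well-defined because PlainPoint is trivially copyable.
inline PlainPoint clone_bytes(const PlainPoint& src) {
    PlainPoint dst;
    std::memcpy(&dst, &src, sizeof dst);
    return dst;
}
```

Calling clone_bytes on a PlainPoint gives back an object with the same x and y; doing the same with a non-trivially-copyable type would be undefined behaviour.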
C++1x lowers the requirements for PODs. For example, in C++1x std::pair<T, U> is actually a POD, even though it provides its own constructor (which has to fit certain rules though).