I have a structure like this:

struct bar {
    char x;
    char *y;
};

I can assume that on a 32 bit system, that padding for char will make it 4 bytes total, and a pointer in 32 bit is 4, so the total size will be 8 right?

I know it's all implementation specific, but I think a size of 1-4 bytes should be padded to 4, 5-8 to 8, and 9-16 to 16. Is this right? It seems to work.

Would I be right to say that the struct will be 12 bytes in a x64 arch, because pointers are 8 bytes? Or what do you think it should be?

+1  A: 

It's a compiler switch; you can't assume anything. If you do, you may get into trouble.

For instance, in Visual Studio you can use #pragma pack(1) to pack members directly on byte boundaries.
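To make the effect of packing concrete, here is a minimal sketch. It assumes a compiler that accepts #pragma pack(push, 1) / #pragma pack(pop) (MSVC, GCC and Clang do); the name packed_bar is made up for the illustration.

```cpp
#include <cstddef>

// Default layout: the compiler may insert padding after x.
struct bar {
    char x;
    char *y;
};

// Same members, packed to 1-byte boundaries.
// packed_bar is a hypothetical name used only for this sketch.
#pragma pack(push, 1)
struct packed_bar {
    char x;
    char *y;
};
#pragma pack(pop)

// With a packing of 1 there is no padding at all.
static_assert(sizeof(packed_bar) == sizeof(char) + sizeof(char *),
              "packed_bar should contain no padding");
// The default layout is the same size or larger.
static_assert(sizeof(bar) >= sizeof(packed_bar),
              "default layout is never smaller than the packed one");
```

Note that y is no longer naturally aligned inside packed_bar, so accessing it may be slower, or not allowed at all on some architectures.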

Anders K.
+4  A: 

I can assume that on a 32 bit system, that padding for char will make it 4 bytes total, and a pointer in 32 bit is 4, so the total size will be 8 right?

It's not safe to assume that, but that will often be the case, yes. For x86, fields are usually 32-bit aligned. The reason for this is to increase the system's performance at the cost of memory usage (see here).

Would I be right to say that the struct will be 12 bytes in a x64 arch, because pointers are 8 bytes? Or what do you think it should be?

Similarly, for x64, fields are usually 64-bit/8-byte aligned, so sizeof(bar) would be 16.

As Anders points out, however, all this goes flying out the window once you start playing with alignment via /Zp, the pack directive, or whatever else your compiler supports.
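Rather than guessing, you can ask the compiler what it actually did. The following is only a sketch; print_bar_layout is a hypothetical helper name, and the numbers it prints depend on your compiler, target and packing options.

```cpp
#include <cstddef>
#include <cstdio>

struct bar {
    char x;
    char *y;
};

// Hypothetical helper: prints the layout the current compiler chose for bar.
inline void print_bar_layout() {
    std::printf("sizeof(char *)   = %zu\n", sizeof(char *));
    std::printf("offsetof(bar, x) = %zu\n", offsetof(bar, x));
    std::printf("offsetof(bar, y) = %zu\n", offsetof(bar, y));
    std::printf("sizeof(bar)      = %zu\n", sizeof(bar));
}
```

Call print_bar_layout() from main and build once per target. With default settings, a typical x86 build would be expected to report a size of 8 and a typical x64 build a size of 16, but nothing guarantees it.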

Chris Schmich
It's not safe to assume at all ;)
Exa
Making any assumption about alignment is stupid. Do not do it. Even changing the compiler flags can cause the padding/alignment to change.
Martin York
@Martin, if you do change the alignment or packing (which affects structure member alignment) you must know what you are doing, since your code won't be able to call standard C library and POSIX functions.
Maxim Yegorushkin
@Maxim: Avoiding POSIX for the moment. Even changing from release to debug could change the packing. A lot of systems provide two separate runtimes, one for debug and one for release, which allows for different packing/alignment between the two versions. Not to mention systems that don't have a real runtime :-) There is no requirement for a compiler to have the same characteristics when different flags are used. This is why build systems tend to build all files with the same flags, and why debug/release binaries are built into separate directories so they are not combined.
Martin York
@Martin: I am not aware of any system where debug and release builds have different alignment. Debug builds have optimizations off, extra run-time checks and debug symbols on, and that is all the difference. Moreover, having different alignments would lead to bugs not being discovered in the debug build, which defeats the purpose of debug builds. Could you provide an example of a system/platform where debug and release builds have different alignment, and explain why?
Maxim Yegorushkin
@Maxim Yegorushkin: Windows: uses a different heap implementation for debug and release, so mixing the two is not compatible. STL: there are versions that provide different behavior based on debug/release mode; this is done by having different (incompatible) objects implement these pieces. gcc: is good; adding the -g flag (for debugging) does not change object compatibility (thus allowing debugging of optimized code). BUT it does not guarantee object compatibility between optimization levels, because it does not specify the exact optimizations that will be done (and some are incompatible).
Martin York
@Maxim Yegorushkin: It is one of those areas that is undefined behavior (or is it implementation defined?). Either way, it may work now but may not work in the future. Thus it cannot (and therefore should not) be relied upon.
Martin York
@Martin: Yes, Windows heaps are screwed up; however, this has nothing to do with alignment at all. The debug and release Windows heaps are implemented differently, but the alignment is still the same.
Maxim Yegorushkin
@Maxim Yegorushkin: Yes, that is correct (for the current and previous versions of the compiler). Though as the whole runtime is different for each version, there is the possibility of the compiler implementing different alignments for release or debug. Would that not keep you awake all night whenever there was a new DevStudio update? Please read my original comment carefully. I never said they did. I just expressed concern that there was a potential for the compiler to do so (it is within its allowed behavior).
Martin York
@Martin: you are missing one important point: platform ABI. It covers details such as data types, size, and alignment and the calling convention. That is, if your compiler does not conform, it is pretty much useless. This is why any and all compilers for the same platform use the same data type sizes, alignment and packing.
Maxim Yegorushkin
@Maxim Yegorushkin: Actually that is exactly why C++ compilers are such a pain to match. Each has its own ABI. Therefore object files from different compilers (different versions of the same compiler, and even sometimes the same compiler with different flags) are not compatible. (Why do you think it is standard for makefiles to build release objects into one folder and debug objects into another? Why do ALL IDEs follow this convention?) Also, with two separate runtimes, each runtime can potentially support a different ABI (though currently they don't, I think). Note: the OS API is usually "C".
Martin York
@Martin: Different C++ compilers use different exception handling mechanisms, layouts for virtual tables and different name mangling conventions (there is no standard, although Intel and g++ on Linux use a standard Itanium C++ ABI). But the object files are still linked by C linkers. Moreover, C++ compilers still use the platform ABI's data sizes, alignment and structure packing. Regarding different folders for debug and release: you don't want to recompile the world when you switch between debug and release builds, and you don't want to mix and match debug and release objects.
Maxim Yegorushkin
@Maxim Yegorushkin: On Intel platforms (and thus AMD, I believe) they came together on an ABI (that is practically the C ABI). So yes, that is correct (but a very limited view of C++). But on nearly every other platform that is not true: look at HP (gcc vs aCC), Sun (gcc vs Forte), or any other. As for the linker, each tool chain uses its own linker, and these are not plug and play between compilers. The commonality between them is the C ABI; the C++ ABI is never the same (as each compiler is trying to optimise the ABI in its own way).
Martin York
+1  A: 

You can't assume anything in general. Every platform decides its own padding rules.

That said, any architecture that uses "natural" alignment, where operands are padded to their own size (necessary and sufficient to avoid straddling naturally-aligned pages, cachelines, etc), will make bar twice the pointer size.

So, given natural alignment rules and nothing more, 8 bytes on 32-bit, 16 bytes on 64-bit.
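The arithmetic behind those two numbers can be sketched as follows. This models natural alignment only; it is not what any particular compiler is required to do, and align_up and bar_size_natural are hypothetical helper names.

```cpp
#include <cstddef>

// Round offset up to the next multiple of align (align must be non-zero).
constexpr std::size_t align_up(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Size of struct { char x; char *y; } under natural alignment,
// for a given pointer size in bytes:
//   x sits at offset 0 and occupies 1 byte,
//   y is placed at the next multiple of the pointer size,
//   the total is rounded up to the largest alignment (the pointer's).
constexpr std::size_t bar_size_natural(std::size_t pointer_size) {
    return align_up(align_up(1, pointer_size) + pointer_size, pointer_size);
}

static_assert(bar_size_natural(4) == 8,  "32-bit pointers: 8 bytes");
static_assert(bar_size_natural(8) == 16, "64-bit pointers: 16 bytes");
```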

Potatoswatter
+1  A: 

$9.2/12-

Nonstatic data members of a (non-union) class declared without an intervening access-specifier are allocated so that later members have higher addresses within a class object. The order of allocation of nonstatic data members separated by an access-specifier is unspecified (11.1). Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions (10.3) and virtual base classes (10.1).

So, it is highly implementation specific as you already mentioned.

Chubsdad
A: 

Not quite.

Padding depends on the alignment requirement of the next member. The natural alignment of built-in data types is their size.

There is no padding before char members since their alignment requirement is 1 (assuming char is 1 byte).

For example, if a char (again assume it is one byte) is followed by a short, which, say, is 2 bytes, there may be up to 1 byte of padding because a short must be 2-byte aligned. If a char is followed by a double of size 8, there may be up to 7 bytes of padding because a double is 8-byte aligned. On the other hand, if a short is followed by a double, there may be up to 6 bytes of padding.
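These cases can be checked with offsetof. The sketch below assumes C++11 (for alignof and constexpr); the struct and constant names are made up for the illustration, and the exact padding still depends on the platform.

```cpp
#include <cstddef>

// Hypothetical structs mirroring the three cases described above.
struct char_then_short   { char c;  short s;  };
struct char_then_double  { char c;  double d; };
struct short_then_double { short s; double d; };

// Padding inserted between the first member and the second one.
constexpr std::size_t pad_cs = offsetof(char_then_short, s)   - sizeof(char);
constexpr std::size_t pad_cd = offsetof(char_then_double, d)  - sizeof(char);
constexpr std::size_t pad_sd = offsetof(short_then_double, d) - sizeof(short);

// The padding is always smaller than the alignment of the member it precedes.
static_assert(pad_cs < alignof(short),  "at most alignof(short) - 1 bytes");
static_assert(pad_cd < alignof(double), "at most alignof(double) - 1 bytes");
static_assert(pad_sd < alignof(double), "at most alignof(double) - 1 bytes");
```

Where double is 8-byte aligned, pad_cd would be 7 and pad_sd would be 6. Some 32-bit ABIs align double to only 4 bytes inside structures, which is why the answer says "up to".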

And the size of a structure is a multiple of the alignment of the member with the largest alignment requirement, so there may be tail padding. In the following structure, for instance,

struct baz {
    double d;
    char c;
};

the member with the largest alignment requirement is d; its alignment requirement is 8, which gives sizeof(baz) == 2 * alignof(double). There are 7 bytes of tail padding after member c.
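The tail padding can be computed rather than assumed. A minimal sketch, again assuming C++11; baz_tail_padding is a name invented for the example.

```cpp
#include <cstddef>

struct baz {
    double d;
    char c;
};

// Bytes after the last member that exist only to round the size up.
constexpr std::size_t baz_tail_padding =
    sizeof(baz) - (offsetof(baz, c) + sizeof(char));

// The size of a struct is always a multiple of its own alignment,
// so that every element of an array of baz is correctly aligned.
static_assert(sizeof(baz) % alignof(baz) == 0,
              "size is a multiple of the struct's alignment");
```

Where double is 8-byte aligned, baz_tail_padding comes out as 7 and sizeof(baz) as 16.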

gcc and other modern compilers support an __alignof() operator. There is also a portable version in boost.

Maxim Yegorushkin
A: 

As others have mentioned, the behaviour can't be relied upon between platforms. However, if you still need to do this, then one thing you can use is BOOST_STATIC_ASSERT() to ensure that if the assumptions are violated you find out at compile time, e.g.

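#include <boost/static_assert.hpp>
// The macros below are the GCC/Clang and MSVC spellings; use whatever your compiler predefines.
#if defined(__i386__) || defined(_M_IX86)        // 32-bit x86
  BOOST_STATIC_ASSERT(sizeof(bar)==8);
#elif defined(__x86_64__) || defined(_M_X64)     // 64-bit x86
  BOOST_STATIC_ASSERT(sizeof(bar)==16);
#else ...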

If alignof() is available you could also use that to test your assumption.
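With a C++11 compiler the same idea works without Boost, using the built-in static_assert and alignof. This is a sketch of testing one specific assumption (natural alignment with default packing), not a layout the language guarantees.

```cpp
#include <cstddef>

struct bar {
    char x;
    char *y;
};

// Assumption under test: bar is laid out with natural alignment,
// so it is aligned like a pointer and occupies two pointer-sized slots.
static_assert(alignof(bar) == alignof(char *),
              "bar is expected to be aligned like its pointer member");
static_assert(sizeof(bar) == 2 * sizeof(char *),
              "bar is expected to be twice the pointer size");
```

If someone later builds with a different packing option, compilation fails here instead of the program misbehaving at run time.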

the_mandrill