Why does the 'sizeof' operator return a size larger for a structure than the total sizes of the structure's members?
This is because of structure alignment. Structure alignment refers to the ability of the compiler to insert unused memory (padding) into a structure so that data members are optimally aligned for better performance. Many processors perform best when fundamental data types are stored at byte addresses that are multiples of their sizes.
Here's an example using typical settings for an x86 processor:
struct X
{
short s; /* 2 bytes */
/* 2 padding bytes */
int i; /* 4 bytes */
char c; /* 1 byte */
/* 3 padding bytes */
};
struct Y
{
int i; /* 4 bytes */
char c; /* 1 byte */
/* 1 padding byte */
short s; /* 2 bytes */
};
struct Z
{
int i; /* 4 bytes */
short s; /* 2 bytes */
char c; /* 1 byte */
/* 1 padding byte */
};
const int sizeX = sizeof(struct X); /* = 12 */
const int sizeY = sizeof(struct Y); /* = 8 */
const int sizeZ = sizeof(struct Z); /* = 8 */
One can usually minimize the size of structures by putting the largest data types at the beginning of the structure and the smallest data types at the end of the structure (like structure Z in the example above).
IMPORTANT NOTE: Both the C and C++ standards state that structure alignment is implementation-defined. Therefore each compiler may choose to align data differently, resulting in different and incompatible data layouts. For this reason, when dealing with libraries that will be used by different compilers, it is important to understand how the compilers align data. Some compilers have command-line settings and/or special #pragma directives to change the structure alignment settings.
The size can be larger if you have implicitly or explicitly set the alignment of the struct. A struct that is aligned to 4 bytes will always have a size that is a multiple of 4 bytes, even if the sizes of its members add up to something that is not a multiple of 4 bytes.
Also, a library may have been compiled for x86 with 32-bit ints while you are comparing its structures in a 64-bit process, which would give you a different result than adding up the sizes by hand.
This can be due to byte alignment and padding so that the structure comes out to an even number of bytes (or words) on your platform. For example in C on Linux, the following 3 structures:
#include <stdio.h>
struct oneInt {
int x;
};
struct twoInts {
int x;
int y;
};
struct someBits {
int x:2;
int y:6;
};
int main (int argc, char** argv) {
printf("oneInt=%zu\n",sizeof(struct oneInt));   /* sizeof yields size_t, so use %zu */
printf("twoInts=%zu\n",sizeof(struct twoInts));
printf("someBits=%zu\n",sizeof(struct someBits));
return 0;
}
have members whose sizes are 4 bytes (32 bits), 8 bytes (2 x 32 bits) and 1 byte (2 + 6 bits) respectively. The above program (on Linux using gcc) prints the sizes as 4, 8, and 4, where the last structure is padded so that it is a single word (4 x 8-bit bytes on my 32-bit platform).
oneInt=4
twoInts=8
someBits=4
If you want the structure to have a certain size, with GCC for example, use __attribute__((packed)).
On Windows you can set the alignment to one byte when using the cl.exe compiler with the /Zp option (for example /Zp1).
Usually it is easier for the system to access data whose address is a multiple of 4 (or 8), depending on the platform and also on the compiler.
So it is a matter of alignment basically.
You need to have good reasons to change it.
In addition to the other answers, in C++ a struct can (but usually doesn't) have virtual functions, in which case the size of the struct will also include the space for a pointer to the vtbl.