First of all, on my system the following hold: sizeof(char) == 1 and sizeof(char*) == 4. So, adding up the members of the class below:

class SampleClass { char c; char* c_ptr; };

we would expect sizeof(SampleClass) to be 5. However, when we compile the code, sizeof(SampleClass) turns out to be 8.

So the question is: where does this calculation go wrong? :S

Language: C++ Compiler: gcc 4.4.0 OS: Tinycore

+4  A: 

Compilers usually add padding to structures so that each member sits on its own alignment boundary, typically a word boundary for pointers and ints (because accessing aligned locations requires fewer memory accesses and hence is faster).

So even though the char takes only 1 byte, c_ptr is shifted to the next 4-byte boundary, hence the result of 8 bytes.
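
A minimal sketch to see the padding directly, assuming a C++11 compiler for `alignof` (gcc 4.4 would need `__alignof__` instead). The members are made public here only so that `offsetof` can name them, and `print_sample_layout` is a hypothetical helper name, not something from the question:

```cpp
#include <cassert>
#include <cstddef>  // offsetof
#include <cstdio>

class SampleClass {
public:  // assumption: public only so offsetof can be used from outside
    char  c;
    char* c_ptr;
};

// Hypothetical helper: prints where each member lives and the total size.
void print_sample_layout() {
    std::printf("offsetof(c)     = %zu\n", offsetof(SampleClass, c));
    std::printf("offsetof(c_ptr) = %zu\n", offsetof(SampleClass, c_ptr));
    std::printf("sizeof          = %zu\n", sizeof(SampleClass));
    std::printf("alignof(char*)  = %zu\n", alignof(char*));

    // The pointer member always starts on a multiple of its own alignment;
    // the bytes between c and c_ptr are the padding.
    assert(offsetof(SampleClass, c_ptr) % alignof(char*) == 0);
}
```

Calling `print_sample_layout()` from `main` on a system like the one in the question (4-byte, 4-aligned pointers) should report offset 4 for `c_ptr` and a total size of 8; on a typical 64-bit system the numbers would be 8 and 16.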

casablanca
It makes sense now, but why does it take just 1 byte when a `char` is the only member? Why wouldn't the compiler pad it up to 4 bytes if access is faster?
sizeofProb
It's not that accessing 4 bytes is faster, it's just that accessing memory locations that are multiples of 4 bytes (on a 32-bit system) is faster. When there's only 1 char, the size is 1 but that 1 character will still be aligned on a 4-byte boundary.
casablanca
Got it, thx :) But it wouldn't be wrong to say that the size of the class is 5 instead of 8 when this compiler feature is turned off, right?
sizeofProb
Yes, in fact, there is a directive to do just that: [`#pragma pack`](http://gcc.gnu.org/onlinedocs/gcc/Structure_002dPacking-Pragmas.html)
casablanca
well, this source is very helpful to me. thank you :)
sizeofProb
@casablanca If `char` were aligned to 4 bytes, then you couldn't have an array of chars.
Let_Me_Be
@Let_Me_Be: What makes you think so? It would be a waste of memory, but apart from that, there is nothing that stops chars from being aligned to 4 bytes or any higher number.
casablanca
@casablanca: the alignment requirement of an object is expressed in terms of multiples of the size of a `char`. Since arrays occupy contiguous memory, the alignment requirement of a `char` necessarily is 1, which is its size. Now, "1" might represent 32 bits of memory, which *you* might call "4 bytes", but as far as that C++ environment is concerned, if it's the space a `char` occupies, then it's 1 byte.
Steve Jessop
I tested a class that has an array of 10 chars and a char pointer as members: its size was 14 when pack(1) is used and 16 when pack is not used. So can we simply say that only the total size of the class is rounded up to the next multiple of 4 bytes?
sizeofProb
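
A small sketch of that experiment, assuming a compiler that honours `#pragma pack` (gcc, clang and MSVC do). The struct names `Unpacked` and `Packed` and the helper `print_pack_sizes` are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>  // offsetof
#include <cstdio>

// Default layout: the compiler may pad after buf so that ptr is aligned.
struct Unpacked {
    char  buf[10];
    char* ptr;
};

// Packed layout: members are placed back to back, with no padding.
#pragma pack(push, 1)
struct Packed {
    char  buf[10];
    char* ptr;
};
#pragma pack(pop)

// Hypothetical helper: prints both layouts side by side.
void print_pack_sizes() {
    std::printf("Unpacked: ptr at %zu, size %zu\n",
                offsetof(Unpacked, ptr), sizeof(Unpacked));
    std::printf("Packed:   ptr at %zu, size %zu\n",
                offsetof(Packed, ptr), sizeof(Packed));

    // With pack(1) the size is exactly the sum of the member sizes.
    assert(sizeof(Packed) == sizeof(char[10]) + sizeof(char*));
}
```

With 4-byte pointers this should give 16 and 14, matching the numbers above; with 8-byte pointers it would typically be 24 and 18. Note that in the unpacked case the padding sits between `buf` and `ptr`, not at the end of the object.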
@Steve: I guess that's what Let_Me_Be meant and I misunderstood that comment. I was talking about such a general possibility rather than C's interpretation of sizes.
casablanca
@casablanca: yeah, I think what you originally said, "[to get best speed] when there's only 1 char, that 1 char will still be aligned on a 4-byte boundary" is true. The compiler is free to 4-align `char` automatic variables, classes containing only a single `char`, and so on, if it helps on the given hardware. It just can't make 4 an actual alignment *requirement* of `char`.
Steve Jessop
@sizeofProb: Not really, it depends on the contents of the structure. Different data types have different alignment requirements.
casablanca
It's not *just* about performance - on some architectures you will generate an exception (i.e. crash) if you attempt to read/write misaligned data (e.g. PowerPC).
Paul R
@sizeofProb: No, it is not the whole size of the object that is bumped to the next alignment boundary, but rather each element in the class. Consider `struct S { char a; short int b; short int c; int d; char e; int f; };` The compiler can add padding between member variables if it wishes (and for the previous structure it will in many cases, after each of the `char` members and before `d`). A compiler could generate code equivalent to `struct S1 { int8_t a; int8_t __unnamed1; int16_t b; int16_t c; int16_t __unnamed2; int32_t d; int8_t e; int8_t __unnamed3[3]; int32_t f; };`
David Rodríguez - dribeas
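
To check where the padding in `S` actually lands on a given compiler, something like the following could be used. It is only a sketch: it assumes C++11 for `alignof`, and `print_S_layout` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>  // offsetof
#include <cstdio>

struct S {
    char      a;
    short int b;
    short int c;
    int       d;
    char      e;
    int       f;
};

// Hypothetical helper: prints the offset of every member of S.
void print_S_layout() {
    std::printf("a at %zu\n", offsetof(S, a));
    std::printf("b at %zu\n", offsetof(S, b));
    std::printf("c at %zu\n", offsetof(S, c));
    std::printf("d at %zu\n", offsetof(S, d));
    std::printf("e at %zu\n", offsetof(S, e));
    std::printf("f at %zu\n", offsetof(S, f));
    std::printf("sizeof(S) = %zu\n", sizeof(S));

    // Each member starts on a multiple of its own alignment requirement.
    assert(offsetof(S, b) % alignof(short int) == 0);
    assert(offsetof(S, d) % alignof(int) == 0);
    assert(offsetof(S, f) % alignof(int) == 0);
}
```

On an ABI where `short int` is 2 bytes with 2-byte alignment and `int` is 4 bytes with 4-byte alignment, the offsets come out as 0, 2, 4, 8, 12, 16 with a total size of 20, which is the same layout as `S1` above.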
+3  A: 

This is caused by padding.
The compiler is adding padding:

  • to make access to members as fast as possible
  • also to make arrays of the object lay out so that access to each element is efficient.

So objects that have a size of 1 can be aligned to 1-byte boundaries and still be easy/efficient to read, while objects of size 4 need to be aligned on 4-byte boundaries (as appropriate to your compiler). Technically you can align them to 1-byte boundaries, but this means you usually need multiple instructions to extract and combine the pieces, so it is more efficient to keep them on 4-byte boundaries.

Thus for optimum alignment of structures it is best to order the members by size (largest first). This will give you the optimum packing strategy in most normal situations.

This will not stop your object being eight bytes though, because the compiler also takes into account that your class may be used in arrays. Each element in the array needs to be aligned so that the largest member of each element is aligned appropriately, so the size of the object is rounded up to a multiple of that alignment.
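
A rough illustration of both points, member ordering and the rounding needed for arrays. The struct names and the helper `print_order_sizes` are invented for this sketch, and the concrete numbers depend on the platform:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Small members wrapped around a large one: padding after each char.
struct SmallFirst {
    char a;
    int  b;
    char c;
};

// Largest member first: the two chars share the space after the int.
struct LargeFirst {
    int  b;
    char a;
    char c;
};

// Hypothetical helper: compares the two orderings and shows the array stride.
void print_order_sizes() {
    std::printf("sizeof(SmallFirst)    = %zu\n", sizeof(SmallFirst));
    std::printf("sizeof(LargeFirst)    = %zu\n", sizeof(LargeFirst));
    // Array elements are laid out back to back, so sizeof is also the stride.
    std::printf("sizeof(LargeFirst[2]) = %zu\n", sizeof(LargeFirst[2]));

    // The size is a multiple of the alignment, so that element [1] of an
    // array is aligned just as well as element [0].
    assert(sizeof(LargeFirst) % alignof(LargeFirst) == 0);
}
```

With a 4-byte, 4-aligned `int` this typically prints 12, 8 and 16. `LargeFirst` only holds 6 bytes of data, but it is still rounded up to 8 so that `b` stays aligned in every array element.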

Martin York