+3  Q: 

C++ member layout

Hello! Let's say we have a simple structure (POD).

struct xyz
{
    float x, y, z;
};

May I assume that the following code is OK? May I assume there are no gaps between the members? What does the standard say? Is it true for PODs? Is it true for classes?

xyz v;
float* p = &v.x;
p[0] = 1.0f;
p[1] = 2.0f; // Is it ok?
p[2] = 3.0f; // Is it ok?
+3  A: 

No, it is not OK to do so except for the first field.

From the C++ standards:

9.2 Class members
A pointer to a POD-struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [Note: There might therefore be unnamed padding within a POD-struct object, but not at its beginning, as necessary to achieve appropriate alignment.]
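
As a minimal sketch of what that paragraph does guarantee (and nothing more), the check below only looks at the initial member. The name check_initial_member is made up for this illustration; it says nothing about where y and z end up.

```cpp
#include <cassert>
#include <cstddef>   // offsetof

struct xyz
{
    float x, y, z;
};

// The quoted rule implies there is no padding before the first member.
static_assert(offsetof(xyz, x) == 0, "no padding before the initial member");

// Hypothetical helper: converts a pointer to the struct into a pointer to
// its initial member, which is the only conversion the quote covers.
void check_initial_member()
{
    xyz v = {1.0f, 2.0f, 3.0f};
    float* p = reinterpret_cast<float*>(&v);
    assert(p == &v.x);    // same address as the first member
    assert(*p == 1.0f);   // reading the first member through p is fine
}
```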

AraK
+1  A: 

Depends on the hardware. The standard explicitly allows POD classes to have unspecified and unpredictable padding. I noted this on the C++ Wikipedia page and grabbed the footnote with the spec reference for you.

^ a b ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++ §9.2 Class members [class.mem] para. 17

In practical terms, however, on common hardware and compilers it will be fine.

bmargulies
+5  A: 

This is not guaranteed by the standard, and will not work on many systems. The reasons are:

  • The compiler may align struct members as appropriate for the target platform, which may mean 32-bit alignment, 64-bit alignment, or anything else.
  • The size of the float might be 32 bits, or 64 bits. There's no guarantee that it's the same as the struct member alignment.

This means that p[1] might be at the same location as xyz.y, or it might overlap partially, or not at all.
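
If you decide to rely on the layout anyway, one option is to make the compiler refuse to build on any platform where the assumption is false. The following is only a sketch: to_array and demo are names invented here, and the asserts pin down the layout only. They do not make indexing past a single float member something the standard defines, so the copy goes through std::memcpy instead.

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstring>   // std::memcpy

struct xyz
{
    float x, y, z;
};

// Fail the build on any implementation that pads this struct.
static_assert(offsetof(xyz, y) == sizeof(float), "padding between x and y");
static_assert(offsetof(xyz, z) == 2 * sizeof(float), "padding between y and z");
static_assert(sizeof(xyz) == 3 * sizeof(float), "padding after z");

// Hypothetical helper: with the layout checked above, the bytes of an xyz
// are exactly the bytes of a float[3], so they can be copied into one.
void to_array(const xyz& v, float (&out)[3])
{
    std::memcpy(out, &v, sizeof out);
}

void demo()
{
    xyz v = {1.0f, 2.0f, 3.0f};
    float a[3];
    to_array(v, a);
    assert(a[0] == 1.0f && a[1] == 2.0f && a[2] == 3.0f);
}
```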

JSBangs
As JaredPar mentioned, this is not true. If you only have float members and no access specifiers, the alignment will be the same as for arrays.
MartinStettner
However, if you do know the alignments and sizes of the built-in types, the arrangement of POD structure members with no access modifiers is predictable. This is important for, e.g., declaring structures that describe memory-mapped hardware interfaces.
moonshadow
@MartinStettner: JaredPar's answer doesn't contradict this answer. He just gets a little more detailed into when it is or is not likely to work. It's perfectly possible for a system to want 8-byte alignment on 4-byte floats, for example, and the Standard certainly doesn't forbid it.
David Thornley
I agree it's not guaranteed, but I doubt it "will not work on many systems". We've used much more complex forms of this `struct` definition on several platforms, and have not seen a `float` padded to 8 bytes. The only time padding got weird was when mixing different-sized types, and we deal with that by putting in padding fields to force proper alignment. I doubt `float` will ever refer to an 8-byte IEEE 754 type because there is no support in the standard for `short float`. This is a textbook 99% case.
Mike DeSimone
+1  A: 

The standard requires that the order of arrangement in memory match the order of definition, but allows arbitrary padding between them. If you have an access specifier (public:, private: or protected:) between members, even the guarantee about order is lost.

Edit: in the specific case of all three members being of the same primitive type (i.e. not themselves structs or anything like that) you stand a pretty fair chance -- for primitive types, the object's size and alignment requirements are often the same, so it works out.

OTOH, this is only by accident, and tends to be more of a weakness than a strength; the code is wrong, so ideally it would fail immediately instead of appearing to work, right up to the day that you're giving a demo for the owner of the company that's going to be your most important customer, at which time it will (of course) fail in the most heinous possible fashion...
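
If indexed access to the members is the actual goal, it can be had without depending on padding at all. A sketch, assuming a table of pointers-to-member is acceptable (component and demo are hypothetical names, not anything from the question):

```cpp
#include <cassert>
#include <cstddef>   // std::size_t

struct xyz
{
    float x, y, z;
};

// Hypothetical helper: selects a member through a pointer-to-member, so the
// result is correct no matter how the compiler pads the struct.
float& component(xyz& v, std::size_t i)
{
    static float xyz::* const members[] = { &xyz::x, &xyz::y, &xyz::z };
    assert(i < 3);
    return v.*members[i];
}

void demo()
{
    xyz v = {0.0f, 0.0f, 0.0f};
    component(v, 0) = 1.0f;
    component(v, 1) = 2.0f;
    component(v, 2) = 3.0f;
    assert(v.x == 1.0f && v.y == 2.0f && v.z == 3.0f);
}
```

The cost is an extra table lookup per access, which an optimizer can often remove when the index is a constant.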

Jerry Coffin
+8  A: 

The answer here is a bit tricky. The C++ standard says that POD data types will have C layout compatibility guarantees (Reference). According to section 9.2 of the C++ spec, the members of a struct will be laid out in sequential order if:

  1. There is no accessibility modifier difference
  2. No alignment issues with the data type

So yes, this solution will work as long as the type float has a compatible alignment on the current platform (it's the platform word size). This should work on 32-bit processors, but my guess is that it would fail on 64-bit ones: essentially, anywhere that sizeof(void*) differs from sizeof(float).

JaredPar
I think it could also work on 64-bit machines (even though you don't have a guarantee from the standard): if float widens to 64 bits, everything is fine (sizeof(float) == platform word size). If float stays at 32 bits, then 32 bits of padding might be added to the end of the structure, but the floats could still lie in sequential order without padding between them. But this is all guessing, and there still could be some weird 64-bit machine using its own 64 bits for each 32-bit float :-)
rstevens
In practice this should work on any platform I can think of. As long as sizeof(float) == alignment_of(float), essentially. But it's not guaranteed by the standard, since alignment is implementation-defined. I don't think sizeof(void*) has anything to do with it.
jalf
@jalf, My understanding of alignment is limited and I tend to think of alignment in terms of pointer sizes so I'd defer to your understanding here
JaredPar
Well, as a general rule of thumb, for a primitive type T to be properly aligned, its address must be divisible by sizeof(T). Since alignment is generally a power of 2, structs simply use the alignment of the most aligned member, so to speak. A char can be placed at any address, a short must typically be placed at an even address, and a double at an address divisible by 8. And a struct containing one of each would also have to be placed at an address divisible by 8. But of course, virtually all of this is implementation-defined. Most platforms work like this though.
jalf
Using GCC 4 on 64-bit Linux, 32-bit and 64-bit builds of a trivial program have identical packing. (Although it is an LP64 platform.) My understanding is that unless successive members of a `struct` differ in width, packing is tight.
greyfade
Even if they differ in length, it might still be tightly packed. For example, I'd expect (haven't tested this) `struct { int, char, char, char, char}` to be tightly packed too. The struct would have alignment 4 (the alignment requirement of int, which usually has sizeof(int) == 4), and the struct's size would be a multiple of 4. Again, of course, implementation-defined and all, but on common compilers and CPUs, I'd expect that to be true.
jalf
@jalf: We've got a lot of code that's worked fine for a decade with `struct`s like that, even when reading or writing files or sockets. We use them a lot for control blocks passed to functions. The biggest problem we ever had with them was byte order, solved by making the first field a 2 or 4-byte known constant.
Mike DeSimone
A: 

Your code is OK (so long as it only ever handles data generated in the same environment). The structure will be laid out in memory as declared if it is POD. However, in general, there is a gotcha you need to be aware of: the compiler will insert padding into the structure to ensure each member's alignment requirements are obeyed.

Had your example been

struct xyz
{
    float x;
    bool y;
    float z;
};

then z would have begun 8 bytes into the structure and sizeof(xyz) would have been 12, as floats are (usually) 4-byte aligned.

Similarly, in the case

struct xyz
{
    float x;
    bool y;
};

sizeof(xyz) == 8, to ensure ((xyz*)ptr)+1 returns a pointer that obeys x's alignment requirements.

Since alignment requirements / type sizes may vary between compilers / platforms, such code is not in general portable.
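
To see what a particular compiler actually does with the two examples above, the padding can be computed with offsetof and sizeof. This is a sketch only: the structs are renamed xyz_mixed and xy_mixed so both fit in one file, and padding_before_z, tail_padding and demo are invented names. The concrete numbers quoted above (z at offset 8, sizes 12 and 8) are typical rather than guaranteed, so the code asserts only the alignment properties.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

struct xyz_mixed      // the float/bool/float example, renamed
{
    float x;
    bool  y;
    float z;
};

struct xy_mixed       // the float/bool example, renamed
{
    float x;
    bool  y;
};

// Bytes of padding the compiler inserted between y and z.
std::size_t padding_before_z()
{
    return offsetof(xyz_mixed, z) - (offsetof(xyz_mixed, y) + sizeof(bool));
}

// Bytes of padding the compiler appended after y.
std::size_t tail_padding()
{
    return sizeof(xy_mixed) - (offsetof(xy_mixed, y) + sizeof(bool));
}

void demo()
{
    // z starts on a boundary suitable for a float.
    assert(offsetof(xyz_mixed, z) % alignof(float) == 0);
    // The size is a multiple of the alignment, so arrays of xy_mixed
    // keep every element's x properly aligned.
    assert(sizeof(xy_mixed) % alignof(xy_mixed) == 0);
}
```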

moonshadow
A: 

No, you may not assume that there are no gaps. You may check for your architecture, and if there aren't any and you don't care about portability, it will be OK.

But imagine a 64-bit architecture with 32-bit floats. The compiler may align the struct's floats on 64-bit boundaries, and your p[1] will give you junk, and p[2] will give you what you think you're getting from p[1], etc.

However, your compiler may give you some way to pack the structure. It still wouldn't be "standard" (the standard provides no such thing, and different compilers provide very incompatible ways of doing this), but it is likely to be more portable.
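
As one example of such an extension, GCC, Clang and MSVC all accept #pragma pack. The sketch below assumes one of those compilers; packed_record, natural_record and demo are names made up for the illustration, and a different compiler may need a different spelling or offer nothing equivalent.

```cpp
#include <cassert>

#pragma pack(push, 1)   // non-standard: request 1-byte member alignment
struct packed_record
{
    char  tag;
    float value;        // may now sit at an address a float* must not assume
};
#pragma pack(pop)       // restore the default packing

struct natural_record   // same members with the compiler's default layout
{
    char  tag;
    float value;
};

void demo()
{
    // With packing in effect there is no gap between tag and value.
    assert(sizeof(packed_record) == sizeof(char) + sizeof(float));
    assert(sizeof(natural_record) >= sizeof(packed_record));

    // Access packed members by name; the compiler then emits whatever
    // unaligned load or store the target needs.
    packed_record r;
    r.tag = 'a';
    r.value = 1.5f;
    assert(r.value == 1.5f);
}
```

Packing trades alignment for size, so on some targets every access to value gets slower, and taking a plain float* to it is best avoided.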

Tim Schaeffer
BTW, float is unlikely to be 64 bits on *any* platform: http://en.wikipedia.org/wiki/IEEE_754-2008
Tim Schaeffer
Oh, and you *should* worry about non-portabilities. They are like unhatched fleas on a dog: they may not itch yet, but if you don't clean them now, they will, and they will be much more difficult to get rid of when they do.
Tim Schaeffer
On the other hand, if the platform has a vector processing unit, it may align floats such that four will fit in a single 16-byte region. (As is the case with all of the x86 compilers I've used.)
greyfade
A: 

As others have pointed out, the alignment is not guaranteed by the spec. Many say it is hardware dependent, but actually it is also compiler dependent. Hardware may support many different formats. I remember that the PPC compiler supports pragmas for how to "pack" the data. You could pack it on 'native' boundaries or force it to 32-bit boundaries, etc.

It would be nice to understand what you are trying to do. If you are trying to 'parse' input data, you are better off with a real parser. If you are going to serialize, then write a real serializer. If you are trying to twiddle bits such as for a driver, then the device spec should give you a specific memory map to write to. Then you can write your POD structure, specify the correct alignment pragmas (if supported) and move on.
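
For the serialization case, a "real serializer" can be as small as copying each member separately, so that whatever padding the struct has never reaches the buffer. This is a sketch under assumptions: serialize, deserialize, xyz_wire_size and demo are invented names, and the bytes are still in the host's float format and byte order, which a real wire format would also have to pin down.

```cpp
#include <cassert>
#include <cstddef>   // std::size_t
#include <cstring>   // std::memcpy

struct xyz
{
    float x, y, z;
};

// Size of the serialized form: three floats, no padding.
const std::size_t xyz_wire_size = 3 * sizeof(float);

// Hypothetical serializer: writes the members one by one.
void serialize(const xyz& v, unsigned char* out)
{
    std::memcpy(out + 0 * sizeof(float), &v.x, sizeof(float));
    std::memcpy(out + 1 * sizeof(float), &v.y, sizeof(float));
    std::memcpy(out + 2 * sizeof(float), &v.z, sizeof(float));
}

// Hypothetical deserializer: reads the members back in the same order.
void deserialize(const unsigned char* in, xyz& v)
{
    std::memcpy(&v.x, in + 0 * sizeof(float), sizeof(float));
    std::memcpy(&v.y, in + 1 * sizeof(float), sizeof(float));
    std::memcpy(&v.z, in + 2 * sizeof(float), sizeof(float));
}

void demo()
{
    unsigned char buf[xyz_wire_size];
    xyz a = {1.0f, 2.0f, 3.0f};
    xyz b = {0.0f, 0.0f, 0.0f};
    serialize(a, buf);
    deserialize(buf, b);
    assert(b.x == 1.0f && b.y == 2.0f && b.z == 3.0f);
}
```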

Andrew Mellinger
A: 

When in doubt, change the data structure to suit the application:

struct xyz
{
    float  p[3];
};  

For readability you may want to consider:

struct xyz
{
    enum { x_index = 0, y_index, z_index, MAX_FLOATS};
    float p[MAX_FLOATS];

    float  X(void) const {return p[x_index];}
    void   X(const float& new_x) {p[x_index] = new_x;}

    float  Y(void) const {return p[y_index];}
    void   Y(const float& new_y) {p[y_index] = new_y;}

    float  Z(void) const {return p[z_index];}
    void   Z(const float& new_z) {p[z_index] = new_z;}
};

Perhaps even add some more encapsulation:

struct Functor
{
  virtual void operator()(const float& f) = 0;
};

struct xyz
{
  void for_each(Functor& ftor)
  {
     ftor(p[0]);
     ftor(p[1]);
     ftor(p[2]);
     return;
  }
  private:
     float p[3];
};

In general, if a data structure needs to be treated in two or more different ways, perhaps the data structure needs to be redesigned; or the code.
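
If the named members have to stay because other code uses v.x, v.y and v.z, another possible redesign is to add an indexing operator on top of them. A sketch, with the out-of-range behaviour (throwing std::out_of_range) being a choice made for this illustration rather than anything from the question:

```cpp
#include <cassert>
#include <cstddef>     // std::size_t
#include <stdexcept>   // std::out_of_range

struct xyz
{
    float x, y, z;

    // Maps an index to the matching named member; no layout assumptions.
    float& operator[](std::size_t i)
    {
        switch (i)
        {
            case 0:  return x;
            case 1:  return y;
            case 2:  return z;
            default: throw std::out_of_range("xyz index out of range");
        }
    }
};

void demo()
{
    xyz v = {0.0f, 0.0f, 0.0f};
    v[0] = 1.0f;
    v[1] = 2.0f;
    v[2] = 3.0f;
    assert(v.x == 1.0f && v.y == 2.0f && v.z == 3.0f);
}
```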

Thomas Matthews