tags:

views:

138

answers:

5

I'm hoping that the elements are 1 byte aligned and similarly that a std::vector<int> is 4 byte aligned ( or whatever size int happens to be on a particular platform ).

Does anyone know how standard library containers get aligned?

+1  A: 

Do you mean the vector members, or the vector structure itself? Members are guaranteed to be contiguous in memory but structure alignment is platform/compiler-dependent. On Windows this can be set at compile-time and also overridden using #pragma pack().

The answer for other containers is likely not the same as for vector, so I would ask specific questions about the ones you care about.

Steve Townsend
+8  A: 

The elements of the container have at least the alignment required for them in that implementation: if int is 4-aligned in your implementation, then each element of a vector<int> is an int and therefore is 4-aligned. I say "if" because there's a difference between size and alignment requirements - just because int has size 4 doesn't necessarily mean that it must be 4-aligned, as far as the standard is concerned. It's very common, though, since int is usually the word size of the machine, and most machines have advantages for memory access on word boundaries. So it makes sense to align int even if it's not strictly necessary. On x86, for example, you can do unaligned word-sized memory access, but it's slower than aligned. On ARM unaligned word operations are not allowed, and typically crash.

vector guarantees contiguous storage, so there won't be any "padding" in between the first and second element of a vector<char>, if that's what you're concerned about. The specific requirement for std::vector is that for 0 < n < vec.size(), &vec[n] == &vec[0] + n.

[Edit: this bit is now irrelevant, the questioner has disambiguated: The container itself will usually have whatever alignment is required for a pointer, regardless of what the value_type is. That's because the vector itself would not normally incorporate any elements, but will have a pointer to some dynamically-allocated memory with the elements in that. This isn't explicitly required, but it's a predictable implementation detail.]

Every object in C++ is 1-aligned, the only things that aren't are bitfields, and the elements of the borderline-crazy special case that is vector<bool>. So you can rest assured that your hope for std::vector<char> is well-founded. Both the vector and its first element will probably also be 4-aligned ;-)

As for how they get aligned - the same way anything in C++ gets aligned. When memory is allocated from the heap, it is required to be aligned sufficiently for any object that can fit into the allocation. When objects are placed on the stack, the compiler is responsible for designing the stack layout. The calling convention will specify the alignment of the stack pointer on function entry, then the compiler knows the size and alignment requirement of each object it lays down, so it knows whether the stack needs any padding to bring the next object to the correct alignment.

Steve Jessop
@Steve, thanks. I edited the question title in response. "Every object in C++ is 1-aligned" - sorry, I may not understand alignment correctly - but how can the vector container be 4-aligned then?
BeeBand
Good, thanks. Then the answer is indeed 1, which is the only alignment `char` can have in a C++ implementation. The way container elements get aligned is what I said about allocating memory from the heap. To be specific, the standard puts requirements on `malloc` and `new` regarding the alignment of the pointer returned, and on the template parameter `Allocator` of standard containers.
Steve Jessop
@BeeBand: something is X-aligned if its address is divisible by X. So every address is 1-aligned. If something is 8-aligned then it is also 4-aligned, and so on. To specify that something is 4-aligned but not necessarily 8-aligned, you could say that it's "only 4-aligned". To say that something is 4-aligned and definitely not 8-aligned, you could say that it's "4-aligned but 8-misaligned".
Steve Jessop
@Steve Jessop. Aha. Yes of course. Ok - so going back to the `vector<char>`. Are you saying that the addresses of both the vector, and its first element are 4-aligned, and the remaining vector elements are 1-aligned?
BeeBand
@Steve Jessop, ah ok, I just re-read your answer. What you said about padding implies that the remaining elements are 1-aligned in that case.
BeeBand
@BeeBad: in practice, if everything eventually comes down to `malloc` or some other OS-specific allocation function, and if there's a 4-aligned type in your implementation, then every allocation will be 4-aligned (8 on linux, I think), so the first element of a `vector<char>` will be too. I'd have to hit the book to be sure exactly what is guaranteed - I think it's probably legal for the first element not to be 4-aligned, but I doubt that any implementation would do that unless there are no alignment requirements whatsoever on the system. Maybe not even then.
Steve Jessop
@Steve Jessop, where's the 4-aligned type in a `vector<char>`? The `vector` itself?
BeeBand
@BeeBand: when I say, "If there's a 4-aligned type in your implementation", I mean if there exists in your compiler a type which your compiler 4-aligns. Since `malloc` is a general-purpose allocator, it will cover the worst, most-aligned case (excluding SIMD types, usually). "Implementation" means compiler + OS, more or less.
Steve Jessop
@Steve Jessop. Ok. Are you saying that the address of the first element of the `vector<char>` is 4 aligned and the subsequent elements are 1 aligned?
BeeBand
@Steve: "but will have a pointer to some dynamically-allocated memory with the elements in that." That's incorrect, for example Visual Studio 2010's implementation has a small buffer (of about 16 bytes) to avoid heap allocation for very small vectors.
Matthieu M.
@BeeBand: I'm saying that's probably true of your implementation.
Steve Jessop
@Matthieu: such small vectors are what I meant by "not normal", but yes with that implementation vectors might need their element-type alignment, as well as pointer-alignment. Or perhaps max-alignment for simplicity.
Steve Jessop
@Steve: not necessarily. The internal buffer is raw (generally `char[]`) and therefore does not influence the alignment. There is however the issue of putting the objects at the correct alignment within this raw memory area, but that's typical of low-level allocation strategies (and what intptr_t is for).
Matthieu M.
@Matthieu: Oh right, I was thinking you could do this at compile-time by making a union of the `char[]` and the value_type (or of `value_type[]` with whatever goes in there when there is an external buffer). To be honest if it's a 16 byte buffer, and there's any need to manually align within it, then you might only fit a single 8-byte object in there. Personally I'd want to give up at that point, and only use the internal buffer when the value_type's alignment is no bigger than the alignment the buffer has naturally, probably 4 on Win32.
Steve Jessop
@Steve: I haven't look that deep, only looked up the actual attributes, but I guess it's only used with "small" types :)
Matthieu M.
A: 

The alignment of the whole container is implementation dependent. It is usually at least sizeof(void*), that is 4 or 8 bytes depending on the platform, but may be larger.

drhirsch
Or smaller.....
Yossarian
+2  A: 

I'm hoping that the elements are 1 byte aligned and similarly that a std::vector is 4 byte aligned ( or whatever size int happens to be on a particular platform ).

To put it simply, std::vector is a wrapper for a C array. The elements of the vector are aligned as if they were in the array: elements are guaranteed to occupy continues memory block without any added gaps/etc, so that std::vector<N> v can be accessed as a C array using the &v[0]. (Why vector has to reallocate storage sometimes when elements are added to it.)

Does anyone know how standard library containers get aligned?

Alignment of elements is platform specific but generally a simple variable is aligned so that its address is divisible by its size (natural alignment). Structures/etc are padded (empty filler space at the end) on the largest data type they contain to ensure that if the structure is put into an array, all fields would retain their natural alignment.

For other containers (like std::list or std::map) use data via template mechanics are made a part of internal structure and the structure is allocated by operator new. The new is guaranteed (custom implementation must obey the rule too; inherited from the malloc()) to return memory block which is aligned on largest available primitive data type (*). That is to ensure that regardless what structure or variable would be places in the memory block, it will be accessed in aligned fashion. Unlike std::vector, obviously, the elements of most other STL containers are not guaranteed to be within the same continuous memory block: they are newed one by one, not with new[].

(*) As per C++ standard, "The allocation function (basic.stc.dynamic.allocation) called by a new-expression (expr.new) to allocate size bytes of storage suitably aligned to represent any object of that size." That is a softer requirement compared to one malloc() generally abides, as per POSIX: "The pointer returned if the allocation succeeds shall be suitably aligned so that it may be assigned to a pointer to any type of object [...]". C++ requirement in a way reenforces the natural alignment requirement: dynamically allocated char would be aligned as char requires, but not more.

Dummy00001
@Dummy00001, So I guess your last point is saying that the elements in other STL containers, (not `std::vector` ) are still aligned on their natural alignment, despite being `new` ed one by one - have I understood you right?
BeeBand
@BeeBand: Yes. From C++ std, 18.4.1.1: "The allocation function (basic.stc.dynamic.allocation) called by a new-expression (expr.new) to allocate size bytes of storage suitably aligned to represent any object of that size." N.B. It appears that C++ has relaxed the rule, generally followed by the `malloc`, as per POSIX: "The pointer returned if the allocation succeeds shall be suitably aligned so that it may be assigned to a pointer to any type of object [...]"
Dummy00001
A: 

If special (guaranteed) alignment is needed use plain arrays or write/adapt some generic array class with:

// allocation
char* pointer = _mm_malloc(size, alignment);
// deallocation
_mm_free(pointer);
tauran