Unions aren't something I've used often, and after looking at a few other questions about them here, it seems there is almost always some caveat where they might not work, e.g. structs having unexpected padding, or endian differences.

I came across this in a math library I'm using, though, and I wondered whether it is a totally safe usage. I assume that multidimensional arrays don't have any extra padding, and since the element type is the same for both definitions, they are guaranteed to take up exactly the same amount of memory?

template<typename T> class Matrix44T
{
    ...

    union
    {
        T M[16];
        T m[4][4];
    } m;
};

Are there any downsides to this setup? Would the order of definition make any difference to how this works?

A: 

No. This seems like a fine use of a union under your assumptions.

I would have chosen better names than m and M, but other than that it is a nice use of a union.

brickner
Just that the assumptions have nothing to do with how a union has to be used.
Ingo
If your assumption were wrong, i.e. if a 4x4 array had a different size than a 16-element array, then it wouldn't have worked.
brickner
+1  A: 

If you play by the rules, padding and endian differences won't hurt you.

Look at this code

union { int a; float b; } wrong;

wrong.a = 1;
printf("%f", wrong.b);

This is wrong because if you have written member "a", then any reading except from "a" is undefined.

To sum up: you cannot tell whether a union is safe from its definition alone. It's not the definition that is unsafe, it's how it is used.
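As a minimal sketch of the distinction above (not taken from the library in the question): reading the member that was last written is always fine, and if the goal really is to look at a float's bytes as an integer, copying the object representation with `std::memcpy` avoids the union altogether. The names `IntOrFloat`, `read_what_was_written` and `float_bits` are hypothetical, and the sketch assumes a 32-bit `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Same shape as the union in the snippet above; the type name is hypothetical.
union IntOrFloat { int a; float b; };

// Defined: the member that was written is the member that is read.
int read_what_was_written()
{
    IntOrFloat u;
    u.a = 1;
    return u.a;
}

// Assumption of this sketch: float is 32 bits wide on the target.
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: copies the bytes of a float into an integer
// instead of reading an inactive union member.
std::uint32_t float_bits(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

The resulting bit pattern still depends on the platform's floating-point format, so this only removes the undefined read, not the portability question.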

Ingo
@Ingo: So technically my example is undefined behavior then? At least to write to M and then read from m - if someone wrote a strict compiler it wouldn't work for example.
identitycrisisuk
@Ingo, why is it undefined? If the union were an array of 4 bytes and an int, it would be legitimate to write as an int and read as 4 bytes (with endianness considerations). What does this have to do with the question?
brickner
@brickner - no, it's not - an integer may have 2 bytes, or 4, or 8. Or even 7? Show me where the standard says that an int must be 4 bytes. (This is not Java, mind you.)
Ingo
@identitycrisisuk - it may still work, you just can't rely on it.
Ingo
@Ingo: OK, int32_t and uint32_t
brickner
@brickner - and now tell me where it says that implementations MUST place different union members in the very same memory? No, a union is not a device for magic type changes, though it is often used that way. Nevertheless, it remains undefined.
Ingo
The GNU C Programming Tutorial (http://crasseux.com/books/ctutorial/union.html#union): "the space allocated for it is the space taken by its largest member"
brickner
@brickner: An implementation-specific tutorial? GNU CC is fine, I use it too. But it's not a standards body. Please give up, the discussion is pointless. See Andreas' answer.
Ingo
I don't understand why you use offensive terms like "Please give up". @Andreas says "In practice I think this will work fine on all compilers though."
brickner
@brickner Yes, but it's still technically UB.
Andreas Brinck
I didn't mean to be offensive. It's just that we are discussing different things. Me: is it "completely safe" (the original question); you: will it work on most platforms regardless. I do NOT deny the latter.
Ingo
+4  A: 

Although I do exactly the same in my own Matrix class, I think this is implementation dependent if you read the standard to the letter:

Standard 9.5.1:

In a union, at most one of the data members can be active at any time, that is, the value of at most one of the data members can be stored in a union at any time. [Note: one special guarantee is made in order to simplify the use of unions: If a POD-union contains several POD-structs that share a common initial sequence (9.2), and if an object of this POD-union type contains one of the POD-structs, it is permitted to inspect the common initial sequence of any of POD-struct members; see 9.2. ]

The question then is whether m and M share a common initial sequence. To answer this, we look at 9.2/15:

Two POD-union (clause 9) types are layout-compatible if they have the same number of nonstatic data members, and corresponding nonstatic data members (in any order) have layout-compatible types (3.9).

After reading this, the answer seems to be no: m and M are not layout-compatible in the strict sense of the word.

In practice I think this will work fine on all compilers though.
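For anyone who would rather not depend on that, here is a minimal sketch of one way to offer both views without a union: keep a single flat array and compute the 2D index in an accessor. `Matrix44Sketch` and `at` are hypothetical names, not the API of the library from the question, and the row-major indexing is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical alternative to the union: one flat array, two ways to index it.
template <typename T>
class Matrix44Sketch
{
public:
    Matrix44Sketch() : data_() {}  // value-initializes all 16 elements

    // 2D access, assuming row-major storage.
    T& at(std::size_t row, std::size_t col) { return data_[row * 4 + col]; }
    const T& at(std::size_t row, std::size_t col) const { return data_[row * 4 + col]; }

    // Flat access to the same storage.
    T& operator[](std::size_t i) { return data_[i]; }
    const T& operator[](std::size_t i) const { return data_[i]; }

private:
    T data_[16];
};
```

With this layout, `m.at(1, 2)` and `m[6]` refer to the same element, and no inactive union member is ever read.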

Andreas Brinck
@Andreas - thanks for the standard definitions. So I guess the only thing absolutely guaranteed to work would be two structs with identical type contents but different naming?
identitycrisisuk
@identitycrisisuk In a word: yes.
Andreas Brinck
Ingo
Ingo
Andreas Brinck
@Andreas: I am pretty sure that the compiler MUST lay out array elements contiguously. But before arguing, we should perhaps clarify exactly which language/version we are talking about here? :)
Ingo
An array must lay out its elements contiguously, according to their stride. That is, if `strideof(T) == 2`, then `void* p = &M[0]` is 2 bytes before `void* q = &M[1]`. However, the stride is implementation dependent and notably depends on the targeted architecture. Therefore, a compiler could decide that the stride of `T` is 4 and the stride of `T[4]` is 20, effectively preventing layout compatibility of `T[16]` and `T[4][4]`.
Matthieu M.
@Ingo yes, but the elements will be contiguously laid out as long as the (theoretical) compiler in my example also guarantees that `sizeof(float[4]) == 256`. Contrary to common belief, arrays are a distinct type.
Andreas Brinck
@Matthieu M. At first I was afraid you were going to disagree with me, but I see now that we're in total agreement, what a relief :)
Andreas Brinck
@Andreas: you're right more often than I am, so I would think twice and google (or should I say bing?) a bit before contradicting you :p
Matthieu M.
Ingo
Andreas Brinck
@Andreas if you're right, and I have little doubt you are, then this is just another reason why the intended use of the union (namely, as a magic type caster) is not valid. Have a nice weekend!
Ingo
@Ingo As you say, type casting is not a valid use of unions. Have a nice weekend too!
Andreas Brinck
Johannes Schaub - litb
@Johannes Is there really anything in the standard that explicitly states that, for instance, `sizeof(int[4]) == sizeof(int) * 4`?
Andreas Brinck
@Johannes I just read the complete 8.3.4 section of the standard and it says: "An object of array type *contains* a contiguously allocated non-empty set of N sub-objects of type T." In other words, there's nothing preventing the compiler from adding padding at either the start or the end of the *array* object.
Andreas Brinck
@Andreas Brinck, sure, there is wording that requires it. It is an *elementary* guarantee that the elements of an array are *not* padded. For this reason, structures contain internal padding so that they can be laid out next to each other in memory. `5.3.3/2` says about the `sizeof` operator: "When applied to an array, the result is the total number of bytes in the array. This implies that the size of an array of n elements is n times the size of an element." This is the whole reason why `sizeof(array) / sizeof(array[0])` works like it should.
Johannes Schaub - litb
Now if the compiler could pad arrays on their start or end, `int[4][4]` wouldn't be `4 * sizeof(int[4])` anymore.
Johannes Schaub - litb
Andreas Brinck
@Andreas I think we are both talking about the same thing. It doesn't seem that either of us is misunderstanding the other. Now I see we have come to a solution. Great! :)
Johannes Schaub - litb
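The `sizeof` rule quoted from 5.3.3/2 in the comments above can be turned into a compile-time check. A minimal sketch, assuming a C++11 compiler for `static_assert`; the function name `same_footprint` is hypothetical:

```cpp
#include <cassert>

// Hypothetical compile-time check: T[16] and T[4][4] occupy the same
// number of bytes, with no padding inside or around either array.
template <typename T>
constexpr bool same_footprint()
{
    static_assert(sizeof(T[16]) == 16 * sizeof(T),
                  "an array of 16 T must be exactly 16 elements wide");
    static_assert(sizeof(T[4][4]) == 4 * sizeof(T[4]),
                  "an array of 4 rows must be exactly 4 rows wide");
    static_assert(sizeof(T[4][4]) == sizeof(T[16]),
                  "both union members must have the same size");
    return true;
}
```

Instantiating `same_footprint<float>()` only confirms that the two members have the same size. It does not change the point made in the answers that reading a union member other than the one last written is not guaranteed by the standard.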