I recently discovered that something compiles (though I'm not sure it's legal). The need for it comes from my project, which outputs machine code for a selected architecture (which may or may not be the architecture running the program). I would like to support architectures up to 64 bits right now, while still supporting the existing 32-bit and 16-bit ones. My current solution is for new_state's 'base' parameter to simply be a uint64_t, casting manually to 16 or 32 bits as needed. However, I discovered that you can define a union inside a function's parameter list, so a declaration like this compiles:

int pcg_new_state(pcg_state *s, int arch, void *mem, int sz,
                  union {
                      uint16_t b16;
                      uint32_t b32;
                      uint64_t b64;
                  } base, int self_running);

Is this kind of thing actually "legal", and is it supported by other compilers? Also, I can't figure out how to call this function without first creating a union variable and then passing it to new_state.

+1  A: 

I did a quick browse of the (Edit: C++) standard and didn't see anything prohibiting it in either 8.3.5 - Functions [dcl.fct] or 9.5 - Unions [class.union]. As a semi-educated guess, I'd say it's legal to pass a union, but not to declare one inline like that. GCC gives:

error: types may not be defined in parameter types.

So you'd have to define the type ahead of time.

Even if it's legal, however, that doesn't mean it's a good idea. Just looking it over, I'd suggest that overloading might provide a better solution. But of course you know your code best; I just think you might want to look for a simpler and more idiomatic solution.

Dan Olson
overloading is not possible with integer types like that, and I'm using C not C++, so that's thrown out the window anyway.
Earlz
I think that is perfectly valid C. But I think you cannot pass an argument (how would you create a union of the correct type?) unless the parameter is a pointer and you pass a null pointer.
Johannes Schaub - litb
Thinking about it, this looks like valid C code to me: *TU 1*: `void f(union A { int a; char b; } c) { ... }` and in *TU 2*: `union A { int a; char b; }; void f(union A c); int main(void) { union A a = { 1 }; f(a); }` - both unions have compatible types.
Johannes Schaub - litb
Why not post that as an answer, so we could actually read it? I had to paste it into vim and then edit it to work out what was going on. And at the end of the day you create a named instance of a named union, which the OP (for reasons opaque to me) seems to want to avoid.
anon
You can declare types inside parameter lists in C? That's really interesting... sure does give Vim's syntax highlighting a hard time though.
Dan Olson
@Neil, i generally avoid posting answers that i'm not really sure about. :) Too bad seeing ones own post being downvoted -.-
Johannes Schaub - litb
@litb Yeah, I can see you would be worrying about your tenuous grip on the SO first page :-)
anon
But seriously folks, are we agreed there is no way of using an ANONYMOUS union in the way the OP wants?
anon
A: 

How about this:

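#include <stdint.h>  /* uint16_t, uint32_t, uint64_t */
#include <tchar.h>   /* _tmain, _TCHAR (MSVC-specific) */

typedef struct _base
{
 /* Anonymous union: standard in C11, a common extension before that */
 union
 {
  uint16_t b16;
  uint32_t b32;
  uint64_t b64;
 };
}Base;

int func(Base b)
{
 return 0;
}

int _tmain(int argc, _TCHAR* argv[])
{
 Base b;
 b.b16 = 0xFFFF;
 func(b);

 return 0;
}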
Indeera
A: 

You can get this to compile in GCC by giving the union a name in the function declaration, but you wouldn't be able to use it because of the scope of the definition, as GCC warns:

test_union.c:14: warning: ‘union X’ declared inside parameter list
test_union.c:14: warning: its scope is only this definition or declaration, which is probably not what you want

Code:

int pcg_new_state(pcg_state *s, int arch, void *mem, int sz,
                  union X {
                      uint16_t b16;
                      uint32_t b32;
                      uint64_t b64;
                  } base, int self_running);
nagul
+5  A: 

To summarize: yes, that is valid in C, although it is illegal in C++. The C++ standard contains this note, which explains the difference:

Change: In C++, types may not be defined in return or parameter types. In C, these type definitions are allowed

Example:

void f( struct S { int a; } arg ) {} // valid C, invalid C++
enum E { A, B, C } f() {} // valid C, invalid C++
  • Rationale: When comparing types in different compilation units, C++ relies on name equivalence when C relies on structural equivalence. Regarding parameter types: since the type defined in a parameter list would be in the scope of the function, the only legal calls in C++ would be from within the function itself.
  • Effect on original feature: Deletion of semantically well-defined feature.
  • Difficulty of converting: Semantic transformation. The type definitions must be moved to file scope, or in header files.
  • How widely used: Seldom. This style of type definitions is seen as poor coding style.

The structural equivalence in C is implemented through the concept of "type compatibility". This allows C to treat many types as if they were identical even though they are theoretically distinct (being declared in two different translation units). In C++, this concept doesn't exist, because types have linkage and are matched to the same entity (e.g. to allow member functions to link against each other).

Note that the explanation cited above is based on C89, which did not consider the tag name of a struct when determining type compatibility. In a C89 draft, the relevant text reads as follows:

Moreover, two structure, union, or enumeration types declared in separate translation units are compatible if they have the same number of members, the same member names, and compatible member types; for two structures, the members shall be in the same order;

In C99, type checking is stricter: if one struct has a tag name, the other struct declaration must have the same tag name. So in your unnamed union case, to declare a function in another TU with a compatible type, you would again need an unnamed union if you want valid C99 code (without undefined behavior). You cannot use the "trick" of a named union in one TU and an unnamed union in another. It looks to me as though this "trick" is valid in C89, though. C99 TC3 6.2.7/1:

Moreover, two structure, union, or enumerated types declared in separate translation units are compatible if their tags and members satisfy the following requirements: If one is declared with a tag, the other shall be declared with the same tag. If both are complete types, then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types, and such that if one member of a corresponding pair is declared with a name, the other member is declared with the same name. For two structures, corresponding members shall be declared in the same order.


The way you want to do it doesn't work. Calling a function will convert the arguments to the type of the parameters as if by normal assignment.

So for this to work, you need an argument that is compatible with the parameter type. For two unions declared in the same translation unit, this means their types must be the same: that is the only way to get a compatible type within one translation unit. But this cannot work, because each declaration of an unnamed union creates a unique new type, and there is no way to "refer back" to it from another declaration.

So, to summarize: you have to give the union type a name. To avoid creating a separate variable to pass the needed base argument, I would declare the union outside the function and write functions that return a union you can pass along:

union base_type {
        uint16_t b16;
        uint32_t b32;
        uint64_t b64;
};

int pcg_new_state(pcg_state *s,int arch,void *mem,int sz,
                  union base_type base,int self_running);

union base_type base_b16(uint16_t t) 
{ union base_type b; b.b16 = t; return b; }
union base_type base_b32(uint32_t t) 
{ union base_type b; b.b32 = t; return b; }
union base_type base_b64(uint64_t t) 
{ union base_type b; b.b64 = t; return b; }

Now a call can look like this:

pcg_new_state(...., base_b32(4211), ....);
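For anyone who wants to try this end to end, here is a minimal sketch of the wrapper approach as one compilable unit. The layout of pcg_state, the body of pcg_new_state, the convention that arch holds the bit width, and the demo_wrappers helper are all assumptions made for illustration; none of them come from the question.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: the real pcg_state is not shown in the question. */
typedef struct {
    uint64_t base;
    int arch;
} pcg_state;

union base_type {
    uint16_t b16;
    uint32_t b32;
    uint64_t b64;
};

/* static inline, so the wrappers could live in a header and cost no call. */
static inline union base_type base_b16(uint16_t t)
{ union base_type b; b.b16 = t; return b; }
static inline union base_type base_b32(uint32_t t)
{ union base_type b; b.b32 = t; return b; }
static inline union base_type base_b64(uint64_t t)
{ union base_type b; b.b64 = t; return b; }

/* Hypothetical body: assumes arch is the target width in bits (16, 32, 64)
   and reads only the union member that the matching wrapper wrote. */
int pcg_new_state(pcg_state *s, int arch, void *mem, int sz,
                  union base_type base, int self_running)
{
    (void)mem; (void)sz; (void)self_running;   /* unused in this sketch */
    switch (arch) {
    case 16: s->base = base.b16; break;
    case 32: s->base = base.b32; break;
    case 64: s->base = base.b64; break;
    default: return -1;                        /* unsupported width */
    }
    s->arch = arch;
    return 0;
}

/* Example call site: no temporary union variable and no cast is needed. */
void demo_wrappers(void)
{
    pcg_state s;
    int rc = pcg_new_state(&s, 32, 0, 0, base_b32(4211), 0);
    assert(rc == 0);
    assert(s.base == 4211);
}
```

The caller picks the wrapper that matches the target width, so the union member that gets read is always the one that was written.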
Johannes Schaub - litb
+1 But you are creating a named union instance, albeit in the functions. My understanding (which may very well be wrong) was that the OP didn't want to create an instance anywhere but in the parameter list (where it would be nameless). And I don't think that's possible.
anon
Yeah right, i wanted to propose an "alternative" to his way :) I thought the main reason for him not to like it was the boilerplate code to create a local variable. With a wrapper function, there isn't a need for it anymore :)
Johannes Schaub - litb
yes, that is an acceptable alternative for me.. I just don't want to resort to mass quantities of (uint32_t) or temporary variables, and also for the calling to be convenient to the user of the function. And no performance hit by function calls cause these are simple enough to be inline.
Earlz
If the compiler supports C99, one could get rid of the wrapper functions by using compound literals, e.g. `pcg_new_state(...., (union base_type){ .b32 = 4211 }, ....);`
Christoph
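The compound-literal suggestion can be sketched as follows, assuming a C99 (or later) compiler. take_base32 is a hypothetical stand-in for pcg_new_state, reduced to the one parameter that matters here, and demo_compound_literal is likewise only for illustration.

```c
#include <assert.h>
#include <stdint.h>

union base_type {
    uint16_t b16;
    uint32_t b32;
    uint64_t b64;
};

/* Hypothetical consumer standing in for pcg_new_state. */
static uint32_t take_base32(union base_type base)
{
    return base.b32;
}

void demo_compound_literal(void)
{
    /* The compound literal creates an unnamed union object in place; the
       designated initializer chooses which member is initialized. */
    uint32_t got = take_base32((union base_type){ .b32 = 4211 });
    assert(got == 4211);
}
```

The union type itself still needs a name; only the temporary object is unnamed.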
A: 

The size of the union will be the size of the largest member anyway - so you wouldn't be gaining anything. You might as well just make the parameter uint64_t, since conversions from larger to smaller unsigned types are well-defined and typically cheap to implement. (Eg. assigning a uint64_t to a uint16_t can be done just by taking the lowest-order 16 bits of the wider type).

EDIT: For example

int pcg_new_state(pcg_state *s,int arch,void *mem,int sz, uint64_t param_base ,int self_running)
{
    /* Implementation for 16 bit arch */
    uint16_t base = param_base;

    /* ... more code ... */
}
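As a small check of the claim about narrowing, the sketch below converts one 64-bit value to the narrower unsigned types. The helper names low16, low32 and demo_truncation are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Converting to a narrower unsigned type is well-defined in C: the result
   is the value reduced modulo 2^N, i.e. the low-order N bits. */
uint16_t low16(uint64_t v) { return (uint16_t)v; }
uint32_t low32(uint64_t v) { return (uint32_t)v; }

void demo_truncation(void)
{
    uint64_t param_base = UINT64_C(0x123456789ABCDEF0);
    assert(low16(param_base) == 0xDEF0u);
    assert(low32(param_base) == UINT32_C(0x9ABCDEF0));
}
```

By contrast, writing b64 and then reading b16 from a union reinterprets the stored bytes, so which 16 bits come back depends on the byte order of the machine.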
caf
Yes, but then everything must be cast to uint**_t or a temporary variable made, both of which are fugly if you have very much code in the function. And don't you think it's expensive for, say, a 16-bit arch to constantly cast a 64-bit int to a 16-bit int? (Take the 8086: it barely has support for 32-bit number conversions in its opcode set.)
Earlz
Pulling a uint16_t out of a union that includes a uint64_t is no more work for the processor than converting a uint64_t to a uint16_t - that's my point.
caf