Possible Duplicate:
Difference between a Structure and a Union in C

I see this code for a union in C:

    union time
    {
      long simpleDate;
      double perciseDate;
    } mytime;

What is the difference between a union and a structure in C? Where would you use a union, what are its benefits? Is there a similar construct in Java, C++ and/or Python?

+12  A: 

A union behaves as if all its members are in the same location in memory, overlapping.

This is useful when you want to represent some kind of "generic" value, or a value that can be of any one of a set of types. Because the fields overlap, you can only legally read a field if you know it was the one most recently written. Still, since C doesn't check this, and many compilers emit code that allows it, it is a common trick to do ... interesting type conversions, such as:

union {
  int integer;
  float real;
} convert;

convert.real = 3.14f;
/* Relies on int and float having the same size; %x expects an unsigned int. */
printf("The float %f interpreted as an integer is %08x\n", convert.real, (unsigned int)convert.integer);

For the more well-formed usage, where you keep track of what was last stored in the union, it could look for instance like this:

typedef enum { INTEGER = 0, REAL, BOOLEAN, STRING } ValueType;

typedef struct {
  ValueType type;
  union {
    int integer;
    float real;
    char boolean;
    char *string;
  } x;
} Value;

Here, note that the union is actually a field in the surrounding structure, Value. Access could look like this:

void value_set_integer(Value *value, int x)
{
  value->type = INTEGER;
  value->x.integer = x;
}

This records that the current contents of the union is an integer, and stores the given value. A function to e.g. print a Value can inspect the type member, and do the right thing:

void value_print(const Value *value)
{
  switch(value->type)
  {
  case INTEGER:
    printf("%d\n", value->x.integer);
    break;
  case REAL:
    printf("%g\n", value->x.real);
    break;
  /* ... and so on ... */
  }
}

There is no equivalent in Java. C++, being very nearly a superset of C, has the same functionality. It even "one-ups" C's original implementation by allowing anonymous unions (C itself gained them later, in C11). In C++, the above could have been written without naming the inner union (x), which would make the code somewhat shorter.
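For illustration, here is a rough sketch of what the anonymous-union variant might look like. It assumes a compiler that accepts C11 (or C++); AnonValue and the two helper functions are names made up for this example, not part of the code above.

```c
typedef enum { INTEGER = 0, REAL, BOOLEAN, STRING } ValueType;

/* Same idea as Value, but the inner union has no name (needs C11 or C++). */
typedef struct {
  ValueType type;
  union {
    int integer;
    float real;
    char boolean;
    char *string;
  };                      /* anonymous: members are reached directly */
} AnonValue;

/* Hypothetical setter: records the type tag, then stores the value. */
void anon_value_set_real(AnonValue *value, float x)
{
  value->type = REAL;
  value->real = x;        /* instead of value->x.real */
}

/* Hypothetical getter: returns 1 and fills *out only if a REAL is stored. */
int anon_value_get_real(const AnonValue *value, float *out)
{
  if (value->type != REAL)
    return 0;
  *out = value->real;
  return 1;
}
```

The only difference from the named version is that every access drops the ".x" step; the type tag is still what makes reading the union safe.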

unwind
Does the overlapping cause any problems?
SjB
The problem with the overlapping is that you don't know what was written to it last. They're dangerous to use in all but a few isolated instances, but they are very handy then.
San Jacinto
+1 for the verbiage "overlapping" which is probably the most elegant way to say what a union is.
Doug T.
+10  A: 

So in your example, when I allocate time:

int main(void)
{
    union time t;   /* in C the "union" keyword is required here */
}

The compiler can interpret the memory at &t as if it is either a long:

t.simpleDate;

or as if it's a double:

t.perciseDate;

So if the raw hex of the memory at t looks like

0x12345678;

That value can be "parsed" as either a double or a long, depending on how it's accessed. So for it to be useful, you have to know exactly how a long and a double are packed and formatted in memory. For example, a long is typically a two's-complement signed integer, and a double is typically stored in IEEE 754 double-precision format.

A struct, however, just groups separate variables, each with its own distinct address, into one block of memory.

(Note that your example might be surprising: sizeof(long) could be 4 bytes (32 bits), whereas sizeof(double) is almost always 8 bytes (64 bits), so the two members need not cover the same bytes.)
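If you want to check this on your own platform, something like the sketch below should do. It reuses the union from the question; the helper function names are invented for this example, and the numbers it reports depend entirely on your compiler and ABI.

```c
#include <stddef.h>
#include <stdio.h>

union time
{
  long simpleDate;
  double perciseDate;
};

/* Size in bytes of the larger of the two members. */
size_t largest_member_size(void)
{
  return sizeof(long) > sizeof(double) ? sizeof(long) : sizeof(double);
}

/* Non-zero when long and double have different sizes on this platform. */
int members_differ_in_size(void)
{
  return sizeof(long) != sizeof(double);
}

/* Prints the sizes so they can be compared by eye. */
void print_sizes(void)
{
  printf("long: %zu bytes, double: %zu bytes, union time: %zu bytes\n",
         sizeof(long), sizeof(double), sizeof(union time));
}
```

Calling print_sizes() from main shows whether writing simpleDate touches all of the bytes that perciseDate occupies; where long is 4 bytes it only touches half of them.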

Unions are commonly used when you want a "raw" representation (like a char array) and a "message" representation. For example a message that is to be sent over a socket:

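struct Msg
{
   int msgType;
   double Val1;
   double Val2;
}; /* 20 bytes only if packed on a 32-bit boundary; often padded to 24 */

union MsgBuffer
{
   struct Msg msg;
   unsigned char msgAsBinary[sizeof(struct Msg)]; /* matches the struct whatever the padding */
};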
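As a rough sketch of how such a union might be used, the two helpers below copy a message out to a byte buffer and back. MsgView, msg_to_bytes and bytes_to_msg are names invented for this example, and the byte layout (padding, endianness) is whatever your compiler produces, so this only round-trips between programs built for the same ABI.

```c
#include <stddef.h>

struct Msg
{
  int msgType;
  double Val1;
  double Val2;
};

/* Hypothetical view type: the same storage seen as a message or as raw bytes. */
union MsgView
{
  struct Msg msg;
  unsigned char raw[sizeof(struct Msg)];
};

/* Copies the bytes of *msg into out, which must hold sizeof(struct Msg) bytes.
   Returns the number of bytes written. */
size_t msg_to_bytes(const struct Msg *msg, unsigned char *out)
{
  union MsgView view;
  size_t i;

  view.msg = *msg;
  for (i = 0; i < sizeof view.raw; i++)
    out[i] = view.raw[i];
  return sizeof view.raw;
}

/* Rebuilds a message from sizeof(struct Msg) bytes produced by msg_to_bytes. */
struct Msg bytes_to_msg(const unsigned char *in)
{
  union MsgView view;
  size_t i;

  for (i = 0; i < sizeof view.raw; i++)
    view.raw[i] = in[i];
  return view.msg;
}
```

The buffer filled by msg_to_bytes is what you would hand to send(); the receiving side would pass what it read to bytes_to_msg.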

Hope that helps.

Doug T.
This is a much better answer than the other one currently posted.
San Jacinto
Steve Jessop
+1  A: 

A union can be used to store any one of its members, but (unlike a struct) no more than one at the same time. You can think of it as containing enough space to store the largest of its members, and re-using the same storage for whichever member you actually assign a value to.

C++ also has unions. Java doesn't. Object members in Python work completely differently from C: they're stored in a dictionary rather than laid out consecutively in memory. I don't know whether Python has some handy library class somewhere that acts a bit like a union, but it's not fundamental to the object like it is in C.

Steve Jessop
+1  A: 

With a structure, each data item has its own memory location. With a union, only one item is used at a time, and all the items share a single memory location. The size of a union will be at least the size of its biggest member.
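A small sketch that makes the difference visible, assuming nothing beyond standard C. PairStruct, PairUnion and the two functions are names made up for this example.

```c
#include <stddef.h>

struct PairStruct { int i; double d; };  /* i and d each get their own storage */
union  PairUnion  { int i; double d; };  /* i and d share the same storage */

/* Returns 1 if both members of the union start at the same address
   (and at the address of the union itself). */
int union_members_share_address(void)
{
  union PairUnion u;
  return (void *)&u.i == (void *)&u.d && (void *)&u == (void *)&u.i;
}

/* Returns 1 if both members of the struct start at the same address,
   which should never be the case. */
int struct_members_share_address(void)
{
  struct PairStruct s;
  return (void *)&s.i == (void *)&s.d;
}
```

On a typical platform sizeof(struct PairStruct) comes out larger than sizeof(union PairUnion), because the struct has to hold both members plus any padding while the union only has to hold the larger one.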

This can be beneficial because sometimes we do not need all the (related) data items of a complex data structure at once, and only store or access one of them at a time. A union helps in such scenarios.

MarkPowell
+2  A: 

A union lets you interpret one memory location (raw, binary value) in several different ways.

An example I've actually used is accessing the individual bytes of a uint32_t.

union {
  uint32_t value;          /* "int" is a keyword and cannot be a member name */
  unsigned char bytes[4];
} uint_bytes;

What a union offers is multiple ways of accessing (parts of) the same memory.

The size of a union type is at least the size of its largest member (the compiler may add padding for alignment).
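Here is a sketch of the byte-access idea in a form you can compile. It assumes <stdint.h> is available (C99 or later); U32Bytes and both functions are names invented for this example. Which byte ends up at which index depends on the endianness of the machine.

```c
#include <stdint.h>

/* Hypothetical union: one 32-bit value, or the four bytes that make it up. */
union U32Bytes
{
  uint32_t value;
  unsigned char bytes[4];
};

/* Returns 1 if the least significant byte is stored first (little-endian). */
int is_little_endian(void)
{
  union U32Bytes u;
  u.value = 0x01020304u;
  return u.bytes[0] == 0x04;
}

/* Returns the byte at position index (0..3) in memory order. */
unsigned char byte_in_memory(uint32_t value, int index)
{
  union U32Bytes u;
  u.value = value;
  return u.bytes[index];
}
```

For code that must behave the same on every machine, shifting and masking (value >> 8 & 0xff and so on) is independent of byte order, whereas the union shows the bytes exactly as they sit in memory.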

gnud
+2  A: 

A union is a space-saving way of storing "one of" several different types. It does not provide a mechanism for rediscovering the type that was stored in it; this must be determined out-of-line. Technically, accessing the "wrong" type (i.e. one that was not initialized) in a union results in undefined behaviour; in practice, it usually results in a bit-level cast, and is often used as a way of doing just that.
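For comparison, here is a sketch of the bit-level cast done both ways: through a union, and through memcpy, which is well defined in both C and C++. It assumes float is 32 bits wide and uses the IEEE 754 format (true on most current platforms, but an assumption nonetheless); the function names are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Bit pattern of f, obtained by copying its bytes.
   Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_bits(float f)
{
  uint32_t bits;
  memcpy(&bits, &f, sizeof bits);
  return bits;
}

/* The same result via a union: write one member, read the other.
   Same size assumption as above. */
uint32_t float_bits_union(float f)
{
  union { float f; uint32_t u; } pun;
  pun.f = f;
  return pun.u;
}
```

Compilers generally turn the memcpy into a plain register move, so the well-defined version costs nothing extra.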

While the "union" type is in C++ (C++ being a superset of C), most C++ types cannot safely be stored in one (specifically, a union can hold only POD types, i.e. classes with default copy constructor, default destructor and no virtual methods). If you want a space-saving, stack-based equivalent to a union in C++, capable of storing complex objects, try Boost.Variant.

In other languages less concerned with stack allocation, polymorphism does the job of a union. In Java, every class inherits from Object, so an Object reference can be used to refer to any object; or you can use a common superclass or an interface to restrict the set of objects to ones supporting a particular set of operations.

In Python, any variable can hold any object, so in some sense all variables are unions. You generally should not need to determine the type stored in a variable; instead, use duck typing -- that is, look for the methods it supports rather than the type/interface it implements.

chrispy
Examples on how to do bit-level casts in Java and Python (struct module) would make it the most complete answer.
Denis Otkidach
Since a union strictly shouldn't be used for bit-level casts, I'd say I would no longer be answering the question ;)
chrispy
+1  A: 

Only one member of a union can be used at a time, unlike a struct, where all the members reside in memory together. For a union, the space allocated is the size of the largest member it contains.

pushkarpriyadarshi