views:

629

answers:

6
+3  Q: 

Unions in C

The idea behind this question is to understand the deeper concepts of unions and how to use them in different ways to save memory. My question to all is:

Let's say there is a structure:
struct strt
{
   float f;
   char c;
   int a;
};

and the same members declared as a union:
union unin
{
   float f;
   char c;
   int a;
};

Now my question is:

If I assign values to the structure members one after another and then print them, they are printed correctly.

But with the union this doesn't happen. Some overwriting is being done.

So I need a method that can store the values of f, c and a in the union and still print all of them afterwards (applying any operations necessary). Can anybody guide me or suggest an idea?

+26  A: 

I think you misunderstand the purpose of a union.

A union, as the name suggests, defines a structure where all of its members occupy the same memory space, whereas a struct places each of its members in separate memory within a single, contiguous area.

With your union, when you write:

union unin foo;
foo.c = 3;

Then foo.a and foo.f will both be changed. This is because .a, .c, and .f are stored at the same memory location. Thus, each member of a union is a different "view" of the same memory. This does not happen with a struct because all of the members are distinct and separate from each other.

There is no way around this behavior because it's intentional.

greyfade
Yes, I am clear on the idea behind this logic, but I want to use a union in place of the structure so that less memory is used and I can still get all the values (by using some operations or so). The idea behind using a union is to save memory.
AGeek
The idea behind using union is to save memory and get result equivalent to structure by doing some or any operation..
AGeek
No, the idea behind a union is to make the same memory accessible as different types. Store a float and read its bit pattern in an int, for example. You cannot simply "reduce memory usage" with it.
greyfade
The idea (in the language) is storing data in the same memory address, not data compression. Each write on an element will mutate all your members and you won't be able to recover them.
David Rodríguez - dribeas
I suppose the structure that I declared above uses 9 bytes in gcc and the union uses only 4 bytes (the size of its largest member), so memory is automatically reduced. But is it not possible to use a union in place of the structure and still get all the values printed one at a time using some operations?
AGeek
No, what you want can't be done.
greyfade
Another way of explaining: a union gives multiple 'meanings' to the same bit pattern and location.
Vardhan Varma
You don't 'reduce' or 'save' memory, because with the struct you occupy 9 bytes, but for 3 different values/variables, with the union you occupy just 4 bytes but for only one value/variable. It's like saying that by buying one Big Mac you save more money than by buying 3 Big Macs.
Petruza
+5  A: 

A union contains a set of mutually exclusive data.

In your particular example, you can store the float (f), char (c) or int (a) in the union. However, memory will only be allocated for the largest item in the union. All items in the union will share the same portion of memory. In other words, writing one value into the union followed by another will cause the first value to be overwritten.

You need to go back and ask yourself what you are modelling:

  • Do you truly want the values of f, c and a to be mutually exclusive (i.e. only one value can exist at once)? If so, consider using a union in conjunction with an enum value (stored outside the union) indicating which member in the union is the "active" one at any particular point in time. This will allow you to get the memory benefits of using a union, at the cost of more dangerous code (as anyone maintaining it will need to be aware that the values are mutually exclusive - i.e. it is indeed a union). Only consider this option if you are creating many of these unions and memory conservation is vital (e.g. on embedded CPUs). You may NOT even end up saving memory because you will need to create enum variables on the stack which will take up memory too.

  • Do you want these values to be simultaneously active and not interfere with each other? If so, you will need to use a struct instead (as you put in your first example). This will use more memory - when you instantiate a struct, the memory that is allocated is the sum of all members (plus some padding to the nearest word boundary). Unless memory conservation is of paramount importance (see previous example), I would favour this approach.

Edit:

(Very simple) example of how to use enums in conjunction with a union:

typedef union
{
    float f;
    char c;
    int a;
} floatCharIntUnion;

typedef enum
{
    usingFloat,
    usingChar,
    usingInt
} unionSelection;

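/* Forward declaration so main() can call the function defined below. */
void processUnion(floatCharIntUnion* myUnion, unionSelection selection);

int main()
{
    floatCharIntUnion myUnion;
    unionSelection selection;

    myUnion.f = 3.1415f;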
    selection = usingFloat;
    processUnion(&myUnion, selection);

    myUnion.c = 'a';
    selection = usingChar;
    processUnion(&myUnion, selection);

    myUnion.a = 22;
    selection = usingInt;
    processUnion(&myUnion, selection);
}

void processUnion(floatCharIntUnion* myUnion, unionSelection selection)
{

    switch (selection)
    {
    case usingFloat:
        // Process myUnion->f
        break;
    case usingChar:
        // Process myUnion->c
        break;
    case usingInt:
        // Process myUnion->a
        break;
    }
}
LeopardSkinPillBoxHat
A boolean value outside the union would only work for a union with two variables in it...
DeadHead
Yep, I realised that when editing it. Changed it to an enum.
LeopardSkinPillBoxHat
How do I use an enum? Can you give me an example?
AGeek
I added some example code of using enums.
LeopardSkinPillBoxHat
Thanks a lot, but it produced some errors when I compiled it in Turbo C.
AGeek
If the code above is executed, only the last case "usingInt" gets executed always. Putting the switch case block in some function and calling this function, after every assignment to union members, would be better for explanation purpose. IMO :)
xk0der
@xk0der - yeah it was a very basic example to get the idea across. Anyhow, I updated the code as per your suggestions.
LeopardSkinPillBoxHat
@Young - The code I put in was untested. What errors were you seeing?
LeopardSkinPillBoxHat
A: 

Unions are usually used when only one of the members would be stored in an instance at any given point in time, i.e. you can store either a float, a char or an int at any instant. This saves memory by not allocating extra, distinct memory for a float and an int when you are only going to store a char. The amount of memory allocated is at least the size of the largest type in the union.

union unin
{
   float f;
   char c;
   int a;
};

The other use of a union is when you want to store something that has parts. Let's say you want to model a register as a union containing the upper byte, the lower byte and a composite value. You can then store a composite value into the union and read the individual pieces through the other members.

Gishu
+23  A: 

If you were to look at how a struct stores its values, it would be something like this:

|0---1---2---3---|4---|5---6---7---8---|
|ffffffffffffffff|    |                | <- f: Where your float is stored
|                |cccc|                | <- c: Where your char is stored
|                |    |aaaaaaaaaaaaaaaa| <- a: Where your int is stored

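So when you change the value of f, you are actually changing bytes 0-3. When you change your char, you are actually changing byte 4. When you change your int, you are actually changing bytes 5-8. (This picture ignores alignment padding. Many compilers insert padding after c so that a starts at byte 8, which makes the struct 12 bytes rather than 9. The principle is the same either way.)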

If you now look at how a union stores its values, it would be something like this:

|0---1---2---3---|
|ffffffffffffffff| <- f: where your float is stored
|cccc------------| <- c: where your char is stored
|aaaaaaaaaaaaaaaa| <- a: where your int is stored

So now, when I change the value of f, I am changing bytes 0-3. Since c is stored in byte 0, when you change f, you also change c and a! When you change c, you're changing part of f and a - and when you change a, you're changing c and f. That's where your "overwriting" is happening. When you pack the 3 values into the one memory address, you're not "saving space" at all; you're just creating 3 different ways of looking at and changing the same data. You don't really have an int, a float, and a char in that union - at the physical level, you've just got 32 bits, which could be viewed as an int, a float, or a char. Changing one is meant to change the others. If you don't want them to change each other, then use a struct.

This is why your compiler reports your struct as 9 bytes long (or more, once padding is added), while your union is only 4. It's not saving space; structs and unions are simply not the same thing.

Smashery
As to your disclaimer, the byte would have to be at 0, or else it wouldn't exist at the same address as the float and int. It doesn't matter about endianness because a little-endian and big-endian float or int take up the same amount of space but would still start at the same address as the char.
dreamlax
Good call. I've removed it now. Thanks!
Smashery
Quite Good Explanation :)
mahesh
@mahesh - Thanks :-)
Smashery
+1  A: 

This is a classic example of using a union to store data depending on an external marker.

The int, float and char * all occupy the same place in the union; they are not consecutive. So if you need to store them all at once, it's a structure you're looking for, not a union.

The structure is the size of the largest thing in the union plus the size of the type field (plus any padding), since the type is outside the union.

#include <stdio.h>

#define TYP_INT 0
#define TYP_FLT 1
#define TYP_STR 2

typedef struct {
    int type;           /* tag: which member of data is currently valid */
    union {
        int a;
        float b;
        char *c;
    } data;
} tMyType;

/* Prints whichever member the type tag says is stored. */
static void printMyType (tMyType * x) {
    if (x->type == TYP_INT) {
        printf ("%d\n", x->data.a);
        return;
    }
    if (x->type == TYP_FLT) {
        printf ("%f\n", x->data.b);
        return;
    }
    if (x->type == TYP_STR) {
        printf ("%s\n", x->data.c);
        return;
    }
}

The printMyType function will correctly detect what's stored in the structure (unless you lie to it) and print out the relevant value.

When you populate one of them, you have to do:

x.type = TYP_INT;
x.data.a = 7;

or

x.type = TYP_STR;
x.data.c = "Hello";

and a given x can only be one thing at a time.

Woe betide anyone who tries:

x.type = TYP_STR;
x.data.a = 7;

They're asking for trouble.

paxdiablo
+4  A: 

I think you are misunderstanding Unions.

The idea behind using unions is to save memory...

yes, that's one reason

... and get result equivalent to structure...

no

It's not equivalent. They look similar in source code, but they are completely different things. Like apples and airplanes.

Unions are a very, very low level construct that allows you to see a piece of memory as if it were storing any of its "members", but you can only use one at a time. Even the use of the word "member" is extremely misleading. They should be called "views" or something, not members.

When you write:

union ABCunion
{
    int a;
    double b;
    char c;
} myAbc;

You are saying: "take a piece of memory big enough for the biggest among an int, a char and a double, and let's call it myAbc".

In that memory, now you can store either an int, or a double, or a char. If you store an int, and then store a double, the int is gone forever.

What's the point then?

There are two major uses for Unions.

a) Discriminated storage

That's what we did above. I pick a piece of memory and I give it different meanings depending on context. Sometimes the context is explicit (you keep some variable that indicates what "kind" of variable you stored), and sometimes it can be implicit (based on the section of code, you can tell which one must be in use). Either way, the code needs to be able to figure it out, or you won't be able to do anything sensible with the variable.

A typical (explicit) example would be:

struct MyVariantType
{
    int typeIndicator ;  // typeIndicator=1 -> It's an int, 
                         // typeIndicator=2 -> It's a  double, 
                         // typeIndicator=3 -> It's a  char
    union ABCunion body; // in C the "union" keyword is required here
};

For example, VB6's "Variants" are Unions not unlike the above (but more complex).

b) Split representation

This is sometimes useful when you need to be able to see a variable as either a "whole" or as a combination of parts. It's easier to explain with an example:

union DOUBLEBYTE
{
    struct
    {
        unsigned char a;
        unsigned char b;
    } bytes;
    short Integer;        
} myVar;

Here's a short int "unioned" with a pair of bytes. Now, you can view the same value as either a short int (myVar.Integer), or you can just as easily study the individual bytes that make up the value (myVar.bytes.a and myVar.bytes.b).

Note that this second use is not portable (I'm pretty sure); meaning that it's not guaranteed to work across different machine architectures; but this use is absolutely essential for the kind of tasks for which C was designed (OS implementation).

Euro Micelli
Apples and airplanes don't look *anything* like each other :-)
paxdiablo
No, but the words kind of do, they both have A..pl..es, so the analogy is quite good really.
Eclipse