The following code snippet:

void (*foo)();
char X[1];
char Y[10];

could intuitively give me one possible stack layout of:

|  Y[10]  |
|---------|
|  X[1]   |
|---------|
|  foo    |
|---------|

I examined this by generating the ASM file using:

gcc -S -o stack stack.c

Then I observed that the order in which these variables are placed is different. So if I accidentally wrote to X[1], I expected to hit Y[0], but in the actual layout, writing to X[1] overwrites the first byte of the memory allocated to foo. Is the reordering a compiler optimization step, or can someone tell me why this is happening?
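
One way to check the layout without reading the assembly is to print the addresses at run time. This is only a minimal sketch: `print_local_addresses` is a name invented here, and the ordering it reveals depends on the compiler, its flags, and the target.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original post): prints where each
   local ended up. Returns 1 if the three objects have distinct addresses. */
int print_local_addresses(void)
{
    void (*foo)(void) = NULL;
    char X[1] = {0};
    char Y[10] = {0};

    printf("foo at %p\n", (void *)&foo);
    printf("X   at %p\n", (void *)X);
    printf("Y   at %p\n", (void *)Y);

    return (void *)&foo != (void *)X
        && (void *)X != (void *)Y
        && (void *)&foo != (void *)Y;
}
```

Comparing the three printed addresses shows which variable sits just above X in a given build.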

+1  A: 

Likely speculation: the compiler will attempt to place the char arrays next to each other to minimize the total amount of padding that's inserted.

Generally, the CPU is happiest retrieving multi-byte data on some "whole" alignment, which almost always corresponds to the bit-width of the machine. So a 32-bit int will be aligned to a 32-bit boundary. To make that happen, the compiler will "pad" the stack with bytes that are never accessed.

However, there's no benefit to such alignment when you're retrieving a byte at a time.
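
As a rough illustration of the padding idea, the sketch below computes how many filler bytes are needed before an object with a given alignment. The function names are assumptions made for this example, and the alignment of a function pointer is whatever the implementation reports.

```c
#include <stddef.h>

/* Bytes of padding needed so that `offset` becomes a multiple of `align`.
   Hypothetical helper for illustration; `align` must be nonzero. */
size_t padding_for(size_t offset, size_t align)
{
    return (align - offset % align) % align;
}

/* Alignment the implementation requires for the question's types (C11). */
size_t char_array_align(void) { return _Alignof(char[10]); }
size_t func_ptr_align(void)   { return _Alignof(void (*)(void)); }
```

With the two char arrays occupying 11 bytes, `padding_for(11, 4)` gives 1 for a 4-byte-aligned object placed after them, while `padding_for(11, 1)` gives 0 because a char needs no alignment.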

Anon
+1  A: 
|   var   | address |
|---------|---------|
|  Y[10]  |  x      |
|---------|---------|
|  X[1]   |  x + 10 |
|---------|---------|
|  foo    |  x + 11 |
|---------|---------|

The stack grows toward lower addresses, so moving to the next element of an array means moving to a higher address. So X[1] = *(x + 10 + 1), which is the first byte of foo.
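
The arithmetic in the table can be checked without touching real out-of-bounds memory by treating the offsets as plain numbers. The offsets below are copied from the table; the enum and function names are assumptions made for this sketch.

```c
#include <stddef.h>

/* Offsets taken from the table above, relative to the base address x. */
enum { Y_OFFSET = 0, X_OFFSET = 10, FOO_OFFSET = 11 };

/* Offset that X[i] would refer to in this modelled layout. */
size_t x_element_offset(size_t i)
{
    return X_OFFSET + i;
}
```

Here `x_element_offset(1)` equals `FOO_OFFSET`, which matches the overwrite described in the question.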

Andrey
The stack on *some* (most of the popular ones just now) architectures grows towards lower addresses.
dmckee
@dmckee Yes. This answer is valid in that case, which is also the case the question is about.
Andrey
+4  A: 

Why do you say "should"?

Of course, your suggested stack layout would be the result of one particular--very obvious--way of implementing automatic variables, but there is nothing that requires it.

Thus, no "should".


To force the order of some items in memory so that you can play (undefined behavior, totally unsafe and unportable!) games with overwriting, use a struct and your compiler's padding #pragmas.
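
A minimal sketch of the struct approach, assuming the member order from the question: `struct frame` and the three accessor functions are hypothetical names. It only reports offsets with `offsetof`; it does not use any packing #pragma, since those are compiler-specific.

```c
#include <stddef.h>

/* Members of a struct are laid out in declaration order, at increasing
   addresses; the compiler may still insert padding between them. */
struct frame {
    void (*foo)(void);
    char X[1];
    char Y[10];
};

size_t foo_offset(void) { return offsetof(struct frame, foo); }
size_t x_offset(void)   { return offsetof(struct frame, X); }
size_t y_offset(void)   { return offsetof(struct frame, Y); }
```

The first member always sits at offset 0 and later members at higher offsets, so the relative order is fixed even though the exact gaps are not. Indexing past the end of X is still undefined behavior.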

dmckee
This is the de facto standard. Automatic variables are stored on the stack.
Andrey
@dmckee: Changed my sentence. Thanks for pointing it out.
Legend
@Andrey: Absolutely. But there is no requirement that room be made for them as they are encountered in the code. The compiler could just as easily pile them up in a symbol table until it *had* to make room for them, then emit code reserving space in alphabetic order by name or any other logic that struck the author's fancy.
dmckee
+1  A: 

This is because the stack grows down on most architectures.

Nikolai N Fetissov
I always thought that DOWN means higher addresses.
Andrey
That depends on how you look at your memory :) In this case "down" means lower addresses.
Nikolai N Fetissov
+1  A: 

The stack grows down on most platforms, but why depend on it? Compiler optimizations might also align the variables to 4-byte boundaries. Why not do this?

char x[11];
char *y = &x[1];
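
A short sketch of how that single-array idea behaves. The function name is made up for this example. Because y points into x, every access stays inside one object and the result does not depend on stack layout.

```c
/* Sketch of the single-array approach: y aliases x starting at x[1]. */
int write_through_alias(void)
{
    char x[11] = {0};
    char *y = &x[1];

    y[0] = 'A';          /* same byte as x[1] */
    y[9] = 'Z';          /* same byte as x[10], the last element */

    /* x[0] is untouched; the two writes landed where expected. */
    return x[1] == 'A' && x[10] == 'Z' && x[0] == 0;
}
```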
Jason Goemaat
+2  A: 

Even with no optimization, the ordering of variables in memory is usually not something that you can count on. The ordering that they do end up with depends on how you look at them, anyway. If you saw a group of people standing in a row ordered from shortest to tallest, another person may say that they are actually ordered from tallest to shortest.

The first thing that affects the order in which these variables are in memory is just how the compiler is implemented. It has a list of things, and a list can be processed from either beginning to end or end to beginning. So the compiler reads your code, produces intermediate code, and this intermediate code has a list of local variables that need to be put on the stack. The compiler doesn't really care what order they were in the code, so it just looks at them in whatever order is most convenient.

The second thing is that many processors use an upside down stack. If you:

push A
push B

Then A has a larger address than B, even though B is at the top of the stack (and on top of A). A good way to imagine this is using a C array:

int stk[BIG];
int stk_top = BIG;

and then

 void stk_push(int x) {
     stk_top--;
     stk[stk_top] = x;
 }

As you can see, the stk_top index actually shrinks as the stack gets more items on it.
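
For anyone who wants to run the array model, here is a slightly fuller sketch. The capacity of 16, the bounds checks, and the `stk_pop` and `stk_top_index` helpers are additions made for this example, not part of the snippet above.

```c
#include <assert.h>

#define BIG 16              /* assumed capacity, chosen for the example */

static int stk[BIG];
static int stk_top = BIG;   /* empty stack: index one past the last slot */

void stk_push(int x)
{
    assert(stk_top > 0);    /* refuse to overflow */
    stk_top--;
    stk[stk_top] = x;
}

int stk_pop(void)
{
    assert(stk_top < BIG);  /* refuse to underflow */
    return stk[stk_top++];
}

/* Index of the most recently pushed item; shrinks as the stack grows. */
int stk_top_index(void) { return stk_top; }
```

After pushing A and then B, A lives at index 15 and B at index 14, so the item on top of the stack has the lower index, just as the pushed value B has the lower address on a downward-growing hardware stack.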

Now, back to optimization -- the compiler is pretty free when it comes to reordering things that aren't in structs. This means that your compiler may very well reorder the local variables on the stack, as well as add extra padding bytes in there to keep things aligned. Additionally, the compiler is also free to not even put some local variables on the stack. Just because you name a local variable doesn't mean that the compiler has to really generate it in the program. If a variable is not actually used it may be left out of the program. If a variable is used a lot it may be kept in a register. If a variable is only used for part of a program then it may only actually exist temporarily and the memory that it did use can be shared among several other temporary variables during the function.

nategoose
Compilers are actually also free to optimize structs (they might not reorder things, but they certainly pad them). They are all obsessed with aligning things. So you can easily get surprised when you do sizeof(mystruct) :)
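
A small sketch of that sizeof surprise, using a made-up `struct mystruct` with one char followed by one int. The helper name is hypothetical and the amount of padding is implementation-defined.

```c
#include <stddef.h>

/* Hypothetical struct: one char followed by one int. */
struct mystruct {
    char c;
    int  i;
};

/* Bytes in the struct that belong to no member (padding). */
size_t mystruct_padding(void)
{
    return sizeof(struct mystruct) - (sizeof(char) + sizeof(int));
}
```

On many targets with a 4-byte int this comes out as 3, making sizeof(struct mystruct) 8 rather than 5, but the exact value is up to the implementation.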
Jaka