ansaurus

Question

Answer 1

+1 A:

Is this a debug or release build? I'd expect some padding in debug builds for detecting stack overflows.

AShelly 2009-08-03 19:29:26

I'm not sure what you mean, I didn't use a debug flag during compilation and I just used gcc via command line. Compiled with `gcc stack.c -o stack`.

Carson Myers 2009-08-03 19:33:02

GCC may still be including its stack-smashing protection. Try compiling with the -fno-stack-protector switch and see if you still get the weird values.

Tyler McHenry 2009-08-03 19:35:12

the padding is still there. I suspect it has to do with stack alignment, but I didn't think it did THAT much alignment.

Carson Myers 2009-08-03 19:48:51

actually, it looks like it wants to put the parameters starting at a 0x......0 address, and the padding does just that

Carson Myers 2009-08-03 19:59:10

Answer 2

+6 A:

Inspecting the stack like this seems like one step too far away. I might suggest loading your program in a debugger, switching to the assembly language view, and single stepping through every machine instruction. Understanding of the CPU stack necessarily requires an understanding of the machine instructions operating on it, and this will be a more direct way to see what's going on.

As others mentioned, the structure of the stack is also highly dependent on the processor architecture you're working with.

Greg Hewgill 2009-08-03 19:30:20

well I do (mostly) understand assembly and how the stack is set up--I was just wondering what the extra data might be *for*, since examining the assembly would mostly just let me look at the instructions that put it there. And I'm working on x86 on my laptop which is usually the case for PC's (as far as I know)

Carson Myers 2009-08-03 19:35:46

If you can identify the instructions that put the data there, then we will have a much better chance of being able to tell you what they're for. Sometimes the data on the stack is just up to the whim of the compiler, and the data won't make sense without the context of the code itself.

Greg Hewgill 2009-08-03 19:38:25

well for those three values on top of 0xBBBB1111 I can find no reference to them in the assembly. The 0x0 values it seems are for stack alignment, and the rest (aside from a few code segment references) I seem to have figured out... But still there are a few values I'm not sure about

Carson Myers 2009-08-03 19:57:02

Show us the code!

Greg Hewgill 2009-08-03 20:01:25

I added the assembly, but it's pretty straight-forward

Carson Myers 2009-08-03 20:11:05

Thanks. It's hard to say for sure, but `subl $24, %esp` is where most of that space above 0xBBBB1111 is going (it's uninitialised, so it just happens to contain whatever was there before). The compiler doesn't generate any code to put anything there (yet), so it may be a scratch area for exception handling or something. Each of your functions seems to allocate about the same minimum amount of space, which means it may be something common to all function definitions.

Greg Hewgill 2009-08-03 20:24:16

Looks like laalto has it. I didn't know gcc had a default alignment option! Learn something new every day.

Greg Hewgill 2009-08-03 20:34:52

Answer 3

+5 A:

Most likely those are stack canaries. Your compiler adds code to push additional data to the stack and read it back afterwards to detect stack overflows.

rpetrich 2009-08-03 19:32:56

I thought that's what they might be, but I thought those were for security -- also, when I was trying to over-write the return address for a function earlier today, I had to play with the numbers and lengths of strings to get it right and probably wrote over those many times, and all that happened was that it would have a segfault, work, or just return normally

Carson Myers 2009-08-03 19:37:17

Answer 4

+5 A:

I'm guessing those values starting with 0x0804 are addresses in your program's code segment (like return addresses for function calls). The ones starting with 0xBF814 that you've labeled as return addresses are addresses on the stack -- data, not code. I'm guessing they're probably frame pointers.

Nick Meyer 2009-08-03 19:35:26

you're right, I printed the addresses of the functions and they began with 0x0804. The 0xBF814 are probably frame pointers as you said (which might explain why they all point to each other?)

Carson Myers 2009-08-03 19:40:12

Answer 5

+2 A:

The 0xBF... addresses will be links to the previous stack frame:

0xBF8144D8 : BF8144F8 //return address for trace
0xBF8144DC : 0804845A //

0xBF8144F8 : BF814518 //return address for func3
0xBF8144FC : 08048431 //????

0xBF814518 : BF814538 //return address for func2?
0xBF81451C : 0804840F //????

0xBF814538 : BF814558 //return address for func1
0xBF81453C : 080483E8 //????

The 0x08... addresses will be the addresses of the code to return to in each case.

I can't speak for the other stuff on the stack; you would have to step through the assembly language and see exactly what it is doing. I guess that it is aligning the start of each frame to a specific alignment so that `__attribute__((aligned))` (or whatever it's called these days...) works.

brone 2009-08-03 19:41:36

Answer 6

+1 A:

The compiler uses EBP to store the frame's base address. It's been a while since I looked at this, so I may get the details a bit wrong, but the idea is like this.

You have three steps when calling a function:

The caller pushes the function's parameters onto the stack.
The caller uses the call instruction, which pushes the return address onto the stack, and jumps to the new function.
The called function pushes EBP onto the stack, and copies ESP into EBP:
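(Note: well-behaved functions will also save any callee-saved registers they modify, such as EBX, ESI and EDI, usually by pushing just those; PUSHAD would save all the GPRs, but compilers rarely emit it.)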

push EBP
mov EBP, ESP

When the function returns it:

pops EBP
executes the ret instruction, which pops off the return address and jumps there.

pop EBP
ret

The question is, why is EBP pushed, and why does ESP get copied into it?

When you enter the function ESP points to the lowest point on the stack for this function. Any variables you declare on the stack can be accessed as [ESP + offset_to_variable]. This is easy! But note that ESP must always point to the top of the stack, so when you declare a new variable on the stack, ESP changes. Now [ESP + offset_to_variable] isn't so great, because you have to remember what ESP was at the time the variable was allocated.

Instead of doing that, the first thing the function needs to do is to copy ESP into EBP. EBP won't change during the life of the function, so you can access all variables using `[EBP + offset_to_variable]`. But now you have another problem, because if the called function calls another function, EBP will be overwritten. That's why the old EBP needs to be saved onto the stack before ESP is copied into it, so that it can be restored before returning to the calling function.

Nathan Fellman 2009-08-03 19:47:58

Thanks for the description

Carson Myers 2009-08-03 20:13:30

Answer 7

+3 A:

As already pointed out, the 0xBF... are frame pointers and 0x08... return addresses.

The padding is due to alignment issues. Other unrecognized values are also padding, as the stack is not initialized to zero or any other value. Uninitialized variables and unused padding space will contain whatever bytes are in those memory locations.

laalto 2009-08-03 19:53:35

ahh, that helps, I figured the padding would be set to zero

Carson Myers 2009-08-03 19:58:09


why is the call stack set up like this?