How to get a call stack backtrace? (deeply embedded, no library support)

views:

392

answers:

+2 Q:

How to get a call stack backtrace? (deeply embedded, no library support)

+2 A:

gcc does tail-call optimization. In func1() and func2() it does not call func2()/func3(); instead it jumps to func2()/func3(), so func3() can return directly to main().

In your case, func1() and func2() do not need to set up a stack frame, but even if they did (e.g. for local variables), gcc could still do the optimization when the function call is the last thing the function does: it then cleans up the stack before the jump to func3().

Have a look at the generated assembler code to see it.

Edit/Update:

To verify that this is the reason, do something after the function call that cannot be reordered by the compiler (e.g. use the return value). Or just try compiling with -O0.
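A minimal sketch of that check, assuming gcc and reusing the func1/func2/func3 naming from the question (the `_tail`/`_call` suffixes, the `NOINLINE` macro and the `sink` variable are made up for illustration). In the `_tail` variants the call is the last thing the function does, so an optimizing build may emit a plain branch; in the `_call` variants the return value is used afterwards, so a real call is needed and the caller stays on the stack.

```c
#include <stdint.h>

#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))   /* keep each function as a real symbol */
#else
#define NOINLINE
#endif

/* Volatile sink so the compiler cannot discard the work done in each function. */
static volatile int sink;

NOINLINE int func3(int x) { sink = x; return x + 1; }

/* Call in tail position: with optimization enabled, gcc may emit a branch
 * ("b func3") instead of a call ("bl func3"), so func2_tail leaves no
 * return address behind for a backtrace to find. */
NOINLINE int func2_tail(int x) { return func3(x + 1); }
NOINLINE int func1_tail(int x) { return func2_tail(x + 1); }

/* The return value is used after the call, so the call cannot be turned
 * into a jump and the caller's return address must be saved. */
NOINLINE int func2_call(int x)
{
    int r = func3(x + 1);
    sink = r;
    return r * 2;
}

NOINLINE int func1_call(int x)
{
    int r = func2_call(x + 1);
    sink = r;
    return r + 3;
}
```

Compiling both variants and comparing the disassembly (branch versus branch-with-link at the end of each function) should show whether the optimization is happening; gcc's -fno-optimize-sibling-calls switches it off without dropping all the way to -O0.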

IanH 2010-08-03 16:40:25

Downvoted, he said he checked the assembler.

DeadMG 2010-08-03 16:41:37

He says the functions are there (not inlined), but he did not say if he has checked if the functions are called or jumped to.

IanH 2010-08-03 16:43:24

@DeadMG: The downvote is certainly harsh. Tail calls are usually optimised like this when compiling for ARM, and this optimisation would give exactly the observed results.

Mike Seymour 2010-08-03 16:51:08

The OP specifically said he checked the disassembler.

DeadMG 2010-08-03 19:10:20

@DeadMG: He said that he checked that the functions were called rather than inlined, but he may have missed the functions ending with a branch rather than a return. It's not something you'd notice unless you carefully read every instruction. Of course, your votes are yours to deal out as you see fit.

Mike Seymour 2010-08-03 20:36:46

@DeadMG: Even with a look at the disassembly, if you don't know about this optimization you can easily overlook whether there is a call or a jump. I still think this is the problem here. The other answer is interesting, but it does not explain why only func3() and main() appear in the backtrace (and not just func3() and func2()).

IanH 2010-08-03 20:40:08

To clarify: the simplified toy code in the original post could have been subject to the tail-call/jump optimization, but in the actual code there are things on both sides of the call that could not be optimized away (and I have verified that they are not). There is a push/pop at the start and end of each function, and the next function in the chain is called with a blx instruction (Thumb2).

hugov 2010-08-03 22:04:56

+3 A:

Since ARM platforms do not use a frame pointer, you never quite know how big the stack frame is and cannot simply unwind the stack beyond the single return address in R14.

When investigating a crash for which we do not have debug symbols, we simply dump the whole stack and look up the closest symbol to each item that falls in the instruction range. It does generate a load of false positives but can still be very useful for investigating crashes.
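A rough sketch of that scan, under stated assumptions: 32-bit stack words, a symbol table already sorted by ascending address, and known bounds for the code region. The names `symbol_t`, `closest_symbol` and `scan_stack` are hypothetical, not from any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol table entry; the table must be sorted by ascending addr. */
typedef struct {
    uint32_t    addr;
    const char *name;
} symbol_t;

/* Binary search for the symbol with the greatest address <= pc (NULL if none). */
const symbol_t *closest_symbol(const symbol_t *syms, size_t n, uint32_t pc)
{
    const symbol_t *best = NULL;
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (syms[mid].addr <= pc) {
            best = &syms[mid];
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return best;
}

/* Walk a dumped stack and collect every word that looks like a code address.
 * Stores at most out_cap candidates in out[] and returns how many were stored.
 * Any word that happens to fall in [text_start, text_end) is reported, which
 * is where the false positives come from. */
size_t scan_stack(const uint32_t *stack, size_t words,
                  uint32_t text_start, uint32_t text_end,
                  const symbol_t *syms, size_t nsyms,
                  const symbol_t **out, size_t out_cap)
{
    size_t hits = 0;
    for (size_t i = 0; i < words && hits < out_cap; i++) {
        uint32_t pc = stack[i] & ~1u;          /* clear the Thumb state bit */
        if (pc < text_start || pc >= text_end)
            continue;                          /* not a code address */
        const symbol_t *s = closest_symbol(syms, nsyms, pc);
        if (s != NULL)
            out[hits++] = s;
    }
    return hits;
}
```

The candidates come out in stack order, so when the dump starts at the stack pointer and runs toward higher addresses, the innermost caller appears first.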

If you are running pure ELF executables, you can separate the debug symbols out of your release executable. gdb can then help you find out what is going on from your standard Unix core dump.

doron 2010-08-03 16:57:58

+1 we've done something similar on MIPS

bstpierre 2010-08-03 17:20:30

You could reduce the false positives by using the disassembled executable to manually reconstruct the stack frames; look at the first few instructions of each function to count the stacked registers, and any further adjustments to the stack pointer.
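As a sketch of what that counting could look like if automated, assuming the prologue uses only the 16-bit Thumb encodings of `push {reglist[, lr]}` and `sub sp, #imm` (Thumb2 code may use 32-bit encodings, which this does not handle). The names `frame_info_t` and `decode_prologue` are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Result of decoding a function prologue (hypothetical type). */
typedef struct {
    uint32_t frame_bytes;  /* total bytes the prologue moves SP down by       */
    uint32_t lr_offset;    /* offset of the saved LR from SP after prologue   */
    int      saves_lr;     /* non-zero if the prologue pushed LR              */
} frame_info_t;

static unsigned count_bits(unsigned v)
{
    unsigned n = 0;
    while (v) {
        n += v & 1u;
        v >>= 1;
    }
    return n;
}

/* Decode up to max_insns halfwords from the start of a function, stopping at
 * the first instruction that is not a recognised stack adjustment. */
frame_info_t decode_prologue(const uint16_t *code, size_t max_insns)
{
    frame_info_t fi = {0, 0, 0};
    uint32_t lr_depth = 0;  /* distance of saved LR below the entry SP */

    for (size_t i = 0; i < max_insns; i++) {
        uint16_t insn = code[i];
        if ((insn & 0xFE00u) == 0xB400u) {
            /* push {r0-r7 subset[, lr]}: bit 8 selects LR, bits 7:0 the low regs */
            uint32_t before = fi.frame_bytes;
            fi.frame_bytes += 4u * count_bits(insn & 0xFFu);
            if (insn & 0x0100u) {
                fi.frame_bytes += 4u;
                if (!fi.saves_lr) {
                    fi.saves_lr = 1;
                    lr_depth = before + 4u;  /* LR is stored at the highest address */
                }
            }
        } else if ((insn & 0xFF80u) == 0xB080u) {
            /* sub sp, #imm7*4 */
            fi.frame_bytes += 4u * (insn & 0x7Fu);
        } else {
            break;  /* anything else ends the part of the prologue we understand */
        }
    }
    if (fi.saves_lr)
        fi.lr_offset = fi.frame_bytes - lr_depth;
    return fi;
}
```

With the frame size and LR offset for the function containing the current PC, a stack scan can jump straight to the saved return address instead of treating every word as a candidate, then repeat the step for the caller.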

Mike Seymour 2010-08-03 17:35:52

Nitpick: some ARM platforms do use a frame pointer (usually `r11`). But that's not important here, since the questioner states that his platform doesn't.

Mike Seymour 2010-08-03 17:37:41

Mike: yes, I could do that (myself)... but surely there is some code or library I can leverage that already does it?! Surely, in the context of exceptions, every possible stack frame has to contain the necessary metadata (at a minimum, the size) to unwind up the stack. Thus, given that exception handling works, why can't gcc's own unwinder do this for me?

hugov 2010-08-03 22:08:09

@hugov: exception handling needs to know which objects to destroy, where to jump to, and what state to restore the stack to. It doesn't need to know the complete call stack, so I wouldn't expect to be able to reconstruct a complete stack trace unless the compiler specifically chooses to support this. From your experience, I'm guessing it doesn't, but I could be wrong.

Mike Seymour 2010-08-04 11:50:10

@Mike Seymour - Technically, ARM assembler does not even have the concept of a stack built into it. The closest we come is the LDM and STM instructions, so you are free to implement a stack any way you like. The ARM Procedure Call Standard, which is used for most standard ARM ABIs, does not support a frame pointer, but nothing other than compatibility stops you from using one.

doron 2010-08-04 14:43:17

@deus: Indeed, although Thumb has `push` and `pop` instructions which assume a full-descending stack with `r13` as the stack pointer, so the concept of a stack has slipped into assembly there. The current ABI doesn't have a concept of a frame pointer, but older ones had variants that did, to allow unwinding in the days when debugging information couldn't be relied on for that.

Mike Seymour 2010-08-04 15:04:26

@Mike, see updated OP. Very curious!

hugov 2010-08-04 20:24:54

Does your executable contain debugging information, from compiling with the -g option? I think this is required to get a full stack trace without a frame pointer.

You might need -gdwarf-2 to make sure it uses a format that includes unwind information.

Mike Seymour 2010-08-04 11:54:42

Possible, although I'm pretty sure (like 99.9%) that the DWARF info doesn't actually make it into the binary image programmed into flash. How would I check?

hugov 2010-08-04 20:22:35
