I have a C++ tool that walks the call stack at one point. The code first captures the live CPU registers (via RtlCaptureContext()), then uses a few "#ifdef ..." blocks to copy the CPU-specific register values into stackframe.AddrPC.Offset, ...AddrStack..., and ...AddrFrame...; for each of those three Addr... members, it also sets stackframe.Addr....Mode = AddrModeFlat. (This was borrowed from some example code I came across a while back.)

With an x86 binary, this works great. With an x64 binary, though, StackWalk64() passes back bogus addresses. (The first time the API is called, the only blatantly bogus address value appears in AddrReturn (== 0xFFFFFFFF'FFFFFFFE, i.e. the value of StackWalk64()'s 3rd arg, the pseudo-handle returned by GetCurrentThread()). If the API is called a second time, however, all Addr... variables receive bogus addresses.) This happens regardless of how AddrFrame is set:

  • using either of the recommended x64 "base/frame pointer" CPU registers: rbp (= 0xf), or rdi (= 0x0)
  • using rsp (didn't expect it to work, but tried it anyway)
  • setting AddrPC and AddrStack normally, but leaving AddrFrame zeroed out (seen in other example code)
  • zeroing out all Addr... values, to let StackWalk64() fill them in from the passed-in CPU-register context (seen in other example code)

FWIW, the physical stack buffer's contents are also different on x64 vs. x86 (after accounting for different pointer widths & stack buffer locations, of course). Regardless of the reason, StackWalk64() should still be able to walk the call stack correctly -- heck, the debugger is still able to walk the call stack, and it appears to use StackWalk64() itself behind the scenes. The oddity there is that the (correct) call stack reported by the debugger contains base-address & return-address pointer values whose constituent bytes don't actually exist in the stack buffer (below or above the current stack pointer).

(FWIW #2: Given the stack-buffer strangeness above, I did try disabling ASLR (/dynamicbase:no) to see if it made a difference, but the binary still exhibited the same behavior.)

So. Any ideas why this would work fine on x86, but have problems on x64? Any suggestions on how to fix it?

+2  A: 

Given that fs.sf is a STACKFRAME64 structure and c is a CONTEXT structure, you need to initialize fs.sf like this before passing it to StackWalk64:

  DWORD machine = IMAGE_FILE_MACHINE_AMD64;
  RtlCaptureContext (&c);
  fs.sf.AddrPC.Offset = c.Rip;
  fs.sf.AddrFrame.Offset = c.Rsp;
  fs.sf.AddrStack.Offset = c.Rsp;
  fs.sf.AddrPC.Mode = AddrModeFlat;
  fs.sf.AddrFrame.Mode = AddrModeFlat;
  fs.sf.AddrStack.Mode = AddrModeFlat;

This code is taken from ACE (the ADAPTIVE Communication Environment), where it was adapted from the StackWalker project on CodeProject.
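For context, here is a rough sketch of how the initialization above is commonly combined with the walk loop itself. It is not the ACE code: the helper name walk_current_stack is made up for illustration, and the sketch assumes the DbgHelp symbol handler can be initialized in the current process and that the program is linked against dbghelp.lib. One detail that may matter on x64 is that the function-table and module-base callbacks are passed explicitly (SymFunctionTableAccess64 and SymGetModuleBase64), since x64 unwinding relies on unwind tables rather than a frame-pointer chain. Whether that is the cause of the problem in the question is not confirmed anywhere in this thread.

```cpp
// Sketch only: walks the calling thread's stack with StackWalk64 on Windows.
// Link with dbghelp.lib. On non-Windows targets the function is an empty stub
// so that the snippet still compiles.
#include <cstddef>
#include <cstdint>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>
#endif

// Hypothetical helper: returns the program-counter value of each frame found.
std::vector<std::uint64_t> walk_current_stack(std::size_t max_frames) {
    std::vector<std::uint64_t> frames;
#ifdef _WIN32
    HANDLE process = GetCurrentProcess();
    HANDLE thread = GetCurrentThread();

    // The Sym* callbacks used below need an initialized symbol handler.
    // Note that DbgHelp functions are not thread-safe.
    static const bool sym_ready = SymInitialize(process, NULL, TRUE) != FALSE;
    if (!sym_ready) return frames;

    CONTEXT context;
    RtlCaptureContext(&context);

    STACKFRAME64 frame = {};
    frame.AddrPC.Mode = AddrModeFlat;
    frame.AddrStack.Mode = AddrModeFlat;
    frame.AddrFrame.Mode = AddrModeFlat;

    DWORD machine = 0;
#if defined(_M_X64) || defined(__x86_64__)
    machine = IMAGE_FILE_MACHINE_AMD64;
    frame.AddrPC.Offset = context.Rip;
    frame.AddrStack.Offset = context.Rsp;
    frame.AddrFrame.Offset = context.Rsp;  // same choice as the snippet above
#elif defined(_M_IX86) || defined(__i386__)
    machine = IMAGE_FILE_MACHINE_I386;
    frame.AddrPC.Offset = context.Eip;
    frame.AddrStack.Offset = context.Esp;
    frame.AddrFrame.Offset = context.Ebp;
#endif
    if (machine == 0) return frames;  // other architectures are not covered

    while (frames.size() < max_frames &&
           StackWalk64(machine, process, thread, &frame, &context,
                       NULL,                      // default ReadProcessMemory
                       SymFunctionTableAccess64,  // unwind-table lookup
                       SymGetModuleBase64,        // module base lookup
                       NULL)) {
        if (frame.AddrPC.Offset == 0) break;      // end of the chain
        frames.push_back(frame.AddrPC.Offset);
    }
#else
    (void)max_frames;
#endif
    return frames;
}
```

Calling walk_current_stack(32) from any function should then yield up to 32 return addresses, with the first entries belonging to the helper and its caller.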

Adam Mitz
Yup, I tried stackframe.AddrFrame.Offset = context.Rsp, but it didn't help. (See 2nd bullet point above.) Which _is_ a little odd, since the ASM code itself uses rsp as a base pointer w.r.t. function-local variables -- i.e., "mov dword ptr [rsp+54h],eax" when saving StackWalk64()'s return value.
Quasidart
I've seen this code work so I know it's good. Can you post more of your code -- for example, your call to StackWalk64() or a complete example?
Adam Mitz
The code's essentially the same as other code that does work. At this point, I think the difference comes from compiler options (cl.exe flags): I saw two build environments, both using the same compiler version (but different flags), produce binaries whose runtime call stack buffers looked extremely different.
Quasidart
Callstacks produced by StackWalk will often look very different based on many external factors: which dbghelp is available, the type of symbols/debugging information, etc.
Adam Mitz
+1  A: 

FWIW, I've switched to using CaptureStackBackTrace(), and now it works just fine.
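A minimal sketch of what such a call can look like, for anyone landing here. The wrapper name capture_backtrace and the 62-frame cap are illustrative choices, not taken from the tool in the question. The cap is there because the documentation notes that on Windows XP and Windows Server 2003 the sum of FramesToSkip and FramesToCapture must be less than 63.

```cpp
// Sketch only: captures return addresses of the calling thread on Windows.
// On non-Windows targets the function is an empty stub so it still compiles.
#include <cstddef>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical wrapper: returns at most max_frames raw return addresses.
std::vector<void*> capture_backtrace(std::size_t max_frames) {
    std::vector<void*> frames;
#ifdef _WIN32
    // Stay below the 63-frame limit documented for older Windows versions.
    const ULONG kMaxFrames = 62;
    void* buffer[kMaxFrames];
    const ULONG wanted =
        max_frames < kMaxFrames ? static_cast<ULONG>(max_frames) : kMaxFrames;

    // FramesToSkip = 0 means the first entry belongs to this wrapper itself.
    const USHORT got = CaptureStackBackTrace(0, wanted, buffer, NULL);
    frames.assign(buffer, buffer + got);
#else
    (void)max_frames;
#endif
    return frames;
}
```

The addresses returned are raw; turning them into function names still needs a symbol lookup such as SymFromAddr from DbgHelp.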

Quasidart
This function is terrific.
C Johnson
A: 

@Quasidart:

Sorry to revive this thread of yours, but I want to read the current call stack, and for that I need to use StackWalk64(). I have already spent a lot of time on that. I then moved to CaptureStackBackTrace(), but it always returns 0 (number of frames) in every case, and I am not sure why. The call stack should not be empty at that point. I know I am missing something; do you know what the reason could be?

arb
A: 

@Quasidart:

This is what I am doing. Could you please check whether something is wrong? It does not even reach either of the print statements.

HANDLE hProc = GetCurrentProcess();

HANDLE hThread = GetCurrentThread();
STACKFRAME64 stackframe;
CONTEXT context;
memset (&context, 0, sizeof(context));

// Initialise the current context from registers:
__asm pop eax
__asm mov context.Eip, eax
__asm mov context.Ebp, ebp
__asm mov context.Esp, esp

context.ContextFlags = CONTEXT_FULL;
//GetThreadContext(hThread, &context);
memset (&stackframe, 0, sizeof(stackframe));

stackframe.AddrPC.Offset         = context.Eip;
stackframe.AddrPC.Mode           = AddrModeFlat;
stackframe.AddrStack.Offset      = context.Esp;
stackframe.AddrStack.Mode        = AddrModeFlat;
stackframe.AddrFrame.Offset      = context.Ebp;
stackframe.AddrFrame.Mode        = AddrModeFlat;

if(
    StackWalk64(0x014c, 
            hProc, 
            hThread, 
            &stackframe, 
            &context, 
            NULL, 
            NULL, 
            NULL, 
            NULL)
    )
    _Print("working ");
else
    _Print("not working");

Thanks in advance. -- arb

arb