I have a C++ tool that walks the call stack at one point. In the code, it first gets a copy of the live CPU registers (via RtlCaptureContext()), then uses a few "#ifdef ..." blocks to copy the CPU-specific registers into stackframe.AddrPC.Offset, ...AddrStack..., and ...AddrFrame...; also, for each of the 3 Addr... members above, it sets stackframe.Addr....Mode = AddrModeFlat. (This was borrowed from some example code I came across a while back.)
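For reference, the setup described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the helper name InitStackFrame is hypothetical, and the register-to-member mapping (Rip/Rsp/Rbp on x64, Eip/Esp/Ebp on x86) is an assumption based on the usual CONTEXT field names. It only fills in the STACKFRAME64 and does not call StackWalk64() itself.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>

// Hypothetical helper (illustration only): seed a STACKFRAME64 from a
// CONTEXT previously filled in by RtlCaptureContext().
inline void InitStackFrame(const CONTEXT& ctx, STACKFRAME64& frame)
{
    frame = STACKFRAME64{};  // zero every member first

#if defined(_M_X64) || defined(__x86_64__)
    // Assumed x64 mapping; Rbp is one of the AddrFrame variants tried.
    frame.AddrPC.Offset    = ctx.Rip;
    frame.AddrStack.Offset = ctx.Rsp;
    frame.AddrFrame.Offset = ctx.Rbp;
#elif defined(_M_IX86) || defined(__i386__)
    // Assumed x86 mapping.
    frame.AddrPC.Offset    = ctx.Eip;
    frame.AddrStack.Offset = ctx.Esp;
    frame.AddrFrame.Offset = ctx.Ebp;
#else
    (void)ctx;  // other architectures: offsets stay zeroed in this sketch
#endif

    // All three addresses use flat addressing.
    frame.AddrPC.Mode    = AddrModeFlat;
    frame.AddrStack.Mode = AddrModeFlat;
    frame.AddrFrame.Mode = AddrModeFlat;
}
#endif  // _WIN32
```

A caller would capture the context with RtlCaptureContext(&ctx), pass it to the helper, and then hand both the frame and the context to StackWalk64().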
With an x86 binary, this works great. With an x64 binary, though, StackWalk64() passes back bogus addresses. (The first time the API is called, the only blatantly bogus address value appears in AddrReturn (== 0xFFFFFFFF'FFFFFFFE -- aka StackWalk64()'s 3rd arg, the pseudo-handle returned by GetCurrentThread()). If the API is called a second time, however, all Addr... variables receive bogus addresses.) This happens regardless of how AddrFrame is set:
- using either of the recommended x64 "base/frame pointer" CPU registers: rbp (= 0xf), or rdi (= 0x0)
- using rsp (didn't expect it to work, but tried it anyway)
- setting AddrPC and AddrStack normally, but leaving AddrFrame zeroed out (seen in other example code)
- zeroing out all Addr... values, to let StackWalk64() fill them in from the passed-in CPU-register context (seen in other example code)
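As a side note on the AddrReturn value mentioned earlier: 0xFFFFFFFF'FFFFFFFE is simply -2 viewed as an unsigned 64-bit quantity. The small check below illustrates that arithmetic. The helper name AsAddressBits is hypothetical, and the premise that the current-thread pseudo-handle is the constant -2 is an assumption about current Windows implementations, consistent with the value observed above.

```cpp
#include <cstdint>

// Hypothetical helper: reinterpret a signed 64-bit value as the unsigned
// bit pattern that a 64-bit address field would display.
constexpr std::uint64_t AsAddressBits(std::int64_t value)
{
    return static_cast<std::uint64_t>(value);
}

// Assumption: the pseudo-handle for the current thread is -2.
// Its 64-bit pattern matches the bogus AddrReturn value.
static_assert(AsAddressBits(-2) == 0xFFFFFFFF'FFFFFFFEull,
              "-2 has the bit pattern 0xFFFFFFFFFFFFFFFE");
```

This suggests the field is holding the thread handle itself rather than a truncated or corrupted code address.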
FWIW, the physical stack buffer's contents are also different on x64 vs. x86 (after accounting for different pointer widths & stack buffer locations, of course). Regardless of the reason, StackWalk64() should still be able to walk the call stack correctly -- heck, the debugger is still able to walk the call stack, and it appears to use StackWalk64() itself behind the scenes. The oddity there is that the (correct) call stack reported by the debugger contains base-address & return-address pointer values whose constituent bytes don't actually exist in the stack buffer (below or above the current stack pointer).
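The "do these pointer values exist in the stack buffer" check above could be automated along the following lines. This is only a sketch under stated assumptions: FindPointerValue is a hypothetical helper, it looks only at pointer-aligned 8-byte slots, and it compares in the machine's native byte order (little-endian on x64).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: scan [buffer, buffer + size) in 8-byte steps and
// return the byte offset of the first slot equal to `value`, or -1 if no
// slot matches.
inline std::ptrdiff_t FindPointerValue(const void* buffer,
                                       std::size_t size,
                                       std::uint64_t value)
{
    const auto* bytes = static_cast<const unsigned char*>(buffer);
    for (std::size_t off = 0; off + sizeof(value) <= size; off += sizeof(value)) {
        std::uint64_t slot;
        // memcpy sidesteps alignment and aliasing concerns when reading raw memory.
        std::memcpy(&slot, bytes + off, sizeof(slot));
        if (slot == value) {
            return static_cast<std::ptrdiff_t>(off);
        }
    }
    return -1;
}
```

Given a copy of the stack memory and one of the frame or return addresses shown by the debugger, a result of -1 would confirm that the value is not stored in any aligned slot of that range.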
(FWIW #2: Given the stack-buffer strangeness above, I did try disabling ASLR (/dynamicbase:no) to see if it made a difference, but the binary still exhibited the same behavior.)
So. Any ideas why this would work fine on x86, but have problems on x64? Any suggestions on how to fix it?