A small utility of mine that I made for personal use (written in C++) crashed randomly yesterday (I've used it roughly 100+ hours with no issues so far) and while I don't normally do this, I was feeling a bit adventurous and wanted to try and learn more about the problem. I decided to go into the Event Viewer and see what Windows had logged about the crash:

Faulting application StraightToM.exe, version 0.0.0.0, time stamp 0x4a873d19, faulting module StraightToM.exe, version 0.0.0.0, time stamp 0x4a873d19, exception code 0xc0000005, fault offset 0x0002d160, process id 0x17b4, application start time 0x01ca238d9e6b48b9.

My question is, what does each of these fields mean, and how would I use them to debug my program? Here's what I know so far: exception code describes the error, and 0xc0000005 is a memory access violation (tried to access memory it didn't own). I'm specifically interested in knowing more about the following:

  1. What does the fault offset mean? Does that represent the location in the file where the error occurred, or does it mean the assembly 'line' where the error occurred? Knowing the fault offset, how would I use a program like OllyDbg to find the corresponding assembly code that caused the error? Or -- even better -- would it be possible to (easily) determine what line of code in the C++ source caused this error?
  2. It's obvious that the time stamp corresponds to the 32-bit UNIX time at the time of the crash, but what does the 64-bit application start time mean? Why would it be 64 bits if the time stamp is 32?

Note that I'm primarily a C++ programmer, so while I know something about assembly, my knowledge of it is very limited. Additionally, this really isn't a serious problem that needs fixing (and is also not easily reproduced, given the nature of the program); I'm just using this as an excuse to learn more about what these error messages mean. Most of the information about these crash logs that I've found online is aimed at the end user, so it hasn't helped me (as the programmer) very much.

Thanks in advance

+2  A: 

There isn't much you're going to be able to do postmortem with this information.

The useful bit of information is the exception code, 0xc0000005, which in this case just means an access violation. So you dereferenced null or some other bit of memory you didn't own.

Fault offset, I suspect, is the offset from where your DLL was loaded into memory, so you could in theory add it to your base address and find the offending code, but I'm not sure.

Your best bet for debugging this is to catch it in the debugger the next time this happens. You can use Image File Execution Options to run your app automatically in the debugger. Make sure you have symbols ready (consider building DEBUG if you're currently using RELEASE).

jeffamaphone
+3  A: 

The 64-bit time stamp is the time the application's primary thread was created, in 100-nanosecond intervals since January 1, 1601 (UTC) (this is known as FILETIME). The 32-bit timestamp is indeed in time_t format (it tells the time the module was created and is stored in the module's header).
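To make the two formats concrete, here is a minimal sketch that converts the values from the log entry above. The function and constant names are hypothetical, chosen for illustration; the conversion itself only relies on the FILETIME definition (100-nanosecond ticks since 1601-01-01 UTC) and the 11644473600-second gap between that epoch and the Unix epoch.

```cpp
#include <cstdint>

// Seconds between 1601-01-01 (FILETIME epoch) and 1970-01-01 (Unix epoch).
constexpr std::uint64_t kEpochDifferenceSeconds = 11644473600ULL;
// FILETIME counts 100-nanosecond ticks, so there are 10^7 ticks per second.
constexpr std::uint64_t kTicksPerSecond = 10000000ULL;

// Converts a 64-bit FILETIME value to whole seconds since the Unix epoch.
// Hypothetical helper; assumes the value is not earlier than 1970-01-01.
constexpr std::uint64_t filetime_to_unix_seconds(std::uint64_t filetime) {
    return filetime / kTicksPerSecond - kEpochDifferenceSeconds;
}

// "application start time" from the event log entry (FILETIME).
constexpr std::uint64_t kAppStartFiletime = 0x01ca238d9e6b48b9ULL;
// "time stamp" from the event log entry (already time_t seconds).
constexpr std::uint64_t kModuleTimeStamp = 0x4a873d19ULL;

// 1250989448 s after the Unix epoch is 2009-08-23 01:04:08 UTC.
static_assert(filetime_to_unix_seconds(kAppStartFiletime) == 1250989448ULL,
              "application start time");
// 1250376985 s after the Unix epoch is 2009-08-15 22:56:25 UTC.
static_assert(kModuleTimeStamp == 1250376985ULL, "module time stamp");
```

So the module was stamped about a week before the process that crashed was started, which fits a build time rather than a crash time.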

I'd say 0x0002d160 is an offset from the module's load address (it seems too low for an absolute address). Fire up Visual Studio, start the debugger, and take a look at the "Modules" debug window. Your exe file should be listed there. Find the address where the module is loaded, add 0x0002d160 to that address, and take a look at the disassembly at the resulting address. Visual Studio shows source code intermixed with the assembly, so you should have no problem figuring out which source line caused the problem.
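The arithmetic is just load address plus offset. A small sketch, assuming the module is loaded at 0x00400000 (a common default base for a 32-bit EXE, but only an assumption here; use whatever address the debugger's module list actually reports). The names are hypothetical.

```cpp
#include <cstdint>

// ASSUMPTION: a typical default load address for a 32-bit EXE.
// Replace with the real base address shown in the debugger.
constexpr std::uint32_t kAssumedImageBase = 0x00400000u;

// "fault offset" from the event log entry.
constexpr std::uint32_t kFaultOffset = 0x0002d160u;

// Hypothetical helper: absolute address of the faulting instruction.
constexpr std::uint32_t fault_address(std::uint32_t image_base,
                                      std::uint32_t offset) {
    return image_base + offset;
}

// With the assumed base, the address to disassemble is 0x0042d160.
static_assert(fault_address(kAssumedImageBase, kFaultOffset) == 0x0042d160u,
              "base + offset");
```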

avakar
Ah ok, I've heard of `FILETIME` before, I don't know why I didn't put the pieces together in this instance. As for looking through the modules in Visual Studio, that sounds like a sure fire way to do it -- the only problem is, I built this program using MinGW. I have MSVC and could rebuild the .exe using it, but my guess is that the original fault offset (generated with the MinGW-built exe) wouldn't correspond to the MSVC-built exe, right?
GRB
I see. No, the offsets wouldn't correspond. I don't know what kind of debug info MinGW generates, but I'd certainly start by looking at the `objdump` utility; it might be able to identify the culprit routine.
avakar
objdump helped me a lot. After an hour or so of experimenting with different settings and (MinGW) debug builds, it appears that the fault occurred in an internal library function that I was using. While there's no way to know this for sure, this suggests that it was a fault in the library and not my code.
GRB
Just because a crash happens in a library doesn't mean it is not your code at fault. If I pass an invalid pointer to strlen, it may cause a crash; this would not be the fault of the library author, but my fault.
Stephen Nutt
You're absolutely right -- it's impossible to know either way without information like the stack trace or variable values at the time of crash. That said, given the nature of the code and how the error was (apparently) caused by the dereferencing of an internal library pointer (i.e. a variable my code never had access to nor set at any point) I am confident in believing my code is not at fault (though we can never know for sure).
GRB
+3  A: 

Debugging god John Robbins built a little tool called CrashFinder to help with situations like this: http://www.wintellect.com/CS/blogs/jrobbins/archive/2006/04/19/crashfinder-returns.aspx

It's always a good idea to save PDBs for every build you release to the public (this sounds like a tool you only use in private, but it might be a good idea to keep the PDB symbols around for the latest build).

Kim Gräsman
CrashFinder seemed very promising, but unfortunately didn't seem to work. The original fault address and the fault address + module start address both yielded little information. Even after building a test app with an obvious error (`*(int*)0 = 1;`), using the fault address given with CrashFinder on that app didn't work. May have something to do with MinGW programs vs. MSVC built programs.
GRB
Ah, I see. I don't know if MinGW can generate PDB symbols -- I suspect CrashFinder needs those to locate the problem location.
Kim Gräsman
Apparently not. You would definitely need extra information to recover a source line from a binary. PDBs provide that information, as far as it's even possible. But modern compilers sometimes make it impossible: if two functions map to the same instructions, and you don't compare their function pointers, they can share the same address. If a crash happens there, you'd need to know the caller to determine which of the two was called. Also, after inlining and optimizing, caller and callee may be mixed beyond even C++'s statement level.
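A minimal sketch of the kind of code that can end up folded. The function names are hypothetical, and whether folding actually happens depends on the toolchain and flags (for example the MSVC linker's /OPT:ICF or the gold linker's --icf); nothing here forces it.

```cpp
#include <cassert>

// Two unrelated functions whose bodies compile to the same instructions.
// A linker that performs identical code folding may keep a single copy,
// so a fault address inside it cannot tell you which one was called.
int double_width(int width)   { return width * 2; }
int double_height(int height) { return height * 2; }

// Both behave identically, which is exactly why merging them is safe
// for the program and confusing for anyone reading a crash address.
void check_same_result() {
    assert(double_width(21) == double_height(21));
}
```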
MSalters
Some debuggers fall back on PE exports to give a vague idea of where the problem is. That really only works for DLLs, however. Good info on optimizations, I didn't know compilers could eliminate equal but unrelated code. I suppose that's one of the benefits of link-time code generation, that there's a higher-level view of optimization opportunities. Thanks!
Kim Gräsman
MinGW/GCC has a similar feature that I figured out when using `objdump`. I rebuilt my binary with debugging information enabled in my GCC settings and then used `objdump` on that .exe with the --source setting. That allowed `objdump` to intermix source lines/line numbers into the assembly dump, so I could see what C++ source code the faulting assembly line corresponded to. So, yes, MinGW/GCC has its own way of doing this.
GRB
A: 

0xc0000005 - access violation (STATUS_ACCESS_VIOLATION)...

Arabcoder