I have an application that I'm trying to debug a crash in. However, it is difficult to detect the problem for a few reasons:

  • The crash happens at shutdown, meaning the offending code isn't on the stack
  • The crash only happens in release builds, meaning symbols aren't available

By crash, I mean the following exception:

0xC0000005: Access violation reading location 0x00000000.

What strategy would you use to diagnose this problem?

What I have done so far is remove as much code as possible from my program until I am left with the bare minimum that still causes the crash. It seems to be happening in code that is statically linked into the project, so that doesn't help, either.

+1  A: 

You seem to have something reading a null pointer - never good.

I'm not sure what platform you are on. Under Linux, you could consider using valgrind.

What is different about your release builds from your debug builds apart from the presence or absence of the debug information?

Can you build the statically linked code with debugging information in it? Can you obtain a debug build of the statically linked code?

Jonathan Leffler
It's Windows using WTL. The statically linked code is ours, and yes, I can create a debug build of it.
FryGuy
When I clicked on the WTL tag, I found that everything was Windows-related. Are you sure you can't get a crash out of the full debug build?
Jonathan Leffler
+4  A: 

You can generate symbol files even for a release build. Do that, run your program, attach the debugger, close the program, and look at the cause of the crash in the debugger.

Nemanja Trifunovic
The option is called "Generate debug info" in the Link tab of Visual C++ 6.
FryGuy
+2  A: 

The strategy I would use is exactly what you've done: remove as much code as possible until the problem disappears, then add that last bit back in and debug it.

However, it may not be your code that's at fault. One thing to watch out for - we found this problem on AIX and, even though you're running Windows, it may be similar.

We had a third party library which dynamically loaded another shared library which, in its initialization routine, set up an atexit function to be called when the process exits.

However, as our application loads and unloads these shared libraries, by the time the process exited, the shared library's atexit function was no longer in memory and we dumped core.

This shows up as an access violation after returning from main(), so if that's what's happening to you, it's quite possibly the same sort of thing. The C RTL startup code (the code that calls main() and then exit()) will walk the atexit list and call each of its functions, no matter what you've done with them.

Of course, if it's crashing before main() exits, then this is a moot point.
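One way to check which case applies is to bracket the atexit list with marker handlers. The following is only a minimal sketch, not something from the original investigation: the function names are hypothetical, and it relies on the standard guarantee that atexit handlers run in reverse order of registration.

```cpp
#include <cstdio>
#include <cstdlib>

namespace {

// Registered last, so it runs first once main() has returned.
void exit_marker_early() {
    std::fputs("[exit] atexit handlers starting\n", stderr);
    std::fflush(stderr);
}

// Registered first, so it runs last, after every handler registered later.
void exit_marker_late() {
    std::fputs("[exit] atexit handlers finished\n", stderr);
    std::fflush(stderr);
}

}  // namespace

// Hypothetical helper: call at the very top of main(), before any
// library is loaded or initialized. Returns true if registration worked.
bool install_late_marker() {
    return std::atexit(exit_marker_late) == 0;
}

// Hypothetical helper: call just before returning from main().
bool install_early_marker() {
    return std::atexit(exit_marker_early) == 0;
}
```

If "starting" is printed but "finished" never appears, the crash is probably happening in a handler (or a static object's destructor) registered between the two calls. If neither line appears, the crash happens before main() returns. How handlers registered inside DLLs are scheduled depends on the runtime, so treat the output as a hint rather than proof.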

One thing you could consider (and we've actually done this on one occasion after a cost/benefit analysis of tracking down and fixing a particularly thorny bug): send out the debug release as your product. If it's not crashing, that may be a quick fix to get the product out there while you work on a more acceptable solution at your leisure.

paxdiablo
Interesting one!
Jonathan Leffler
Bear in mind that you may be breaking license agreements if you start distributing debug runtime DLLs. You aren't supposed to distribute the debug versions of the MFC runtime, for example.
John Sibly