Hello, I have been programming a game for two years. Sometimes memory errors (e.g., a function returning junk instead of what it was supposed to return, or a crash that only happens on Linux and never under GDB or on Windows) happen seemingly at random. That is, I try to fix one, and some months later the same errors return to haunt me.

Is there software (not Valgrind; I already tried it, and it does not find the errors) that can help me with this problem? Or a method for solving these errors? I want to fix them permanently.

+3  A: 

On Windows, you can automatically capture a crashing exception in a production environment and analyze it as if the error occurred on your developer PC under the debugger. This is done using a "mini-dump" file. You basically use the Windows "dbghelp.dll" DLL to generate a copy of the thread stacks, parts or all of the heap, the register values, the loaded modules, and the unhandled exception that resulted in the crash. You can launch this ".dmp" file in the MS Visual Studio debugger as if it were an executable and it will show you exactly where the crash occurred.

You can set up a trap for unhandled exceptions and delegate the creation of the mini-dump file to dbghelp.dll in that trap. You need to keep the ".pdb" files that were generated with the deployed binaries so that memory addresses can be matched to source code locations for a better debugging experience. This topic is too deep to fully cover here; see Microsoft's documentation on this DLL.
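As a rough sketch of what such a trap could look like (not the answerer's actual code), the following installs a top-level filter with SetUnhandledExceptionFilter and writes the dump with MiniDumpWriteDump. The function names write_minidump and install_crash_dump_handler and the file name "crash.dmp" are made up for this example, and the non-Windows branch is only a stub so the snippet compiles everywhere.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>  // link against dbghelp.lib

// Hypothetical filter: Windows calls it when no handler catches an exception.
static LONG WINAPI write_minidump(EXCEPTION_POINTERS* info) {
    // "crash.dmp" is a placeholder path chosen for this sketch.
    HANDLE file = CreateFileA("crash.dmp", GENERIC_WRITE, 0, nullptr,
                              CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file != INVALID_HANDLE_VALUE) {
        MINIDUMP_EXCEPTION_INFORMATION mei;
        mei.ThreadId = GetCurrentThreadId();
        mei.ExceptionPointers = info;   // the exception that caused the crash
        mei.ClientPointers = FALSE;     // pointers are valid in this process
        // MiniDumpNormal keeps the file small: stacks, registers, module list.
        MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(), file,
                          MiniDumpNormal, &mei, nullptr, nullptr);
        CloseHandle(file);
    }
    return EXCEPTION_EXECUTE_HANDLER;   // let the process terminate afterwards
}

// Hypothetical helper: call once at startup.
bool install_crash_dump_handler() {
    SetUnhandledExceptionFilter(write_minidump);
    return true;
}
#else
// Stub for other platforms: mini-dumps are a Windows facility.
bool install_crash_dump_handler() { return false; }
#endif
```

Calling install_crash_dump_handler() at the top of main() would be enough to arm it. Writing the dump from inside the crashing process is the simplest setup but can itself fail if the heap or stack is badly corrupted, so a more robust design hands the work to a separate process.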

You do need to be able to copy the .dmp file from the PC where it crashed to your development environment to fully debug it. If you have a hands-off relationship with your users, you'll need a separate utility app that can "phone home" over the internet to transfer the .dmp file to a location where you can access it. You can launch the app from the unhandled exception trap after the .dmp file has been generated. For user privacy, you should give the user the option of whether or not to do this.

David Gladfelter
Sorry, but it is a non-crashing error. Maybe fortunately (or unfortunately?) the program continues running seemingly fine, even with wrong values wandering around.
speeder
Oh yeah, I mentioned a crash, but like I said, it never happens on Windows. It only happens on Linux, when GDB is not active, and seemingly at random (that is, when it happens, I can reproduce it many times, but then it suddenly stops happening... :/)
speeder
A: 

AFAIK, BoundsChecker on Windows does a very good job. In one of my projects, it caught some very weird errors.

Alphaneo
+1  A: 

The TotalView debugger (commercial software) may catch the crash.

Purify (commercial software) can help you find memory leaks.

Does your code compile free of compiler warnings? Did you run lint?

andreas buykx
A: 

To avoid this in my own projects (on Windows), I wrote my own memory allocator, which simply called VirtualAlloc and VirtualFree. It allocated an extra page for each request, placed the allocation so that it ended exactly where that last page began, and used VirtualProtect to make the last page inaccessible, so that any access to it raised an exception. This detected out-of-bounds accesses, even just reads, on the spot.

Disclaimer: I was by no means the first to have this idea.

For example, if pages are 4096 bytes, and new int[1] was called, the allocator would:

  1. Allocate 8192 bytes (4 bytes are needed, which fits in one page, and the extra guard page brings the total to 2 pages)
  2. Mark the last page inaccessible
  3. Determine the address to return (the last allocated page starts at offset 4096, so 4096 - 4 = 4092)

The following code:

int main() {
    int *array = new int[10];  // valid indices are 0..9
    return array[10];          // reads one element past the end
}

would then generate an access violation on the spot.

It also had a (compile-time) option to detect accesses beyond the left side of the allocation (i.e., array[-1]), but these kinds of errors seemed rare, so I never used the option.
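For illustration only, here is a minimal sketch of the guard-page idea described above. It is not the allocator from this answer: it swaps VirtualAlloc/VirtualProtect for the POSIX calls mmap and mprotect so that it runs on Linux, and the names guarded_alloc and guarded_free are hypothetical. It also assumes the caller passes the same size to both functions.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: returns `size` bytes that end exactly where an
// inaccessible guard page begins, so reading or writing past the end faults.
void* guarded_alloc(std::size_t size) {
    if (size == 0) return nullptr;
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t data_pages = (size + page - 1) / page;  // round up
    const std::size_t total = (data_pages + 1) * page;        // plus guard page

    void* mem = mmap(nullptr, total, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return nullptr;

    char* base = static_cast<char*>(mem);
    char* guard = base + data_pages * page;     // start of the last page
    if (mprotect(guard, page, PROT_NONE) != 0) {  // make it inaccessible
        munmap(base, total);
        return nullptr;
    }
    return guard - size;  // e.g. 4096 - 4 = 4092 for a single int
}

// Hypothetical helper: releases a block obtained from guarded_alloc(size).
void guarded_free(void* p, std::size_t size) {
    assert(p != nullptr && size != 0);
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t data_pages = (size + page - 1) / page;
    char* guard = static_cast<char*>(p) + size;
    munmap(guard - data_pages * page, (data_pages + 1) * page);
}
```

Two caveats with this sketch: every allocation costs at least two pages, so it is only practical in debug builds, and the returned pointer is only suitably aligned when the size is a multiple of the type's alignment (true for arrays of int, not for an odd-sized char buffer).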

zildjohn01
+1  A: 

One thing you could try is using the Hans Boehm GC with your project. It can be used as a leak detector, allowing you to remove suspicious-looking free() or delete statements and easily see whether they cause memory leaks.

dsimcha