My application uses the GLU tessellator (GLUtesselator) to tessellate complex concave polygons. It crashes at random when I run the plain release exe, but it never crashes when I start it under the debugger in VS. I found the following explanation, which basically describes my problem:

The multi-thread debug CRT (/MTd) masks the problem, because, like Windows does with processes spawned by a debugger, it provides to your program a debug heap, that is initialized to the 0xCD pattern. Probably somewhere you use some uninitialized area of memory from the heap as a pointer and you dereference it; with the two debug heaps you get away with it for some reason (maybe because at address 0xbaadf00d and 0xcdcdcdcd there's valid allocated memory), but with the "normal" heap (which is often initialized to 0) you get an access violation, because you dereference a NULL pointer.

The problem is that the crash occurs inside GLU32.dll, and I have no way to find out why it sometimes tries to dereference a null pointer. It seems to happen when my polygons get fairly large and have lots of points. What can I do?

Thanks

+2  A: 

It's a fact of life that programs sometimes behave differently in the debugger. In your case, some memory is initialized differently, and it's probably laid out differently as well. Another common case, in concurrent programs, is that the timing is different, so race conditions tend to show up less often in a debugger.

You could try to manually initialize the heap to a different value (or see if there is an option for this in Visual Studio). Usually initializing to nonzero catches more bugs, but that may not be the case in your situation. You could also try to play with your program's memory mapping to arrange that the page 0xcdcdc000 is unmapped.
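As a rough illustration of initializing heap memory to a chosen value, here is a minimal portable sketch. The helpers `poisoned_alloc`, `read_pointer_bits` and `looks_poisoned` are hypothetical names invented for this example, not CRT or Visual Studio functions. The idea is only that a pointer read from uninitialized memory then has a recognizable bit pattern instead of whatever happened to be there.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: allocate `size` bytes and fill every byte with
// `pattern`, so that reads of uninitialized memory give a recognizable value.
void* poisoned_alloc(std::size_t size, unsigned char pattern) {
    assert(size > 0);
    void* p = std::malloc(size);
    if (p != nullptr) {
        std::memset(p, pattern, size);
    }
    return p;
}

// Copy a pointer-sized value out of raw memory (memcpy avoids aliasing issues).
std::uintptr_t read_pointer_bits(const void* raw) {
    std::uintptr_t bits = 0;
    std::memcpy(&bits, raw, sizeof bits);
    return bits;
}

// True when every byte of the pointer-sized value equals `pattern`,
// i.e. the "pointer" was most likely never initialized.
bool looks_poisoned(std::uintptr_t bits, unsigned char pattern) {
    for (std::size_t i = 0; i < sizeof bits; ++i) {
        if (((bits >> (8 * i)) & 0xFF) != pattern) {
            return false;
        }
    }
    return true;
}
```

Trying a couple of different patterns (for example 0xCD and 0x00) and seeing whether the crash follows the pattern is one way to test the uninitialized-pointer theory.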

Visual Studio can set a breakpoint on accesses to a particular memory address; you could try this (it may slow your program significantly more than a variable breakpoint).

Gilles
+2  A: 

but it never crashes if I do start debugging in VS.

Well, I'm not sure exactly why, but a program running under the Visual Studio debugger can sometimes get away with accessing memory regions that would crash it without the debugger. I don't know the exact reasons, and sometimes the 0xcdcdcdcd and 0xbaadf00d patterns have nothing to do with it: accessing certain addresses simply doesn't cause problems. When this happens, you'll need to find other ways of tracking down the problem.

What can I do?

Possible solutions:

  1. Install an exception handler in your program (_set_se_translator, if I remember correctly). On an access violation, try MiniDumpWriteDump. Debug the dump later using Visual Studio (afaik, crash dump debugging is not available in the Express edition) or using WinDbg.
  2. Use a just-in-time debugger. Non-Express editions of Visual Studio have this feature. There are probably alternatives.
  3. Write a custom memory manager (one that overrides new/delete and provides malloc/free alternatives, if you use them) that grabs a large chunk of memory and locks all unused memory with VirtualProtect. In this case every invalid access will cause a crash, even in debug mode. You'll need a lot of memory for such a memory manager, because each block has to be aligned to pages in order to be locked.
  4. Add extensive logging around all suspicious function calls. Dump a lot of text/debug information into a file (or stderr): parameter values, arrays, everything you suspect could be related to the crash. Flush after every write to the file, otherwise some info will be lost during the crash. This way you'll be able to guess what happened before the program crashed.
  5. Try debugging the release build. You should be able to do it to some extent if you enable "debug information" for the release build in the project settings.
  6. Try switching "basic runtime checks" and "buffer security check" on and off in the project properties (configuration properties->c/c++->code generation).
  7. Try to find some kind of external tool, something like valgrind or a bounds checker. In my experience, #3 is more reliable than that approach, although that really depends on the problem.
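The logging idea in #4 can be sketched roughly as follows. `log_line` is a hypothetical helper written for this example (it is not part of GLU or the CRT); the only point it demonstrates is flushing after every write so that the last lines before a crash actually reach the file.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

// Hypothetical helper: write one printf-style line to `out` and flush it
// immediately, so the text is not left sitting in a buffer if the process dies.
// Returns the number of characters written (including the newline),
// or a negative value on error.
int log_line(std::FILE* out, const char* fmt, ...) {
    assert(out != nullptr && fmt != nullptr);

    std::va_list args;
    va_start(args, fmt);
    int written = std::vfprintf(out, fmt, args);
    va_end(args);

    if (written < 0) {
        return written;
    }
    if (std::fputc('\n', out) == EOF) {
        return -1;
    }
    // The important part: push the data out before the next risky call.
    if (std::fflush(out) != 0) {
        return -1;
    }
    return written + 1;
}
```

A call such as `log_line(logFile, "contour %d has %d vertices", contourIndex, vertexCount);` placed just before each tessellator call would record the last input that was handed over before the crash (`logFile`, `contourIndex` and `vertexCount` are placeholder names).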
SigTerm
+2  A: 

A link to an earlier question and two thoughts.

First off you may want to look at a previous question about valgrind substitutes for windows. Lots of good hints on programs that will help you.

Now the thoughts:

1) The debugger may stop your program from crashing in the code you're testing, but it's not fixing the problem. At worst you're just kicking the can down the road: there's still corruption, but it's not evident from the way you're running. When you ship, you can be sure someone will run into the problem again.

2) What often happens in cases like this is that the crash isn't near where the actual bug is. While you may be noticing the problem in GLU32.dll, there was probably corruption earlier, maybe even in a different thread or function, which didn't cause a problem at the time; at some later point the program came back to the corrupted region and failed.
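One way to catch that kind of earlier corruption closer to its source is to put guard bytes around your own buffers and check them at convenient points. The following is only a sketch under stated assumptions: `guarded_alloc`, `guards_intact` and `guarded_free` are hypothetical helpers, and the guard size and marker byte are arbitrary choices for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

namespace {
const unsigned char kGuardByte = 0xFD;  // arbitrary marker chosen for this sketch
const std::size_t kGuardSize = 16;      // bytes of guard before and after the block
}

// Allocate `size` usable bytes with a guard region on each side.
void* guarded_alloc(std::size_t size) {
    unsigned char* raw =
        static_cast<unsigned char*>(std::malloc(size + 2 * kGuardSize));
    if (raw == nullptr) {
        return nullptr;
    }
    std::memset(raw, kGuardByte, kGuardSize);                      // front guard
    std::memset(raw + kGuardSize + size, kGuardByte, kGuardSize);  // back guard
    return raw + kGuardSize;
}

// Returns true if neither guard region has been overwritten.
// `size` must be the same value that was passed to guarded_alloc.
bool guards_intact(const void* user, std::size_t size) {
    assert(user != nullptr);
    const unsigned char* raw =
        static_cast<const unsigned char*>(user) - kGuardSize;
    for (std::size_t i = 0; i < kGuardSize; ++i) {
        if (raw[i] != kGuardByte || raw[kGuardSize + size + i] != kGuardByte) {
            return false;
        }
    }
    return true;
}

// Release a block obtained from guarded_alloc.
void guarded_free(void* user) {
    if (user != nullptr) {
        std::free(static_cast<unsigned char*>(user) - kGuardSize);
    }
}
```

Checking `guards_intact` before and after each suspicious call narrows down which step overran a buffer, rather than waiting for the damage to surface somewhere else.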

Paul Rubel
+1, nice answer. Valgrind is awesome for this kind of problem. I've never used one of the Windows substitutes for it, but if any of them are as good as valgrind, they'll identify his problem in a snap.
Head Geek