Hi,

I am investigating a crash due to heap corruption. As this issue is non-trivial and involves analyzing the stack and dump results, I have decided to do a code review of files related to the crash.

To be frank, I don't have in-depth knowledge of when the heap could be corrupted.

I would appreciate if you could suggest scenarios which could lead to heap corruption.

Platform: Windows XP

Language: C++

Compiler: VC6

Thanks in advance

SSS

+2  A: 

Welcome to hell. There is no easy solution so I will only provide some pointers.

Try to reproduce the bug in a debug environment. Debuggers can pad your heap allocations with bound checks and will tell you if you wrote into those bound checks. Also, a debug heap will consistently allocate memory at the same virtual addresses, making reproduction easier.

Once you can reproduce it, try an analyser tool such as Purify. It will detect pretty much anything nasty that your code is doing, but will also run VERY slowly. Such a tool will detect out-of-bounds memory access, freed-memory access, freeing the same block twice, using mismatched allocators/deallocators, etc. Those are all kinds of conditions that can stay latent for a very long time and only crash at the most inopportune moment.

Coincoin
Sadly, this crash is not easy to reproduce; it only happens about once in a million times.
sat
I understand. However, even if the bug manifests itself by a crash one in a millionth time, it might still be detectable pretty easily by an analyser tool.
Coincoin
+6  A: 

Common scenarios include:

  • Writing outside the allocated space of an array (`char *stuff = new char[10]; stuff[10] = 3;`)
  • Casting to the wrong type
  • Uninitialized pointers
  • Typo error for -> and .
  • Typo error when using * and & (or multiple of either)

[EDIT] From the comments, a few more:

  • Mixing new [] and new with delete [] and delete
  • Missing or incorrect copy-constructors
  • Pointer pointing to garbage
  • Calling delete multiple times on the same data
  • Polymorphic baseclasses without virtual destructors
cwap
Using pointers pointing to garbage. I had a rather insidious one: `const WCHAR *cstr = ICUString.c++str().c_str();` The syntax is wrong, but basically I went from an ICU string to a `std::wstring` to a `const WCHAR *`. Since the `std::wstring` is an anonymous temporary, it is destroyed once the line is done, so my pointer pointed to garbage. Accessing it led to corruption and a crash.
Calyth
Don't forget calling delete on two different pointers to the same object on the heap
Graphics Noob
Add to that: Passing random/already deleted pointers to `delete`, mixing `new`/`new[]` with `delete`/`delete[]`, missing or incorrect copy ctors/assignment operators, polymorphically used base classes with non-virtual dtors... This list is _very_ long. Basically it boils down to what I have summarized here: http://stackoverflow.com/questions/1346583/1346631#1346631
sbi
All great additions. I'll add them to the list :)
cwap
Accessing already freed pointers to data, calling a member function on a deleted object, using invalid iterators to STL containers, ... VC has a debug heap implementation that can find some of the issues in your list.
Hans
My understanding was that using the delete keyword on a pointer that was already deleted did nothing. Is there something special about the second pointer to the same object? I have Stroustrup's book (2nd ed.) if you have a page reference. (Don't forget to add it to the answer too.)
Kelly French
Well, I just tried deleting the same data twice in VC++, and it crashed with an assertion failure in debug mode, which makes me think it's not a good idea to do :) I guess you potentially delete whatever data might be reallocated into that address space later on.
cwap
@Kelly: It's passing `NULL` pointers to `delete` which is doing nothing. Passing the same pointer to `delete` twice is a fatal error leading to what's called undefined behavior. Usually this invokes Nasty Nasal Demons on you.
sbi
+4  A: 

You can look at the sample chapter from the Advanced Windows Debugging book which provides various examples of heap corruption.

EDIT: If you are using STL containers which might move elements when they change (e.g. vector, deque), ensure that you are not keeping pointers or references to such elements across operations which might change them.

Komat
Thanks, I found the book very useful. Will read it all the way through.
sat
+1  A: 

It's a common mistake to free() or delete allocated memory more than once. It may help to set the pointer to NULL after such calls, and to check for NULL before using it. Note that both `delete` on a null pointer and `free(NULL)` are defined to do nothing; the danger is freeing the same non-null pointer twice.

Also a common problem is to confuse delete and delete [].

Variables allocated with new must be released with delete.

Array allocated with new [] must be released with delete[].

Also make sure not to mix C-style memory management (malloc, calloc, free) with C++-style memory management (new/delete). In legacy code the two are often mixed, but things allocated with one cannot be freed with the other.

All of these errors will usually not be recognized by a compiler.

RED SOFT ADAIR
Thanks StefanWoe. One basic question: if I do new[] but use delete rather than delete[], how will it lead to heap corruption? Can the rest of the memory be re-allocated?
sat
If something is corrupted, it makes no sense to re-allocate. You are lost.
RED SOFT ADAIR
It's my bad English :) I mean to say: if I initialize `char *ptr = new char[100]` and do `delete ptr`, will it free the first char only? If yes, what will happen to the rest of the 99?
sat
The remaining 99 bytes will be lost, meaning there will be a memory leak of 99 bytes.
VNarasimhaM
What happens in that case is explicitly undefined by the specification. The most likely result is that the destructor is called for the first element of the array and the entire block of memory is released. However, a crash, a leak, or heap corruption are also possible.
Komat
The behavior of calling `new[]` and then `delete` is not defined. It might free only the first element, crash the program, make coffee, whatever.
RED SOFT ADAIR
+5  A: 

There are products that will observe the memory allocations and deallocations, and produce a report on anomalies. Last I used one, they weren't cheap, but I don't know what the situation is right now. However, finding stuff for VC++ 6 might be a problem.

Remember that you're liable to be getting heap corruption a lot more often than you're going to crash, so be attentive to the problem reports, and fix all heap corruption.

I'd suggest using std::tr1::smart_ptr<> instead of raw pointers everywhere, but I'm not sure VC++ 6 is going to support that.

Why are you still on VC++ 6? Would it be practical to upgrade? The tools are better with the more recent versions, and they fully support smart pointers.

David Thornley
Thanks David, your comments are valuable. Migrating from VC++ 6 is a big decision; it's legacy code, so only a few changes are possible. Yes, in future I would like to use smart pointers :) Thanks.
sat
+1  A: 

Check out the answers to this related question.

The answer I suggested provides a technique which may be able to get you back to the code that is actually causing the heap corruption. My answer describes the technique using gdb but I'm sure you must be able to do something similar on windows.

The principle at least should be the same.

Richard Corden
+1  A: 

When freeing heap memory, the child memory blocks need to be deleted first and then the parent memory block; otherwise the child memory blocks are leaked permanently, which can cause a crash after running the tool millions of times. Example:

double **constQ = new double*[num_equations];
for (int i = 0; i < num_equations; i++)
{
    constQ[i] = new double[num_equations];
    for (int j = 0; j < num_equations; j++)
    {
        constQ[i][j] = 0.0;
    }
}
.
.
.

// Wrong way of freeing the memory: only the parent block is deleted,
// and every child block is leaked.

if (constQ != NULL)
{
    delete[] constQ;
    constQ = NULL;
}

// Correct way of deleting the heap memory: first delete the child blocks,
// then the parent block.

if (constQ != NULL)
{
    for (int i = 0; i < num_equations; i++)
    {
        delete[] constQ[i];
    }
    delete[] constQ;
    constQ = NULL;
}
Red
A: 

If you have access to a *nix machine, you can use Valgrind.

Paul Nathan
+2  A: 

Every time you do something which isn't defined in the language standard, it is undefined behavior, and one of the ways in which it might manifest itself is through heap corruption. There are about three million ways to do this in C++, so it's really impossible to say.

A few common cases are double-freeing dynamically allocated memory, or writing outside the bounds of an array. Or writing to an uninitialized pointer.

Recent versions of Microsoft's compiler add an /analyze compiler switch which performs a bunch of static analysis to catch such errors. On Linux, valgrind is an obvious choice.

Of course, you are using VC6 which has been unsupported for years, and which has a number of known bugs, resulting in invalid code being generated.

If possible, you should upgrade to a proper compiler.

jalf
+2  A: 

An additional debugging tip is to look at the values that are being written to the wrong place using the raw memory view. Is it writing zeros... strings... some other recognizable number pattern? If you can see the corrupted memory at some point after the error occurs, that can provide a hint of the code that caused it.

Also, always set pointers to null after deleting them, even in destructors. Sometimes unexpected things can be triggered during a destructor chain in a base class that cause an access to a partially deleted subclass.

Alan
A: 

The most difficult memory corruption bug I've run into involved calling a function in a DLL that returned a std::vector and then letting that std::vector fall out of scope: which is basically the whole point of std::vector. Unfortunately it turned out that the DLL was linked to one version of the C++ runtime, and the program was linked to another; which meant that the library was calling one version of new[] and I was calling a completely different version of delete[] (looking for the memory in completely different memory arenas to boot).

That is not what's happening here, because that failed every time and according to one of your comments "the bug manifests itself by a crash one in a millionth time." I would guess that there's an if statement that gets taken once in a million times and it causes a double delete bug.

I recently used evaluation versions of two products that may help you: IBM's Rational Purify and Intel Parallel Inspector. I'm sure there are others (Insure++ is mentioned a lot). On Linux you would use Valgrind.

Max Lybbert
Even your `std::vector` bug might, under some circumstances, only lead to a crash once in a million times the vector passes its memory to the wrong heap for deletion. It's the nature of undefined behavior that its outcome is, well, _undefined_.
sbi