views:

3723

answers:

16

So, I need some help. I am working on a project in C++. However, I think I have somehow managed to corrupt my heap. This is based off the fact that I added a std::string to a class and assigning it a value from another std::string:

std::string hello = "Hello, world.\n";
/* exampleString = "Hello, world.\n" would work fine. */
exampleString = hello;

crashes on my system with a stack dump. So basically I need to stop and go through all my code and memory management stuff and find out where I've screwed up. The codebase is still small (about 1000 lines), so this is easily do-able.

Still, I'm over my head with this kind of stuff, so I thought I'd throw it out there. I'm on a Linux system, and have poked around with valgrind, and while not knowing completely what I'm doing, it did report that the std::string's destructor was an invalid free. I have to admit to getting the term 'Heap Corruption' from a Google search; any general purpose articles on this sort of stuff would be appreciated as well.

(In before rm -rf ProjectDir, do again in C# :D)

EDIT: I haven't made it clear, but what I'm asking for are ways an advice of diagnosing these sort of memory problems. I know the std::string stuff is right, so it's something I've done (or a bug, but there's Not A Problem With Select). I'm sure I could chuck the code I've written up and you very smart folks would see the problem in no time, but I want to add this kind of code analysis to my 'toolbox', as it were.

A: 

As far as I can tell your code is correct. Assuming exampleString is an std::string that has class scope like you describe, you ought to be able to initialize/assign it that way. Perhaps there is some other issue? Maybe a snippet of actual code would help put it in context.

Question: Is exampleString a pointer to a string object created with new?

JimDaniel
A: 

It could be heap corruption, but it's just as likely to be stack corruption. Jim's right. We really need a bit more context. Those two lines of source don't tell us much in isolation. There could be any number of things causing this (which is the real joy of C/C++).

If you're comfortable posting your code, you could even throw all of it up on a server and post a link. I'm sure you'd gets lots more advice that way (some of it undoubtedly unrelated to your question).

Derek Park
A: 

Your code as I can see has no errors. As has been said more context is needed.

If you haven't already tried, install gdb (the gcc debugger) and compile the program with -g. This will compile in debugging symbols which gdb can use. Once you have gdb installed run it with the program (gdb ). This is a useful cheatsheat for using gdb.

Set a breakpoint for the function that is producing the bug, and see what the value of exampleString is. Also do the same for whatever parameter you are passing to exampleString. This should at least tell you if the std::strings are valid.

I found the answer from this article to be a good guide about pointers.

roo
A: 

The code was simply an example of where my program was failing (it was allocated on the stack, Jim). I'm not actually looking for 'what have I done wrong', but rather 'how do I diagnose what I've done wrong'. Teach a man to fish and all that. Though looking at the question, I haven't made that clear enough. Thank goodness for the edit function. :')

Also, I actually fixed the std::string problem. How? By replacing it with a vector, compiling, then replacing the string again. It was consistently crashing there, and that fixed even though it...couldn't. There's something nasty there, and I'm not sure what. I did want to check the one time I manually allocate memory on the heap, though:

 this->map = new Area*[largestY + 1];
 for (int i = 0; i < largestY + 1; i++) {
     this->map[i] = new Area[largestX + 1];
 }

and deleting it:

for (int i = 0; i < largestY + 1; i++) {
    delete [] this->map[i];
}
delete [] this->map;

I haven't allocated a 2d array with C++ before. It seems to work.

Bernard
+5  A: 

Oh, if you want to know how to debug the problem, that's simple. First, get a dead chicken. Then, start shaking it.

Seriously, I haven't found a consistent way to track these kinds of bugs down. Because there's so many potential problems, there's not a simple checklist to go through. However, I would recommend the following:

  1. Get comfortable in a debugger.
  2. Start tromping around in the debugger to see if you can find anything that looks fishy. Check especially to see what's happening during the exampleString = hello; line.
  3. Check to make sure it's actually crashing on the exampleString = hello; line, and not when exiting some enclosing block (which could cause destructors to fire).
  4. Check any pointer magic you might be doing. Pointer arithmetic, casting, etc.
  5. Check all of your allocations and deallocations to make sure they are matched (no double-deallocations).
  6. Make sure you aren't returning any references or pointers to objects on the stack.

There are lots of other things to try, too. I'm sure some other people will chime in with ideas as well.

Derek Park
A: 

Also, I actually fixed the std::string problem. How? By replacing it with a vector, compiling, then replacing the string again. It was consistently crashing there, and that fixed even though it...couldn't. There's something nasty there, and I'm not sure what.

That sounds like you really did shake a chicken at it. If you don't know why it's working now, then it's still broken, and pretty much guaranteed to bite you again later (after you've added even more complexity).

Derek Park
A: 

@Derek:
I know! That's exactly why I want to (quote)stop(endquote) and review the hell out of the code.

Bernard
+1  A: 

Some places to start:

If you're on windows, and using visual C++6 (I hope to god nobody still uses it these days) it's implentation of std::string is not threadsafe, and can lead to this kind of thing.

Here's an article I found which explains a lot of the common causes of memory leaks and corruption.

At my previous workplace we used Compuware Boundschecker to help with this. It's commercial and very expensive, so may not be an option.

Here's a couple of free libraries which may be of some use

http://www.codeguru.com/cpp/misc/misc/memory/article.php/c3745/

http://www.codeproject.com/KB/cpp/MemLeakDetect.aspx

Hope that helps. Memory corruption is a sucky place to be in!

Orion Edwards
A: 

@Bernard - sounds like you should do a completely clean build more often. If your compiler is getting confused by what it has compiled and what has not, you should stop playing its game - turn off any incremental compilation / incremental linking options, and do clean builds often. That will reduce the mysterious/weird crash bugs a lot.

Valters Vingolds
A: 

Run Purify.

It is a near-magical tool that will report when you are clobbering memory you shouldn't be touching, leaking memory by not freeing things, double-freeing, etc.

It works at the machine code level, so you don't even have to have the source code.

One of the most enjoyable vendor conference calls I was ever on was when Purify found a memory leak in their code, and we were able to ask, "is it possible you're not freeing memory in your function foo()" and hear the astonishment in their voices.

They thought we were debugging gods but then we let them in on the secret so they could run Purify before we had to use their code. :-)

http://www-306.ibm.com/software/awdtools/purify/unix/

(It's pretty pricey but they have a free eval download)

Mark Harrison
A: 

@Mark:
Looks cool! I'll try the eval download (holy crap their website is bad!). Far too expensive for me, as this is just a personal project.

Bernard
A: 

One of the debugging techniques that I use frequently (except in cases of the most extreme weirdness) is to divide and conquer. If your program currently fails with some specific error, then divide it in half in some way and see if it still has the same error. Obviously the trick is to decide where to divide your program!

Your example as given doesn't show enough context to determine where the error might be. If anybody else were to try your example, it would work fine. So, in your program, try removing as much of the extra stuff you didn't show us and see if it works then. If so, then add the other code back in a bit at a time until it starts failing. Then, the thing you just added is probably the problem.

Note that if your program is multithreaded, then you probably have larger problems. If not, then you should be able to narrow it down in this way. Good luck!

Greg Hewgill
+8  A: 

These are relatively cheap mechanisms for possibly solving the problem:

  1. Keep an eye on my heap corruption question - I'm updating with the answers as they shake out. The first was balancing new[] and delete[], but you're already doing that.
  2. Give valgrind more of a go; it's an excellent tool, and I only wish it was available under Windows. I only slows your program down by about half, which is pretty good compared to the Windows equivalents.
  3. Think about using the Google Performance Tools as a replacement malloc/new.
  4. Have you cleaned out all your object files and started over? Perhaps your make file is... "suboptimal"
  5. You're not assert()ing enough in your code. How do I know that without having seen it? Like flossing, no-one assert()s enough in their code. Add in a validation function for your objects and call that on method start and method end.
  6. Are you compiling -wall? If not, do so.
  7. Find yourself a lint tool like PC-Lint. A small app like yours might fit in the PC-lint demo page, meaning no purchase for you!
  8. Check you're NULLing out pointers after deleteing them. Nobody likes a dangling pointer. Same gig with declared but unallocated pointers.
  9. Stop using arrays. Use a vector instead.
  10. Don't use raw pointers. Use a smart pointer. Don't use auto_ptr! That thing is... surprising; its semantics are very odd. Instead, choose one of the Boost smart pointers, or something out of the Loki library.
Josh
+1, good list! However, I'd dispute #8 - while it prevents 'bad' accesses, it's actually a code smell that hides poor logic or poor object lifetime management in my experience...
Roddy
A: 

Hey Josh, thanks for all the advice. Regarding arrays, that's the only one I use. I'm using it as a map, so I need to use coordinates with it. There's probably still a better way (a multimap, perhaps?), but rest well at night knowing it's the only array I use. :')

Bernard
A: 

Other than tools like Boundschecker or Purify, your best bet at solving problems like this is to just get really good at reading code and become familiar with the code that you're working on.

Memory corruption is one of the most difficult things to troubleshoot and usually these types of problems are solved by spending hours/days in a debugger and noticing something like "hey, pointer X is being used after it was deleted!".

If it helps any, it's something you get better at as you gain experience.

Your memory allocation for the array looks correct, but make sure you check all the places where you access the array too.

17 of 26
+4  A: 

We once had a bug which eluded all of the regular techniques, valgrind, purify etc. The crash only ever happened on machines with lots of memory and only on large input data sets.

Eventually we tracked it down using debugger watch points. I'll try to describe the procedure here:

1) Find the cause of the failure. It looks from your example code, that the memory for "exampleString" is being corrupted, and so cannot be written to. Let's continue with this assumption.

2) Set a breakpoint at the last known location that "exampleString" is used or modified without any problem.

3) Add a watch point to the data member of 'exampleString'. With my version of g++, the string is stored in _M_dataplus._M_p. We want to know when this data member changes. The GDB technique for this is:

(gdb) p &exampleString._M_dataplus._M_p
$3 = (char **) 0xbfccc2d8
(gdb)  watch *$3
Hardware watchpoint 1: *$3

I'm obviously using linux with g++ and gdb here, but I believe that memory watch points are available with most debuggers.

4) Continue until the watch point is triggered:

Continuing.
Hardware watchpoint 2: *$3

Old value = 0xb7ec2604 ""
New value = 0x804a014 ""
0xb7e70a1c in std::string::_M_mutate () from /usr/lib/libstdc++.so.6
(gdb) where

The gdb where command will give a back trace showing what resulted in the modification. This is either a perfectly legal modification, in which case just continue - or if you're lucky it will be the modification due to the memory corruption. In the latter case, you should now be able to review the code that is really causing the problem and hopefully fix it.

The cause of our bug was an array access with a negative index. The index was the result of a cast of a pointer to an 'int' modulos the size of the array. The bug was missed by valgrind et al. as the memory addresses allocated when running under those tools was never "> MAX_INT" and so never resulted in a negative index.

Richard Corden