About once every three times I run my program, malloc reports a double free error; e.g.

myprogram(703,0xb06d9000) malloc: *** error for object 0x17dd0240: double free
*** set a breakpoint in malloc_error_break to debug

I've run the same code through valgrind more than a dozen times but it never reports a double free.

I ran the code through gdb with a breakpoint on malloc_error_break and, when the bug occurs, the error is always reported in a standard C++ library function. I isolated the parent function and ran it under valgrind in a unit test, but it reported no errors.

I don't think the parent function or the standard C++ library is to blame; it is simply freeing something it allocated that some other function in the parent program had already freed.

I've tried looking up which object is double freed, but my gdb skills aren't up to finding where the object was freed the first time. Please help me find which object caused the first free, and any insight into why my program generates this error would also be welcome. Thank you.

The parent function boils down to:

int i;
double px, py;
int start, finish;
std::string comment;
std::vector<double> x, y;

std::fstream myfile;
myfile.open("filename.txt", std::ios_base::in);

// Read header

std::getline(myfile, comment);

// Read data

while(!myfile.eof())
{
  myfile >> comment >> start >> comment >> finish;

  for(i = 0; i <= finish-start; i++)
  {
    myfile >> px >> py;  // double free here

    x.push_back(px);
    y.push_back(py);
  }
}

EDIT: My data file is something like this:

Comment: My Data
start 33 end 36
10.2 139.0076
9.22616 141.584
8.62802 141.083
8.87098 141.813
start 33 end 35
300.354 405
301.698 404.029
303.369 403.953
start 33 end 35
336.201 148.07
334.616 147.243
334.735 146.09

The backtrace from gdb is

(gdb) backtrace
#0  0x93c2d4a9 in malloc_error_break ()
#1  0x93c28497 in szone_error ()
#2  0x93b52503 in szone_free ()
#3  0x93b5236d in free ()
#4  0x93b51f24 in localeconv_l ()
#5  0x93c18163 in strtod_l$UNIX2003 ()
#6  0x93c192e0 in strtod$UNIX2003 ()
#7  0x919b76e8 in std::__convert_to_v<double> ()
#8  0x919983cf in std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > >::do_get ()
#9  0x91991671 in std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > >::get ()
#10 0x9198d2dc in std::istream::operator>> ()

Just to reiterate, I need help finding which object was freed the first time. I'm not so interested in refactoring the code for this function, which I don't believe is causing the problem, unless you can find something catastrophic in it.

EDIT: Changed the example code.

A: 

Try running the program under gdb, continue (cont) to the crash point, then print the backtrace (type bt); see if that helps you pin down where the problem is. Note that you have to compile your program in debugging mode (g++ -g) to get a legible backtrace.

EDIT: On most machines, when you free/delete a memory location, the pointer you're freeing is not set to NULL. At the point where you're freeing the memory for the second time, try adding a "= NULL", i.e.:

delete myPointer;
myPointer = NULL;

That does NOT fix the problem; however, it will tell you whether the first free()/delete happens on that exact same line (but on a previous execution, say, if you're in a loop).
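As a minimal sketch of that diagnostic, the delete and the reset can be wrapped in one helper so the two steps are never separated. The name delete_and_reset is hypothetical and not part of any library. It relies on the fact that deleting a null pointer is a no-op, so it only hides a repeated delete through the same pointer variable; a second pointer that aliases the same object would still dangle.

```cpp
// Hypothetical helper (not from the question): frees the object and
// resets the caller's pointer variable in one step.
template <typename T>
void delete_and_reset(T*& ptr)
{
    delete ptr;     // deleting a null pointer is a no-op
    ptr = nullptr;  // a repeated call on the same variable is now harmless
}
```

If the double free disappears after routing the suspect delete through such a helper, the first free was most likely the same statement on an earlier pass.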

By the way, your code snippet doesn't contain any dynamically allocated memory (apart from std::string and std::vector, which allocate memory dynamically internally).

Lie Ryan
When I set the breakpoint on malloc I ran to the point of the error and found where the second free is occurring. It doesn't help me find the first instance of free on this memory location.
koan
A: 

Have you taken a look into what is the value of start and finish and whether the file has enough contents to fill in vectors x and y?

A better approach would be to refactor the looping logic: you should break out the moment you hit EOF or a read fails. At the moment you are leaving that to faith.
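One way that refactoring could look is sketched below. The function name read_points is hypothetical, and the "start A end B" layout is assumed from the sample data in the question. The extraction itself serves as the loop condition, so a failed read (including end of file) stops the loop instead of reusing stale values.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads "start A end B" blocks, each followed by
// (B - A + 1) coordinate pairs, and stops as soon as any extraction fails.
// Returns the number of points read.
std::size_t read_points(std::istream& in,
                        std::vector<double>& x,
                        std::vector<double>& y)
{
    std::string header, label;
    int start = 0, finish = 0;
    double px = 0.0, py = 0.0;

    std::getline(in, header);  // skip the comment line, as in the question

    // Test the stream after each read rather than calling eof() beforehand.
    while (in >> label >> start >> label >> finish) {
        for (int i = 0; i <= finish - start; ++i) {
            if (!(in >> px >> py)) {
                return x.size();  // truncated block: stop reading
            }
            x.push_back(px);
            y.push_back(py);
        }
    }
    return x.size();
}
```

It takes a std::istream so it can be fed from a std::ifstream or, for testing, from a std::istringstream.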

Fanatic23
I simplified the code for clarity.
koan
+1  A: 

This line:

myfile >> comment

will only read "Comment:" and not the entire line "Comment: My Data". The next thing read from myfile after that line will be "My", which will likely cause problems.

In particular, the first time through the outer loop, it will attempt to read the string "Data" into start, and will be unable to do so (since it can't be parsed as an integer). So the statement:

myfile >> comment >> start >> comment >> finish;

will abort and both start and finish will be left uninitialized.

Depending on the (arbitrary) uninitialized values of start and finish, this could easily make your inner loop infinite. Inserting an infinite number of elements into a vector is likely to lead to strange behavior, although I don't see the crash that you do... it just runs for a very long time and I kill it because I run out of patience.

However, when I work around this bug by removing "My Data" from the first line, I can run your program 10,000 times without a crash.
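The failed extraction described above can be reproduced in isolation with a string stream. This is only a sketch; parse_range is a hypothetical name. It shows that once reading an int hits non-numeric text, the stream enters a failed state and the remaining extractions in the chain do nothing, so the caller has to check the stream before trusting the values.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: tries to parse "<word> <int> <word> <int>" from text
// and reports whether every extraction succeeded.
bool parse_range(const std::string& text, int& start, int& finish)
{
    std::istringstream in(text);
    std::string label;
    in >> label >> start >> label >> finish;
    return !in.fail();  // false if any extraction in the chain failed
}
```

Called with "My Data" it returns false, because "Data" cannot be parsed as an integer; called with "start 33 end 36" it returns true and fills in both values.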

Tyler McHenry
Thanks for your help, unfortunately this is the problem with making an example from your code: I am in fact using std::getline() to get that comment and I have now changed the example to reflect that. Apart from the double free error, my data does seem to be read perfectly fine
koan
Do you actually get the crash when running this example code? If not, then the example is pretty much useless.
Tyler McHenry
As I said right from the start, I isolated this code in a test unit and it does not cause any errors; I only included it for completeness. What I am looking for is help with gdb to find which object is double freed.
koan
+1  A: 

You appear to be using Mac OSX (you should have divulged that fact :-)

There are several environment variables which can help you debug heap corruption.

In particular, MallocStackLoggingNoCompact looks very promising.

Here is what I see:

$ cat t.c
#include <stdlib.h>  /* free */
#include <string.h>  /* strdup */

int main()
{
  char *p = strdup("hello");
  free(p);
  free(p);
  return 0;
}

$ gdb ./a.out
GNU gdb 6.3.50-20050815 (Apple version gdb-967) (Tue Jul 14 02:11:58 UTC 2009)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-apple-darwin"...Reading symbols for shared libraries ... done

(gdb) set env MallocStackLoggingNoCompact 1
(gdb) b malloc_error_break
Breakpoint 1 at 0x13f44a9
(gdb) r
Starting program: /Users/emp-russian/a.out 
bash(22634) malloc: recording malloc stacks to disk using standard recorder
bash(22634) malloc: stack logging compaction turned off; size of log files on disk can increase rapidly
bash(22634) malloc: process 22536 no longer exists, stack logs deleted from /tmp/stack-logs.22536.a.out.8D3VZO
bash(22634) malloc: stack logs being written into /tmp/stack-logs.22634.bash.kjFTGa
arch(22634) malloc: recording malloc stacks to disk using standard recorder
arch(22634) malloc: stack logging compaction turned off; size of log files on disk can increase rapidly
arch(22634) malloc: stack logs deleted from /tmp/stack-logs.22634.bash.kjFTGa
arch(22634) malloc: stack logs being written into /tmp/stack-logs.22634.arch.8L8iLX
Reading symbols for shared libraries ++. done
Breakpoint 1 at 0x909b54a9
a.out(22634) malloc: recording malloc stacks to disk using standard recorder
a.out(22634) malloc: stack logging compaction turned off; size of log files on disk can increase rapidly
a.out(22634) malloc: stack logs deleted from /tmp/stack-logs.22634.arch.8L8iLX
a.out(22634) malloc: stack logs being written into /tmp/stack-logs.22634.a.out.s1qQRw
a.out(22634) malloc: *** error for object 0x100080: double free
*** set a breakpoint in malloc_error_break to debug

Breakpoint 1, 0x909b54a9 in malloc_error_break ()
(gdb) shell ls -l /tmp/stack-logs.22634.a.out.s1qQRw
total 16
-rw-------  1 emp-russian  wheel   96 Sep 12 09:42 stack-logs.index
-rw-------  1 emp-russian  wheel  208 Sep 12 09:42 stack-logs.stacks

(gdb) shell malloc_history 22634 0x100080

This first part of the history we don't actually care about:

Call [2] [arg=24]: thread_a0103720 |_dyld_start | dyldbootstrap::start(mach_header const*, int, char const**, long) | dyld::_main(mach_header const*, unsigned long, int, char const**, char const**, char const**) | dyld::initializeMainExecutable() | ImageLoader::runInitializers(ImageLoader::LinkContext const&) | ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) | ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) | ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) | ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) | libSystem_initializer | __keymgr_initializer | _dyld_register_func_for_add_image | dyld::registerAddCallback(void (*)(mach_header const*, long)) | dwarf2_unwind_dyld_add_image_hook | calloc | _malloc_initialize | malloc_set_zone_name | malloc_zone_malloc | __disk_stack_logging_log_stack | reap_orphaned_log_files | opendir$INODE64$UNIX2003 | __opendir2$INODE64$UNIX2003 | telldir$INODE64$UNIX2003 | malloc | malloc_zone_malloc 
Call [4] [arg=0]: thread_a0103720 |_dyld_start | dyldbootstrap::start(mach_header const*, int, char const**, long) | dyld::_main(mach_header const*, unsigned long, int, char const**, char const**, char const**) | dyld::initializeMainExecutable() | ImageLoader::runInitializers(ImageLoader::LinkContext const&) | ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) | ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) | ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) | ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) | libSystem_initializer | __keymgr_initializer | _dyld_register_func_for_add_image | dyld::registerAddCallback(void (*)(mach_header const*, long)) | dwarf2_unwind_dyld_add_image_hook | calloc | _malloc_initialize | malloc_set_zone_name | malloc_zone_malloc | __disk_stack_logging_log_stack | reap_orphaned_log_files | closedir$UNIX2003 | _reclaim_telldir | free | malloc_zone_free

But here is the interesting stuff:

Call [2] [arg=6]: thread_a0103720 |0x1 | start | main | strdup | malloc | malloc_zone_malloc 
Call [4] [arg=0]: thread_a0103720 |0x1 | start | main | free | malloc_zone_free 
Call [4] [arg=0]: thread_a0103720 |0x1 | start | main | free | malloc_zone_free 
Employed Russian
I wasn't hiding that I am using OSX, I just didn't think it was relevant: this was a "how to debug" question; besides isn't OSX just another Un*x ? :) Thanks to your tip, I can now see that a system thread is freeing the same memory. It looks like I will have to work around this problem. Thanks.
koan