This is quite strange to me, but I'm getting an unexpected, random segmentation fault when I launch my program. Sometimes it works, sometimes it crashes. The Dev-C++ debugger points me to a line in the file stl_construct.h:

/**
   * @if maint
   * Constructs an object in existing memory by invoking an allocated
   * object's constructor with an initializer.
   * @endif
   */
  template<typename _T1, typename _T2>
    inline void
    _Construct(_T1* __p, const _T2& __value)
    {
      // _GLIBCXX_RESOLVE_LIB_DEFECTS
      // 402. wrong new expression in [some_]allocator::construct
     -> ::new(static_cast<void*>(__p)) _T1(__value);
    }
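
For context, the marked line is a placement new: it allocates nothing and only runs a constructor at the address in __p, copying from __value. If either of those is invalid, the crash surfaces here even though the bug is elsewhere. A minimal sketch of the same operation, where construct_in_place and build_and_read are hypothetical names made up for illustration:

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <utility>

// Same element type as a std::map<double, int> stores internally.
typedef std::pair<const double, int> Entry;

// Hypothetical helper doing what _Construct does: run T's copy
// constructor at an address the caller has already allocated.
template <typename T, typename U>
T* construct_in_place(void* raw, const U& value) {
    return ::new (raw) T(value);  // placement new: no allocation here
}

// Allocates raw storage, constructs an Entry in it, reads it back,
// then destroys and frees it in the reverse order.
int build_and_read(double distance, int id) {
    void* raw = std::malloc(sizeof(Entry));
    if (raw == 0) throw std::bad_alloc();
    Entry* e = construct_in_place<Entry>(raw, Entry(distance, id));
    int result = e->second;
    e->~Entry();     // placement new needs an explicit destructor call
    std::free(raw);
    return result;
}
```

Because the caller supplies the storage, a plain pointer argument is enough; nothing has to be handed back to the caller.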

I'm using the STL extensively, by the way. What should I do to find the origin of the segfault? Are there any tools that can help? What can cause random crashes like this?

Edit:

My program is around 5000 lines of code. I don't know which piece of code to show in order to get some help, since I have no clue about the origin of the problem. All I got from the debugger is that it has to do with the STL.

Edit:

I moved to Code::Blocks now, here is the call stack:

#0 00464635 std::_Construct<std::pair<double const, int>, std::pair<double const, int> >(__p=0xb543e8, __value=@0x10) (C:/Program Files/CodeBlocks/MinGW/bin/../lib/gcc/mingw32/3.4.5/../../../../include/c++/3.4.5/bits/stl_construct.h:81)
#1 00462306 std::_Rb_tree<double, std::pair<double const, int>, std::_Select1st<std::pair<double const, int> >, std::less<double>, std::allocator<std::pair<double const, int> > >::_M_create_node(this=0x406fe50, __x=@0x10) (C:/Program Files/CodeBlocks/MinGW/bin/../lib/gcc/mingw32/3.4.5/../../../../include/c++/3.4.5/bits/stl_tree.h:367)
#2 00461DA7 std::_Rb_tree<double, std::pair<double const, int>, std::_Select1st<std::pair<double const, int> >, std::less<double>, std::allocator<std::pair<double const, int> > >::_M_clone_node(this=0x406fe50, __x=0x0) (C:/Program Files/CodeBlocks/MinGW/bin/../lib/gcc/mingw32/3.4.5/../../../../include/c++/3.4.5/bits/stl_tree.h:379)
#3 004625C6 std::_Rb_tree<double, std::pair<double const, int>, std::_Select1st<std::pair<double const, int> >, std::less<double>, std::allocator<std::pair<double const, int> > >::_M_copy(this=0x406fe50, __x=0x0, __p=0x406fe54) (C:/Program Files/CodeBlocks/MinGW/bin/../lib/gcc/mingw32/3.4.5/../../../../include/c++/3.4.5/bits/stl_tree.h:1029)
#4 00462A9D std::_Rb_tree<double, std::pair<double const, int>, std::_Select1st<std::pair<double const, int> >, std::less<double>, std::allocator<std::pair<double const, int> > >::_Rb_tree(this=0x406fe50, __x=@0xb59a7c) (C:/Program Files/CodeBlocks/MinGW/bin/../lib/gcc/mingw32/3.4.5/../../../../include/c++/3.4.5/bits/stl_tree.h:559)
#5 0045A928 std::map<double, int, std::less<double>, std::allocator<std::pair<double const, int> > >::map(this=0x406fe50, __x=@0xb59a7c) (C:/Program Files/CodeBlocks/MinGW/bin/../lib/gcc/mingw32/3.4.5/../../../../include/c++/3.4.5/bits/stl_map.h:166)
#6 0040B7E2 VehicleManager::get_vehicles_distances(this=0xb59a50) (C:/Program Files/CodeBlocks/MinGW/projects/AHS/VehicleManager.cpp:232)
#7 00407BDA Supervisor::IsMergeInstruction(id_vehicle=1) (C:/Program Files/CodeBlocks/MinGW/projects/AHS/Supervisor.cpp:77)
#8 00408430 CheckingInstructionsThread(arg=0x476100) (C:/Program Files/CodeBlocks/MinGW/projects/AHS/Supervisor.cpp:264)
#9 00413950 _glfwNewThread@4() (??:??)
#10 75A24911    KERNEL32!AcquireSRWLockExclusive() (C:\Windows\system32\kernel32.dll:??)
#11 00476100    std::__ioinit() (??:??)
#12 0406FFD4    ??() (??:??)
#13 76E5E4B6    ntdll!RtlInitializeNtUserPfn() (C:\Windows\system32\ntdll.dll:??)
#14 00476100    std::__ioinit() (??:??)
#15 70266582    ??() (??:??)
#16 00000000    ??() (??:??)

A few more details:

1/ It's a multi-threaded application.
2/ The method get_vehicles_distances() returns a map.
3/ It's possible that the map isn't initialised by the time it's called by IsMergeInstruction().

Edit:

Apparently the line that is causing the segfault is :

vehicles_distances_.erase(vehicles_distances_.begin(), vehicles_distances_.end());

Here vehicles_distances_ is the map. This line is part of the method VehicleManager::MoveAllVehicles():

void VehicleManager::MoveAllVehicles() {

     vehicles_distances_.erase(vehicles_distances_.begin(), vehicles_distances_.end());

     vector<Vehicle>::iterator iter_end = VehicleManager::vehicles_.end();
     for(vector<Vehicle>::iterator iter = VehicleManager::vehicles_.begin();
     iter != iter_end; ++iter) {

          (*iter).MoveVehicle();

          vehicles_distances_[(*iter).get_vec_vehicle_position().y] = (*iter).get_id_vehicle();

     }

}

What is wrong with that?

Edit:

I tried using map::clear() instead of map::erase(), but the same problem occurs!

Edit:

I think I get it... A thread is trying to use vehicles_distances_ while it's being cleared (?)
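
One way to close that kind of race, sketched here under assumptions: the writer builds the new contents in a private map and publishes them with a swap under a lock, and readers copy the map under the same lock. DistanceTable, rebuild and snapshot are hypothetical names, not the original VehicleManager code. std::mutex is C++11, which the GCC 3.4.5 in the call stack does not provide; there, the threading library's own mutex would play the same role.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the part of VehicleManager that owns the map.
class DistanceTable {
public:
    // Writer side: fill a private map first, then publish it with a
    // swap while holding the lock. Readers never see a half-built map.
    void rebuild(const std::vector<double>& positions) {
        std::map<double, int> fresh;
        for (std::size_t i = 0; i < positions.size(); ++i)
            fresh[positions[i]] = static_cast<int>(i);  // index used as id

        std::lock_guard<std::mutex> guard(mutex_);
        distances_.swap(fresh);
    }

    // Reader side: the copy is made under the same lock, so the map
    // cannot be cleared or swapped while it is being copied.
    std::map<double, int> snapshot() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return distances_;
    }

private:
    mutable std::mutex mutex_;
    std::map<double, int> distances_;
};
```

The lock is held only for the swap and for the copy, so the per-cycle rebuild does not block readers for long.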

Edit:

Problem solved! So it was coming from map::erase(), as expected. I bypassed the problem by creating another map variable with the <key, value> pair inverted, so I can update the map (the key that I need is the distance, but the distance isn't unique since it changes every time, whereas id_vehicle is unique). At the end I just took that map, inverted the <key, value> again and transferred it to the original map, which can be redeclared in each cycle.

Thanks everyone!

+7  A: 

You might try Valgrind as a way to help find the problem. Given the line that you've put in the question, I'd have to guess that you've either corrupted the heap, or you have a stack problem.

Michael Kohne
Valgrind is Linux only, and the OP sounds like he's using Windows.
Adam Rosenfield
What says Windows in the original post? Dev-C++? I'm not familiar with it, but a quick Google says it's Windows or Linux. Is it just that no one on Linux actually uses it?
Michael Kohne
Yeah, sorry, I'm actually using Windows :).
Amokrane
Shoot. OK, so Valgrind is out. I found this thread, and apparently there are some answers: http://lists-archives.org/mingw-users/04129-best-memory-checker.html
Michael Kohne
Problem apparently found. I don't understand why it causes a segfault and don't know how to fix it, since I really need to erase the map in each cycle!
Amokrane
+2  A: 

This is incredibly vague, so it's almost impossible to answer. One obvious suggestion would be to check if you're initializing all your variables. Some compilers will zero out your uninitialized stuff in debug and not do that in release for example, leading to random and unexpected results.

toastie
A: 

The debugger should let you go up the call stack. This way you should be able to see the place in your own code which is causing the seg fault.

Dima
But the problem could have occurred at some time in the past. By the time the actual crash occurs, there may not be any useful information in the stack.
sean riley
+5  A: 

The obvious question would be "what is __p". In the debugger you should be able to look at the call stack. Follow __p back to its origin. Confirm it's the correct size, that it hasn't been deleted, and that it does in fact exist.

If that's not easy, there's always the brute force methods of commenting out random (suspected) code until it works or going back and diffing against a known, working copy.

Doug T.
+1  A: 

What does the stack trace tell you after running the debugger with the core file? Run gdb the normal way:

gdb a.out

then examine the core file

core a.out.core

And take a look at the stack

bt

AndrewB
+8  A: 

First, for the love of all that is, well, lovable, don't use Dev-C++. I wish I knew how people keep running into that piece of junk. It hasn't been maintained for years, and even when it was maintained, it was still a buggy piece of junk that lacked very basic functionality. Ditch it, and go for one of the countless better free alternatives.

Now, onto your question: Your program segfaults randomly because you've done something illegal earlier. Don't do that. ;)

If your program writes out of bounds somewhere, anything might happen. It might hit an unallocated page, in which case you get a segfault. Or it might hit unused data on a page that is allocated to your process, in which case it won't have any practical effect (unless that memory is properly initialized afterwards, overwriting your first, illegal write, and you then try to read from it, expecting the original (invalid) value to still be there). Or it might hit data that's actually in use, in which case you'll get errors later, when the program tries to read that data.

Pretty much the same scenarios exist when reading data. You can be lucky and get a segfault immediately, or you can hit unused and uninitialized memory, and read garbage data out (which will most likely cause an error later, when that data is used), or you can read from memory addresses that are already in use (which will also give you garbage out).

So yes, these errors are tricky to find. The best advice I can give is to 1) sprinkle asserts all over your code to ensure basic invariants are maintained, and 2) step through the program, and at every step, verify that you're not reading or writing to addresses that don't belong to you.

MSVC has a secure SCL option enabled by default which will perform bounds checking on STL code, which can help spotting errors like this.

I believe GCC has an option to do something similar (but it's not enabled by default).
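
For GCC's standard library, the option in question is the libstdc++ debug mode, switched on by defining _GLIBCXX_DEBUG at compile time. Whether it behaves well on a toolchain as old as GCC 3.4.5 is an assumption worth checking. Independently of any compiler option, at() gives checked access and throws instead of reading out of bounds. A minimal sketch, with checked_read_fails as a hypothetical helper:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if reading index i from v was rejected by the bounds
// check. at() throws std::out_of_range, whereas operator[] with the
// same index would be undefined behaviour.
bool checked_read_fails(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

A checked access turns a "random" crash much later into an immediate, reproducible failure at the faulty line.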

Segfaults are nasty. Once people have been bitten by an error like this a few times, they tend to become a lot more disciplined with avoiding accessing memory out of bounds. :)

jalf
I've followed your advice and I'm now using Code::Blocks. Call stack added!
Amokrane
In the "C++ For Dummies" books the author recommends it.
kitchen
+1. On any system (Windows, Unix-like, OpenVMS), the error behind a segfault occurred "a long time ago", before the crash.
Luc M
I think I just found where the problem is, but I don't know how to fix it :(
Amokrane
+1  A: 

This is most probably related to an invalidated iterator: search for places where you iterate over containers and/or keep iterators into containers and remove/insert elements at the same time.
And as everybody else noted - use a call-stack to find the exact spot.
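
As an illustration of the pattern to look for, here is a sketch of erasing from a std::map while iterating over it without leaving the loop iterator dangling. remove_vehicle is a hypothetical function, not code from the question.

```cpp
#include <cassert>
#include <map>

// Removes every entry whose value equals id and returns how many
// entries were removed.
int remove_vehicle(std::map<double, int>& distances, int id) {
    int removed = 0;
    std::map<double, int>::iterator it = distances.begin();
    while (it != distances.end()) {
        if (it->second == id) {
            // Post-increment passes the old position to erase() after
            // `it` has already moved on, so `it` stays valid.
            distances.erase(it++);
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

Writing distances.erase(it) followed by ++it would advance an iterator that erase() has just invalidated, which is the kind of bug that only crashes some of the time.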

Nikolai N Fetissov
+2  A: 

You can use _CrtSetDbgFlag() at the beginning of your program to enable some heap debugging options. See also The CRT Debug Heap. This may help you track down where you're doing bad stuff with memory. It's available in Microsoft's C runtime, which the MinGW compiler links against by default. If you're instead using GNU's C runtime, then you won't be able to go this route.

Adam Rosenfield
+2  A: 

The problem is much more likely to be in your code than in stl_construct.h. I'm assuming that file is part of the STL distribution for Dev-C++. The segmentation fault may be happening in code that was instantiated using the templates in stl_construct.h, but the root cause of the problem is going to be elsewhere. I would try to solve this by getting the stack trace at the time of crash. For each function in the stack trace (especially the ones which are newly written), try to examine the code and look for the following types of possible errors:

  • Uninitialized variables (especially indices used for array access)
  • Memory used after it has been freed or deleted.
  • Array access outside the bounds of the array
  • Memory allocation that is used without being checked for NULL (not a problem if you use new because that doesn't return NULL, it throws an exception)
A. Levy
A: 

It looks like a dangerous function :)

One thing struck me though. Where does the allocated memory go? Intuitively I'd like to have a pointer to a pointer as the first argument and then dereference it. Like this:

template<typename _T1, typename _T2>
    inline void
    _Construct(_T1** __p, const _T2& __value)
    {

       ::new(static_cast<void*>(*__p)) _T1(__value);
    }

Alternatively, a reference to a pointer:

template<typename _T1, typename _T2>
    inline void
    _Construct(_T1*& __p, const _T2& __value)
    {

       ::new(static_cast<void*>(__p)) _T1(__value);
    }
Magnus Skog
+1  A: 

When you end up inside thoroughly tested code that is not yours, the first thing to do is go up the call stack until you reach your own code. That is where you will find the information that caused the problem.

In the call stack the most important location to look is at your code (the caller) and the parameters you passed to the function you called (the callee).

Without this information, we can't help you more. ;-)

More on segmentation faults: http://en.wikipedia.org/wiki/Segmentation_fault
(Must-read, also see the links at "See also" and "External links")

TomWij
Call stack added (see edit notes). So if I understood correctly, the problem might come from VehicleManager::get_vehicles_distances?
Amokrane
+1  A: 

Maps are not very thread friendly. If you perform map operations in thread code you really need to lock all accesses to that map and realize any iterators you may hold could be invalidated.

It seems that the problem is actually coming from there. I'm trying to lock thread access to vehicles_distances_ (the map), but without success :(
Amokrane
I don't use Boost, so it may already have something like this: create a templated wrapper class, something like locked<Type>, which contains a mutex and the data itself. Changing the data type will cause all consuming code to break, which you can use to find every access, add the lock/access/unlock around it, and check that the data is being copied rather than referred to. Does the program run if threading is disabled?
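
A minimal sketch of such a wrapper, assuming C++11 (std::mutex and lambdas, neither of which GCC 3.4.5 provides). Locked and with are hypothetical names, not an existing Boost or standard class.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <utility>

// Hypothetical Locked<T>: owns the data and its mutex, and only hands
// the data out inside a callback that runs while the mutex is held.
template <typename T>
class Locked {
public:
    // Calls f(data) under the lock and returns whatever f returns.
    template <typename F>
    auto with(F f) -> decltype(f(std::declval<T&>())) {
        std::lock_guard<std::mutex> guard(mutex_);
        return f(data_);
    }

private:
    std::mutex mutex_;
    T data_;
};
```

Since data_ is private, code that used to touch the map directly stops compiling, which is how the wrapper exposes every unguarded access. Returning a reference or iterator from the callback would defeat the lock, so callbacks should return copies.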
+1  A: 

Your description and the call stack suggest that the program is crashing during initialization of static variables. This is a common pitfall of C++: there is no simple way to control the order of initialization for static variables defined in different source files.

For example, your program may have object foo that depends on object bar; but it may be that the program is calling the constructor for foo before it constructs bar. That would be bad, and could cause the kind of problem you're describing. (Is CheckingInstructionsThread a static variable that spawns a thread? That could be the problem right there.)

To fix this, you might need to look through your program's .cpp files for static variables (including class statics and globals), especially those that are of some class type. It may or may not help to modify your constructors to write some traces to stderr; use fprintf rather than cerr if you do so.

EDIT: If you're unsure whether it has anything to do with statics, try putting a line like this at the beginning of main():

 fprintf(stderr, "main() entered\n");

This wouldn't rule out static initialization as the cause of the problem; even if it doesn't crash before main(), you could still have data structures being set up incorrectly. But, if you never get to the fprintf, then you know that static initialization is the cause.
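
If static initialization order does turn out to be involved, one common way to sidestep it is to wrap the depended-on object in a function-local static, which is constructed the first time the function is called. A sketch with hypothetical names (registry, Registrar, g_first), not code from the question:

```cpp
#include <cassert>
#include <map>
#include <string>

// Stands in for "bar". The map is constructed on the first call, so any
// caller finds it ready, including constructors of other statics.
std::map<std::string, int>& registry() {
    static std::map<std::string, int> instance;
    return instance;
}

// Stands in for "foo": a type whose constructor depends on the registry.
struct Registrar {
    explicit Registrar(const std::string& name) {
        int next_id = static_cast<int>(registry().size());
        registry()[name] = next_id;
    }
};

// A global whose constructor runs before main(). It is safe because
// registry() builds the map on demand instead of relying on link order.
static Registrar g_first("first");
```

Note that the first call is only guaranteed to be thread-safe from C++11 onwards; with an older compiler, the first call should happen before any threads are started.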

Dan Breslau
When I run it, it displays: main() entered
Amokrane
It's getting past the static initialization step, then; but as I wrote, that doesn't rule static initialization out as the problem. *Do* you have class statics, file statics, or globals in your program? Are there any with non-trivial constructors?
Dan Breslau