views: 157

answers: 3

I have a program that implements several heuristic search algorithms and several domains, designed to experimentally evaluate the various algorithms. The program is written in C++, built using the GNU toolchain, and run on a 64-bit Ubuntu system. When I run my experiments, I use bash's ulimit command to limit the amount of virtual memory the process can use, so that my test system does not start swapping.

Certain algorithm/test instance combinations hit the memory limit I have defined. Most of the time, the program throws a std::bad_alloc exception that goes uncaught; the default handler reports it and the program terminates. Occasionally, though, the program simply segfaults instead.

Why does my program occasionally segfault when out of memory, rather than reporting an unhandled std::bad_alloc and terminating?
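
For reference, a stripped-down sketch of the kind of allocation-heavy loop my runs boil down to (hypothetical, not my actual experiment code):

    // Hypothetical reproducer, not the real experiment code: keep allocating
    // until the virtual memory ulimit is hit. The expected outcome is an
    // uncaught std::bad_alloc reported by the default handler.
    #include <vector>

    int main() {
        std::vector<char*> blocks;
        for (;;) {
            blocks.push_back(new char[64 * 1024 * 1024]);  // 64 MiB per block
        }
    }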

+4  A: 

One reason might be that Linux overcommits memory by default. Requesting memory from the kernel appears to succeed, but later on, when you actually start using the memory, the kernel notices "Oh crap, I'm running out of memory" and invokes the out-of-memory (OOM) killer, which selects some victim process and kills it.

For a description of this behavior, see http://lwn.net/Articles/104185/
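
As a rough, untested sketch (sizes made up) of what that looks like from the program's point of view: the allocation call itself reports success, and the trouble only starts once the pages are actually written to.

    // Untested sketch: with overcommit, a huge allocation can "succeed" even
    // though the machine cannot possibly back it. The failure only shows up
    // when the pages are actually written.
    #include <cstdio>
    #include <cstring>
    #include <new>

    int main() {
        const unsigned long long size = 64ULL * 1024 * 1024 * 1024;  // far more than physical RAM
        char* p = new (std::nothrow) char[size];
        std::printf("allocation %s\n", p ? "succeeded" : "failed");
        if (p) {
            std::memset(p, 1, size);  // touching the pages is what can summon the OOM killer
        }
        delete[] p;
        return 0;
    }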

janneb
Possibly. Some more info: I have been running as the only user on the test system, which has 48 GB of memory. I've been running with a 47 GB virtual memory ulimit, which should leave plenty of physical memory for the OS. The linked-to article is from 2004. Is it still relevant today?
Bradford Larsen
A: 

What janneb said. In fact Linux by default never throws std::bad_alloc (or returns NULL from malloc()).

Nemanja Trifunovic
I assume you mean "std::bad_alloc is never thrown by default on Linux". Why, then, have I seen std::bad_alloc thrown from C++ programs on several Linux systems when the program hits its memory limit?
Bradford Larsen
Also, I think you mean `malloc' rather than `free'. The Linux man page for malloc does not make it sound like NULL will never be returned.
Bradford Larsen
@Bradford. Of course, you are right. Fixed.
Nemanja Trifunovic
As I said, it is the default behavior. Take a look at this thread: http://stackoverflow.com/questions/1592535/operator-new-and-bad-alloc-on-linux/1592545
Nemanja Trifunovic
I thought that even with overcommitting, Linux does fail fast if you ask for more than some upper bound it knows it could never possibly satisfy, or if you ask for more than is left in the virtual memory space (that being quite a rare occurrence on a 64-bit system, I'd think), or more than can be contiguously reserved in address space (again, ain't gonna happen by accident on a 64-bit system). I could be wrong, of course.
Steve Jessop
@Steve: This was a while ago, but IIRC, on my x86_64 desktop with 8 GB of RAM, my simple test program that allocated memory but never touched it was able to allocate around 90 GB before malloc() returned NULL.
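
Roughly along these lines (a reconstruction, not the original program):

    // Reconstruction of the kind of test described above, not the original code:
    // malloc() large blocks without ever touching them and see how far it gets.
    #include <cstdio>
    #include <cstdlib>

    int main() {
        const std::size_t block = 1024UL * 1024 * 1024;  // 1 GiB per request
        std::size_t reserved = 0;
        while (std::malloc(block) != NULL) {  // never written, so overcommit keeps saying yes
            reserved += block;
        }
        std::printf("malloc() first returned NULL after about %zu GiB\n", reserved / block);
        return 0;
    }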
janneb
I didn't say it was a very tight upper bound that it knew it could never possibly satisfy :-). The presumption when over-committing is that applications tend to allocate more memory than they actually use, either because of true wasted space (e.g. exponential array growth) or just because the high-water mark of reserved space is different from the high-water mark of actual use. Think about copying a list into an array, deleting each list node as you do so. At any one time you only use half the memory you reserved. I guess 90 GB was the point at which the kernel figured there was probably a problem.
Steve Jessop
@Steve: Linux's behavior depends on the settings of overcommit and overcommit ratio. Some distros change the default to turn off unlimited overcommit, but set the ratio to 1.5 or 2 times the amount of RAM.
Zan Lynx
+1  A: 

It could be some code using no-throw new and not checking the return value.

Or some code could be catching the exception but neither handling it properly nor rethrowing it.
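
For example, the no-throw case looks roughly like this (made-up names, just an illustration):

    // Illustration only (made-up names): new (std::nothrow) returns a null
    // pointer on failure instead of throwing std::bad_alloc, so if nobody
    // checks the result, the next dereference is a segfault.
    #include <new>

    struct Node {
        int value;
    };

    Node* make_node() {
        Node* n = new (std::nothrow) Node;  // nullptr when allocation fails
        // Missing: if (n == nullptr) handle the failure
        n->value = 42;                      // segfaults here if the allocation failed
        return n;
    }

    int main() {
        Node* n = make_node();
        return n ? 0 : 1;
    }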

Zan Lynx