I have been reading about out-of-memory conditions on Linux, and the following paragraph from the man pages got me thinking:

By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available. This is a really bad bug. In case it turns out that the system is out of memory, one or more processes will be killed by the infamous OOM killer. [...]

Considering that the operator new implementation will end up calling malloc at some point, are there any guarantees that new will actually throw on Linux? If there aren't, how does one handle this apparently undetectable error situation?

+15  A: 

It depends; you can configure the kernel's overcommit settings using the `vm.overcommit_memory` sysctl.
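A minimal sketch of checking the current policy at runtime (assuming a Linux `/proc` filesystem; the mode meanings come from the kernel's overcommit accounting documentation):

```
#include <fstream>
#include <iostream>

int main() {
    // Read the current overcommit policy from procfs.
    // 0 = heuristic overcommit (the default), 1 = always overcommit,
    // 2 = strict accounting (requests beyond swap + a fraction of RAM fail).
    std::ifstream f("/proc/sys/vm/overcommit_memory");
    int mode = -1;
    f >> mode;
    std::cout << "vm.overcommit_memory = " << mode << '\n';
    return 0;
}
```

Writing a value back to that file (as root, or via `sysctl`) changes the policy system-wide.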

Herb Sutter discussed a few years ago how this behavior is actually nonconforming to the C++ standard:

"On some operating systems, including specifically Linux, memory allocation always succeeds. Full stop. How can allocation always succeed, even when the requested memory really isn't available? The reason is that the allocation itself merely records a request for the memory; under the covers, the (physical or virtual) memory is not actually committed to the requesting process, with real backing store, until the memory is actually used.

"Note that, if new uses the operating system's facilities directly, then new will always succeed but any later innocent code like buf[100] = 'c'; can throw or fail or halt. From a Standard C++ point of view, both effects are nonconforming, because the C++ standard requires that if new can't commit enough memory it must fail (this doesn't), and that code like buf[100] = 'c' shouldn't throw an exception or otherwise fail (this might)."

James McNellis
`buf[100] = 'c'` won't fail or throw an exception. The application might be killed by the system around that time, but the application might be killed by the system at *any* time.
caf
+3  A: 

I think malloc can still return NULL. The reason is that there is a difference between the memory available to the system (RAM + swap) and the size of your process's address space.

For example, if you ask malloc for 3 GB of memory on a standard 32-bit x86 Linux, it will surely return NULL, since that exceeds the address space available to user-space applications.
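A quick way to see this (a sketch, assuming the program is compiled as a 32-bit binary, e.g. with `-m32`):

```
#include <cstdio>
#include <cstdlib>

int main() {
    // In a 32-bit process, a single 3 GiB request exceeds the contiguous
    // address space left to user code, so malloc returns NULL here
    // regardless of the overcommit policy.
    void* p = std::malloc(3UL * 1024 * 1024 * 1024);
    std::printf("malloc(3 GiB) returned %p\n", p);
    std::free(p);
    return 0;
}
```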

Evan Teran
That's what I thought, but I've seen some online reports where someone allocated 1 TB of memory and, unfortunately, it succeeded.
rpg
I would say that was clearly a 64-bit machine. Even if there was no way he had that much memory available, the address space of the process is bigger than 1 TB.
Evan Teran
+7  A: 

You can't handle it in your software, pure and simple.

Your application will receive a perfectly valid pointer. Once you try to access it, it will generate a page fault; the kernel will then try to allocate a physical page for it, and if it can't ... boom.

But as you can see, all of this happens inside the kernel; your application cannot see it. If it's a critical system, you can disable overcommit altogether (set `vm.overcommit_memory` to 2).

246tNt
+1  A: 

Yes, there is one guarantee that new will eventually throw. Regardless of overcommit, the amount of address space is limited. So if you keep allocating memory, sooner or later you will run out of address space and new will be forced to throw.
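A sketch of the experiment (with 1 MiB blocks, a 32-bit process runs out of address space after a few thousand allocations, matching the numbers in the comments below; a 64-bit process can run much longer and may meet the OOM killer before `bad_alloc`):

```
#include <cstdio>
#include <new>
#include <vector>

int main() {
    // Keep requesting 1 MiB blocks until operator new throws.
    std::vector<char*> blocks;
    try {
        for (;;)
            blocks.push_back(new char[1024 * 1024]);
    } catch (const std::bad_alloc&) {
        std::printf("std::bad_alloc after %zu allocations\n", blocks.size());
    }
    for (char* b : blocks)
        delete[] b;
    return 0;
}
```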

R Samuel Klatchko
Can you back this up? Should be easy on 32-bit, even with PAE.
Jed Smith
@Jed - are you looking for more in-depth reasoning, or do you want an example that demonstrates this? If the latter, just keep calling operator new in an infinite loop. With the libstdc++ from both gcc 3.4.5 and gcc 4.4.0, allocating 1 MB buffers, a bad_alloc exception was thrown after 3,055 calls.
R Samuel Klatchko
+1  A: 

Forgive me if I'm wrong, but wouldn't trying to zero out the allocated memory be enough to guarantee that you have every single byte you requested? Or even just writing to the last byte: wouldn't it throw an exception if the memory wasn't really yours?

If that's true, you could just try writing to the last (and first?) byte of the memory and see if it works, and if it doesn't, you could return NULL from malloc.
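A literal sketch of that idea (the `probing_malloc` wrapper below is hypothetical, not a standard facility; the comments that follow explain why this is neither sufficient nor recoverable):

```
#include <cstdlib>

// Probe the first and last byte of the block before handing it out.
void* probing_malloc(std::size_t size) {
    char* p = static_cast<char*>(std::malloc(size));
    if (p != nullptr && size > 0) {
        p[0] = 0;         // touch the first byte
        p[size - 1] = 0;  // touch the last byte
        // If the kernel cannot back these pages, the process is killed
        // right here; there is no error path on which to return NULL.
    }
    return p;
}
```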

Blindy
You'd need to touch every page, at least. Just touching the first and last pages isn't enough, as memory is only truly allocated on a page-by-page basis (at least on most sane machines).
Steven Schlansker
Steven is correct that you have to touch every page. However, the problem with this approach in general is that if commitment of the backing store fails partway through, you can't do anything about it: your application will crash with a segmentation fault when it tries to write to memory it doesn't actually own. You can't recover from a segmentation fault (you can sometimes handle the `SIGSEGV` signal, but you definitely can't continue execution after handling it).
James McNellis
It won't fail with a segmentation fault. A target of the OOM killer dies with `SIGKILL`; you shouldn't consider this any differently from being killed manually by the superuser.
caf