views: 862
answers: 9
What are the reasons a malloc() would fail, especially in 64 bit?

My specific problem is trying to malloc a huge 10GB chunk of RAM on a 64 bit system. The machine has 12GB of RAM, and 32 GB of swap. Yes, the malloc is extreme, but why would it be a problem? This is in Windows XP64 with both Intel and MSFT compilers. The malloc sometimes succeeds, sometimes doesn't, about 50%. 8GB mallocs always work, 20GB mallocs always fail. If a malloc fails, repeated requests won't work, unless I quit the process and start a fresh process again (which will then have the 50% shot at success). No other big apps are running. It happens even immediately after a fresh reboot.

I could imagine a malloc failing in 32 bit if you have used up the 32 (or 31) bits of address space available, such that there's no address range large enough to assign to your request.

I could also imagine malloc failing if you have used up your physical RAM and your hard drive swap space. This isn't the case for me.

But why else could a malloc fail? I can't think of other reasons.

I'm more interested in the general malloc question than my specific example, which I'll likely replace with memory mapped files anyway. The failed malloc() is just more of a puzzle than anything else... that desire to understand your tools and not be surprised by the fundamentals.
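
(A minimal sketch of the kind of request being described, assuming a 64-bit build where size_t is 64 bits; this is not the OP's actual test harness, just an illustration:)

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* 10GB request; requires a 64-bit build so size_t can hold it */
        size_t size = 10ULL * 1024 * 1024 * 1024;
        void *p = malloc(size);

        if (p == NULL)
            printf("malloc of %llu bytes failed\n", (unsigned long long)size);
        else
            printf("malloc of %llu bytes succeeded\n", (unsigned long long)size);

        free(p);
        return 0;
    }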

+3  A: 

Have you tried using heap functions to allocate your memory instead?
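
(Presumably this means the Win32 heap API; a minimal sketch of that route, assuming HeapCreate/HeapAlloc is what's intended, bypassing the CRT's malloc entirely:)

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        /* Private growable heap (max size 0 = growable), used instead of malloc */
        HANDLE heap = HeapCreate(0, 0, 0);
        SIZE_T size = 10ULL * 1024 * 1024 * 1024;   /* 10GB, for illustration */
        void *p;

        if (heap == NULL) {
            printf("HeapCreate failed: %lu\n", GetLastError());
            return 1;
        }

        p = HeapAlloc(heap, 0, size);
        printf("HeapAlloc %s\n", p ? "succeeded" : "failed");

        if (p) HeapFree(heap, 0, p);
        HeapDestroy(heap);
        return 0;
    }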

bdonlan
A: 

I don't know if it helps, but malloc accepts a size_t argument. Check your limits.h. Maybe the size you are trying to allocate cannot fit in size_t, or the libc attached to your process has been built for 32 bit.
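
(A trivial check along these lines, compiled with the same toolchain as the failing program, would confirm whether size_t really is 64 bits; this sketch is my addition, not part of the original answer:)

    #include <stdio.h>

    int main(void)
    {
        /* If this prints 4, the build is effectively 32-bit and a 10GB
           request cannot even be expressed in size_t. */
        printf("sizeof(size_t) = %u bytes\n", (unsigned)sizeof(size_t));
        return 0;
    }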

Aditya Sehgal
-1, the question states that 8GB malloc calls always succeed. How would that be possible then?
mghie
+6  A: 

malloc tries to allocate a contiguous memory range, and this will initially be in real memory simply due to how swap memory works (at least as far as I remember). It could easily be that your OS sometimes can't find a contiguous 10GB block of memory while still leaving all the processes that require real memory in RAM at the same time (at which point your malloc will fail).

Do you actually require 10GB of contiguous memory, or would you be able to wrap a storage class/struct around several smaller blocks and use your memory in chunks instead? This relaxes the huge contiguity requirement and should also allow your program to use the swap file for less-used chunks.
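
(A minimal sketch of that kind of wrapper, with a hypothetical big_buffer type and an arbitrary 256MB chunk size, neither of which comes from the original answer:)

    #include <stdlib.h>

    #define CHUNK_SIZE (256ULL * 1024 * 1024)   /* 256MB per chunk, an arbitrary choice */

    typedef struct {
        unsigned char **chunks;
        size_t nchunks;
    } big_buffer;

    /* Allocate 'total' bytes as many smaller chunks instead of one huge block. */
    int big_buffer_init(big_buffer *b, unsigned long long total)
    {
        size_t i;
        b->nchunks = (size_t)((total + CHUNK_SIZE - 1) / CHUNK_SIZE);
        b->chunks = calloc(b->nchunks, sizeof *b->chunks);
        if (!b->chunks) return -1;
        for (i = 0; i < b->nchunks; i++) {
            b->chunks[i] = malloc((size_t)CHUNK_SIZE);
            if (!b->chunks[i]) {
                while (i) free(b->chunks[--i]);   /* unwind on failure */
                free(b->chunks);
                return -1;
            }
        }
        return 0;
    }

    /* Address a byte in the logical buffer by splitting the offset into chunk + offset. */
    unsigned char *big_buffer_at(big_buffer *b, unsigned long long offset)
    {
        return &b->chunks[offset / CHUNK_SIZE][offset % CHUNK_SIZE];
    }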

workmad3
Beat me to it by a fraction of a second ;) Quite right about breaking mallocs down into smaller segments; 10GB is a bit ahead of current mainstream PCs.
Shane MacLaughlin
I don't buy it. 10GB of RAM in a 64 bit virtual address space is micropeanuts.
Blank Xavier
Blank is absolutely correct... the address space in 64 bit is much, much, much bigger than a mere 10GB! (Note that there may be a hidden 2^48 address size limit, which is still way bigger than 10GB, which is ~2^33.)
SPWorley
The address *space* wouldn't have any issues. But the available memory won't necessarily fill the address space. AFAIK, memory can't be initially allocated in virtual memory, so if there isn't a way to page out the physical memory to get a 10GB contiguous block, the malloc will fail. Even if that weren't the case, there is only 12GB of RAM and 32GB of swap to address, not the full 64-bit space.
workmad3
@work, any reference for this restriction that mallocs need to be less than the currently unused physical memory size? That does sound like it'd answer the question, but what I don't understand is why such a restriction would exist. Wouldn't that kill many regular (reasonable) mallocs of, say, 100MB if your OS just happened to be keeping a lot of throwaway file cache?
SPWorley
@workmad, this could well be the issue. If the virtual space isn't mapped in such a way that it occurs directly after the end of the physical heap, you wouldn't be able to allocate a block that spans the gap. As you say, I'd guess that memory can't be initially virtual; it has to be real and subsequently swapped.
Shane MacLaughlin
At work now; just tried in Linux 32 and 64 and I can successfully malloc more than physical RAM. Perhaps it's a Windows heap library limitation? Quick test: can someone with a Windows machine that has 1GB or less of RAM try a single 1.1GB malloc?
SPWorley
For linux, you have to be careful about memory overcommit [ http://linux-mm.org/OverCommitAccounting ] when using huge mallocs.
Steve Schnepp
@Arno, this is mainly speculation based on your observed behaviour. I'm also not suggesting that the malloc needs to be less than the current unused physical memory, but less than the amount of contiguous memory the OS can get or free up in physical memory at the time. It may just be a Windows fault as well, since memory management is normally seen as superior on *N*X systems.
workmad3
+4  A: 

Just a guess here, but malloc allocates contiguous memory and you may not have a sufficiently large contiguous section on your heap. Here are a few things I would try:

Where a 20GB malloc fails, do four 5GB mallocs succeed? If so, it is a contiguous space issue (see the sketch after this list).

Have you checked your compiler switches for anything that limits total heap size, or largest heap block size?

Have you tried writing a program that declares a static variable of the required size? If this works you could implement your own heap with big mallocs in that space.
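
(A quick sketch of the first suggestion above, four 5GB requests instead of one 20GB request, assuming a 64-bit build:)

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        size_t each = 5ULL * 1024 * 1024 * 1024;   /* 5GB per block */
        void *blocks[4];
        int i;

        /* If four 5GB blocks succeed where one 20GB block fails,
           the problem is finding a single contiguous region. */
        for (i = 0; i < 4; i++) {
            blocks[i] = malloc(each);
            printf("block %d: %s\n", i, blocks[i] ? "ok" : "failed");
        }
        for (i = 0; i < 4; i++)
            free(blocks[i]);
        return 0;
    }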

Shane MacLaughlin
I don't buy it - there's a 64 bit virtual address space. I can't see how the heap will have trouble finding 10GB contiguous.
Blank Xavier
Possibly not, but if we believe the OP's post this is actually happening. If you can allocate the same amount of memory in smaller blocks, the likelihood is that the OS cannot provide large heap blocks that span real memory and swap space.
Shane MacLaughlin
Your virtual memory isn't limited by the 64-bit address space. It's limited by the size of your main memory and swap file, AFAIK.
Seun Osewa
+1  A: 

But why else could a malloc fail? I can't think of other reasons

As implicitly stated several times already: because of memory fragmentation.

dmityugov
+1  A: 

It is most likely fragmentation. For simplicity, let's use an example.

The memory consists of a single 12kb module. This memory is organised into 1kb blocks by the MMU, so you have 12 x 1kb blocks. Your OS uses 100 bytes, but this is basically the code that manages the page tables, so you cannot swap it out. Then your application also uses 100 bytes.

Now, with just your OS and your application running (200 bytes in total), you are already using 200 bytes of memory (occupying two 1kb blocks), leaving exactly 10kb available for malloc().

Now you malloc() a couple of buffers: A (900 bytes) and B (200 bytes). Then you free A. You now have 9.8kb free, but it is not contiguous. So you try to malloc() C (9kb), and suddenly you fail.

You have 8.9kb contiguous at the tail end and 0.9kb at the front end. You cannot re-map the first block to the end because B straddles the first 1kb and second 1kb blocks.

You can still malloc() a single 8kb block.

Granted, this example is a little contrived, but I hope it helps.

sybreon
+2  A: 

Have you tried using VirtualAlloc() and VirtualFree() directly? This may help isolate the problem.

  • You'll be bypassing the C runtime heap and the NT heap.
  • You can reserve virtual address space and then commit it. This will tell you which operation fails.

If the virtual address space reservation fails (even though it shouldn't, judging from what you've said), Sysinternals VMMap may help explain why. Turn on "Show free regions" to look at how the free virtual address space is fragmented.
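
(A minimal sketch of the reserve-then-commit split, assuming a plain Win32 build; the 10GB size is just for illustration:)

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        SIZE_T size = 10ULL * 1024 * 1024 * 1024;   /* 10GB, for illustration */
        void *p;

        /* Step 1: reserve virtual address space only (no physical/pagefile backing yet). */
        p = VirtualAlloc(NULL, size, MEM_RESERVE, PAGE_NOACCESS);
        if (p == NULL) {
            printf("reserve failed: %lu\n", GetLastError());
            return 1;
        }

        /* Step 2: commit the reserved range (charges it against RAM + pagefile). */
        if (VirtualAlloc(p, size, MEM_COMMIT, PAGE_READWRITE) == NULL) {
            printf("commit failed: %lu\n", GetLastError());
            VirtualFree(p, 0, MEM_RELEASE);
            return 1;
        }

        printf("reserved and committed %llu bytes\n", (unsigned long long)size);
        VirtualFree(p, 0, MEM_RELEASE);
        return 0;
    }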

bk1e
A: 

Please show sources... I know many 64-bit puzzles... :)

Andrey Karpov www.Viva64.com

Viva64 is cool )
RandomNickName42
+1  A: 

Check out my answer here for large Windows allocations; I include a reference to an MS paper on Vista/2008 memory model advancements.

In short, the stock CRT does not support, even for a native 64-bit process, any heap size larger than 4GB. You have to use VirtualAlloc*, CreateFileMapping, or some other analogue.
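
(A minimal sketch of the CreateFileMapping route, backing the region with the pagefile rather than a file on disk; the 10GB size is just for illustration and is not from the original answer:)

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned long long size = 10ULL * 1024 * 1024 * 1024;   /* 10GB */
        HANDLE mapping;
        void *view;

        /* INVALID_HANDLE_VALUE means the section is backed by the pagefile. */
        mapping = CreateFileMapping(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                                    (DWORD)(size >> 32), (DWORD)(size & 0xFFFFFFFF),
                                    NULL);
        if (mapping == NULL) {
            printf("CreateFileMapping failed: %lu\n", GetLastError());
            return 1;
        }

        /* Map the whole section into this process's address space. */
        view = MapViewOfFile(mapping, FILE_MAP_ALL_ACCESS, 0, 0, 0);
        printf("MapViewOfFile %s\n", view ? "succeeded" : "failed");

        if (view) UnmapViewOfFile(view);
        CloseHandle(mapping);
        return 0;
    }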

Oh, I also noticed you are claiming that your larger allocations are actually succeeding. This is actually incorrect: you are misinterpreting the malloc(0x200000000) (that's 8GB in hex). What is happening is that you are requesting a 0-byte allocation due to a cast or some other effect of your test harness; you are most definitely not observing anything larger than 0xfffff000 bytes of heap being committed. You are simply seeing integer overflows from down-casting.

WORD TO THE WISE, or TIPS TO SAVE YOUR HEAP SANITY

THE ONLY WAY TO ALLOCATE MEMORY WITH MALLOC (OR ANY OTHER DYNAMIC REQUEST)

void *foo = malloc(SIZE);

THE VALUE OF A DYNAMIC MEMORY REQUEST MUST NEVER (I CANNOT STRESS THAT ENOUGH) BE CALCULATED WITHIN THE "()" PARENS OF THE REQUEST

mytype *foo = (mytype *) malloc(sizeof(mytype) * 2);   /* the pattern to avoid */

The danger is that an integer overflow will occur.

It is always a coding ERROR to perform arithmetic at the time of the call; you MUST ALWAYS calculate the TOTAL SUM of data to be requested before the statement which evaluates the request.

Why is it so bad? We know this is a mistake because, at the point where a request is made for dynamic resources, there must also be a point in the future where we will use that resource.

To use what we have requested, we must know how large it is (e.g. the array count, the type size, etc.).

This means that if we ever see any arithmetic at all inside the () of a resource request, it is an error, because we would have to duplicate that calculation again in order to use the data appropriately.
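
(A minimal sketch of the pattern being advocated: compute and check the size before the call, with nothing calculated inside the malloc() parens. The mytype struct here is a hypothetical example, not from the original answer:)

    #include <stdint.h>
    #include <stdlib.h>

    typedef struct { double x, y; } mytype;   /* hypothetical element type */

    /* Compute the total size up front, check it for overflow, and only then
       make the request, with no arithmetic inside the malloc() parens. */
    mytype *alloc_array(size_t count)
    {
        size_t total;
        if (count > SIZE_MAX / sizeof(mytype))
            return NULL;                        /* request would wrap around size_t */
        total = count * sizeof(mytype);
        return malloc(total);
    }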

RandomNickName42
-1. Wrong on two points. (1) You claim you cannot allocate more than 4GB of memory with the CRT on Windows. Not on Win32, but you can on Win64. A simple experiment in VS2008 confirms that. Took me 5 minutes to check (write the program, compile, step into the CRT to check the values and watch what the internal implementation does). (2) You claim that his size is being rounded down when in actual fact it is not. The input value (size) to malloc is specified as size_t, not as int. Thus on 64-bit Windows you can specify any valid 64-bit size. 8GB or 20GB are well within that range.
Stephen Kellett
Stephen: Paste your broken test case. I'll show you where you went wrong.
RandomNickName42