Hi folks,

We have a Linux system (Kubuntu 7.10) that runs a number of CORBA server processes. The server software uses the glibc allocator for memory allocation. The Linux PC has 4 GB of physical memory. Swap is disabled for speed reasons.

Upon receiving a request to process data, one of the server processes allocates a large data buffer (using the standard C++ operator 'new'). The buffer size varies depending on a number of parameters but is typically around 1.2 GB; it can be up to about 1.9 GB. When the request has completed, the buffer is released using 'delete'.

This works fine for several consecutive requests that allocate buffers of the same size, or when a request allocates a smaller buffer than the previous one. The memory appears to be freed OK - otherwise buffer allocation attempts would eventually fail after just a couple of requests. In any case, we can see the buffer memory being allocated and freed for each request using tools such as KSysGuard.

The problem arises when a request requires a buffer larger than the previous one. In this case, operator 'new' throws an exception. It's as if the memory freed from the first allocation cannot be re-allocated, even though there is sufficient free physical memory available.

If I kill and restart the server process after the first operation, the second request for a larger buffer succeeds, i.e. killing the process appears to fully release the freed memory back to the system.

Can anyone offer an explanation as to what might be going on here? Could it be some kind of fragmentation or mapping-table size issue? I am thinking of replacing new/delete with malloc/free and using mallopt to tune the way the memory is released to the system.
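For reference, here's a sketch of the kind of mallopt tuning I have in mind (the 1 MB threshold is just a guess, not a tested value):

    #include <malloc.h>   // glibc-specific: mallopt, M_MMAP_THRESHOLD
    #include <cstdio>

    int main() {
        // Ask glibc to satisfy any allocation above ~1 MB with mmap().
        // mmap'd blocks are handed back to the kernel on free()/delete,
        // so they can't get stuck behind later allocations the way
        // brk()-based heap blocks can.
        if (mallopt(M_MMAP_THRESHOLD, 1024 * 1024) == 0)
            std::fprintf(stderr, "mallopt failed\n");

        char* buf = new char[1200u * 1024 * 1024];   // ~1.2 GB request
        // ... process the request ...
        delete[] buf;   // backing mmap is unmapped immediately
        return 0;
    }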

BTW - I'm not sure if it's relevant to our problem, but the server uses Pthreads that get created and destroyed on each processing request.

Cheers,

Brian.

+4  A: 

You'll have 3 GB of address space at your disposal if this is a 32-bit machine - 1 GB is reserved for the kernel. Out of that, quite a bit of address space will be taken by shared libraries, the exe file, the data segment, etc. You should look at /proc/pid/maps to see how the address space is laid out.
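If you'd rather inspect it from inside the process, a minimal sketch (each line of the file is one mapping: address range, permissions, offset, device, inode, path):

    #include <fstream>
    #include <iostream>
    #include <string>

    // Print this process's own address-space layout, one mapping
    // per line, exactly as /proc/<pid>/maps would show it.
    int main() {
        std::ifstream maps("/proc/self/maps");
        std::string line;
        while (std::getline(maps, line))
            std::cout << line << '\n';
        return 0;
    }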

How much of the physical memory is available is hard to tell - the kernel, the system processes, and your other processes will all eat into it. But assuming the sum of those is no more than 1 GB, you'll still have your 3 GB of address space available.

What might be happening is fragmentation:

0Gb                                                     3Gb
---------------------~------------------------------------
|Stuff | Heap,1.2Gb allocated stuff | free heap   | Stack|
---------------------~------------------------------------

You then free the large object, but in between some other memory has been allocated, leaving you with this:

0Gb                                                         3Gb
---------------------~------------------------------------------
|Stuff | Heap,1.2Gb free |small object(s) | free heap   | Stack|
---------------------~------------------------------------------

If you try to allocate a bigger object now, it won't fit in the free 1.2 GB hole, and it might not fit in the free heap space either, as that might not be large enough.
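A contrived sketch of that failure mode (whether it actually reproduces depends on where the allocator places each block - glibc will mmap() large allocations - and on having a 32-bit address space, so treat it as an illustration only):

    #include <cstdio>
    #include <new>

    int main() {
        char* big = new char[1200u * 1024 * 1024];  // ~1.2 GB
        char* pin = new char[64];   // small block allocated afterwards
        delete[] big;               // leaves a ~1.2 GB hole behind 'pin'

        try {
            // Needs one contiguous run larger than any remaining hole.
            char* bigger = new char[1500u * 1024 * 1024];
            delete[] bigger;
            std::puts("bigger allocation succeeded");
        } catch (std::bad_alloc&) {
            std::puts("bigger allocation failed: no contiguous hole");
        }
        delete[] pin;
        return 0;
    }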

If you're heavily using the stack, it might be the stack growing and eating space that could otherwise be used for the heap - though by default most distros limit the stack to 8-10 MB.

Using malloc/realloc will not help this. However, if you know the size of the largest object you'll need, you could reserve that much at startup. That piece should never be freed/deleted; it should just be reused. Whether that'll get you into other trouble elsewhere is hard to tell, though - the space available for other objects will get smaller.
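A minimal sketch of that reserve-once approach (RequestBuffer and max_request_bytes are made-up names, not anything from your code):

    #include <cstddef>

    // Allocate the worst-case buffer once, then hand out the same
    // storage for every request; since it is never deleted between
    // requests, it cannot leave a hole in the address space.
    class RequestBuffer {
    public:
        explicit RequestBuffer(std::size_t max_request_bytes)
            : size_(max_request_bytes), data_(new char[max_request_bytes]) {}
        ~RequestBuffer() { delete[] data_; }

        // Returns the shared buffer, or 0 if the request is too big.
        char* get(std::size_t needed) const {
            return needed <= size_ ? data_ : 0;
        }

    private:
        std::size_t size_;
        char* data_;
        RequestBuffer(const RequestBuffer&);            // non-copyable:
        RequestBuffer& operator=(const RequestBuffer&); // avoid double delete
    };

Construct it on the first request with the largest size your parameters allow and keep it for the lifetime of the process.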

nos
A: 

Many thanks for the response. Further investigation indicates that fragmentation is indeed the problem. Reserving the largest buffer we will ever need when the first request arrives and keeping that buffer for subsequent requests appears to work.

brian_mk
A: 

You're running out of address space, so there is no single chunk big enough to satisfy your allocation.

The fix is to run a 64-bit operating system - after all, it is the 21st century!

Of course you will need to re-test your applications for 64-bit compatibility (and recompile, etc.), but it makes sense in the long term. 4 GB is not a lot of RAM for a server these days; a fairly modest one has 16-32 GB.

MarkR