views:

196

answers:

6

I've written a library in C which consumes a lot of memory (millions of small blocks). I've written a C program which uses this library, and a Java program which uses the same library. The Java program is a very thin layer around the library: there is only one native method, which is called once, does all the work, and returns hours later. There is no further communication between Java and the native library through the Java Invocation Interface, nor are there Java objects which consume a noteworthy amount of memory.
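The Java side is roughly the following sketch (class, library, and method names here are placeholders, not the real ones; the load is guarded so the sketch runs even where the native library is absent):

```java
public class NativeRunner {
    // Hypothetical native entry point; the real method name is an assumption.
    private static native void runComputation();

    public static void main(String[] args) {
        try {
            // "worklib" is a placeholder library name.
            System.loadLibrary("worklib");
            runComputation(); // one call, returns hours later
        } catch (UnsatisfiedLinkError e) {
            // Guard so the sketch is demonstrable without the library present.
            System.out.println("native library not loaded: " + e.getMessage());
        }
    }
}
```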

So the C program and the Java program are very similar; the whole computation and memory allocation happens inside the native library. Still, when executed, the C program consumes 3 GB of memory, but the Java program consumes 4.3 GB! (VIRT amount reported by top.)

I checked the memory map of the Java process (using pmap). Only 40 MB are used by libraries, so additional libraries loaded by Java are not the cause.

Does anyone have an explanation for this behavior?

EDIT: Thanks for the answers so far. To make it a little clearer: the Java code does nothing but invoke the native library ONCE! The Java heap is standard size (perhaps 60MB) and is not used (except for the one class containing the main method and the class invoking the native library).

The native library method is long-running and does a lot of mallocs and frees. Fragmentation is one explanation I thought of myself, too. But since no Java code is active, the fragmentation behavior should be the same in the Java program and the C program. Since it is different, I also presume the malloc implementations used differ between the C program and the Java program.

A: 

There are different factors you need to take into account, especially with a language like Java: Java runs on a virtual machine, and garbage collection is handled by the Java runtime. There is (I would imagine) considerable overhead in using the Java Invocation Interface to execute the native method: space has to be allocated on the stack, control switches to native code, the native method executes, and control switches back to the Java virtual machine. Perhaps the space on the stack is somehow not freed up afterwards; that's what I would be inclined to think.

Hope this helps, Best regards, Tom.

tommieb75
+3  A: 

Just guessing: you might be using a non-default malloc implementation when running inside the JVM, one that's tuned to the specific needs of the JVM and produces more overhead than the general-purpose malloc in your normal libc implementation.

Joachim Sauer
That would be my guess too. But I'm not giving you the +1 until there is evidence that really is the case.
Omnifarious
+1  A: 

Java needs contiguous memory for its heap, so it reserves the maximum heap size as virtual memory up front. However, this doesn't consume physical memory and might not even consume swap. I would check how much your resident memory increases by.

Peter Lawrey
Libraries called by the JVM can still free the memory they allocate, though.
Chad Okere
One possibility is that there's a Java thread running concurrently with the native code and they are fragmenting memory. The OP says there are a lot of small allocations.
Omnifarious
The maximum heap size for Java is set to 60MB or so; that can't be the reason for consuming 1 GB more memory. Still: resident memory is 3 GB, like the C program's.
Eduard Wirch
It could be using 1 GB of virtual memory by the time all the shared libraries are included. The whole point of virtual memory is that you don't need to worry about how the address space is used; only the resident memory actually consumes physical memory. Why do you care how much address space it uses? It's like caring how big the integers in your program are.
Peter Lawrey
A: 

It is hard to say, but I think at the heart of the problem is that there are two heaps in your application that need to be maintained: the standard Java heap for Java object allocations (maintained by the JVM), and the C heap maintained by calls to malloc/free. It is hard to say exactly what is going on without seeing some code.

Joe M
A: 

Here is a suggestion for combating it.

Make the C code stop using the standard malloc call, and use an alternate version of malloc that grabs memory by mmapping /dev/zero. You can either modify an implementation of malloc from a library or roll your own if you feel competent enough to do that.

I strongly suspect you will discover that your problem goes away after you do that.

Omnifarious
+1  A: 

Sorry guys. Wrong assumptions.

I got used to the 64MB that Sun's Java implementations used to use as the default maximum heap size. But I used OpenJDK 1.6 for testing. OpenJDK uses a fraction of the physical memory if no maximum heap size is explicitly specified; in my case, one fourth. I used a 4GB machine, and one fourth of that is 1GB. There is the difference between C and Java.

Sadly, this behavior isn't documented anywhere. I found it by looking at the OpenJDK source code (arguments.cpp):

// If the maximum heap size has not been set with -Xmx,
// then set it as fraction of the size of physical memory,
// respecting the maximum and minimum sizes of the heap.
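The effective default can also be checked from Java itself, without reading JVM sources; a minimal sketch (the class name is just illustrative):

```java
public class ShowMaxHeap {
    public static void main(String[] args) {
        // With no -Xmx flag, this reports the JVM's default maximum heap,
        // which OpenJDK derives from physical memory (one fourth here).
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("max heap: " + (maxBytes >> 20) + " MB");
    }
}
```

Running this on the 4GB machine described above should report roughly 1024 MB, matching the observed 1GB+ gap between the C and Java programs.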
Eduard Wirch