I'm trying to sort a set of data whose size can exceed the memory available to the JVM. Handling that case requires an external sort, which is much slower than Quicksort.

Is there any way of obtaining the memory available to the JVM at runtime, so that I could use in-place sorting as much as possible and only switch to Mergesort when the input is too large?

+6  A: 

Check out these methods on the java.lang.Runtime class:

freeMemory

totalMemory

maxMemory

Example

Runtime rt = Runtime.getRuntime();
System.err.println(String.format("Free: %d bytes, Total: %d bytes, Max: %d bytes",
  rt.freeMemory(), rt.totalMemory(), rt.maxMemory()));

Also note that if the total amount of memory is being exhausted you can always start the JVM with a larger amount of heap allocated using the -Xmx JVM argument; e.g.

java -Xmx256M MyClass
Adamski
*"... if the total amount of memory is being exhausted you can always start the JVM with a larger amount of heap ..."*. True, but it would be a bad idea for a Java application to *expand* its own heap like this.
Stephen C
@Stephen C: Not quite sure what you mean. If an application legitimately requires more memory then what is wrong with increasing the allocated heap size?
Adamski
@Adamski: No. I mean if the application takes it upon itself to relaunch itself in a new JVM with a bigger heap.
Stephen C
Wow - I've never heard of an app doing that. How would it know that it had run out of memory in order to do that?
Adamski
+1  A: 

You can use the Runtime class to get the amount of memory available.

Runtime r = Runtime.getRuntime();
System.out.println(r.totalMemory());

There are various other memory details you can get from the Runtime object - see Runtime class.

Submonoid
+1  A: 

In theory yes, using Runtime.getRuntime().maxMemory().

In practice, there are some problems you need to address:

  1. You need to figure out how many application objects are going to fit in a given number of bytes of memory. AFAIK, there is no simple / efficient way to do this within a running application.

  2. You don't want to try to use all available heap space. If you push your percentage heap residency too high, you risk making the GC horribly inefficient.

  3. The maxMemory() method only tells you how big the heap can get in virtual memory. The physical memory size can also be a factor (especially if physical size << virtual size), and there's no portable way to figure that out.

If I was trying to implement this application, I think I'd probably just make the in-memory sort size a configuration parameter or command-line option.
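
As a rough sketch of that idea: the limit below is read from a system property and, if it is not set, falls back to a fraction of maxMemory(). The property name sort.inMemoryLimitBytes, the class name SortChooser and the 0.5 fraction are all made up for illustration; the right margin depends on the object overhead and GC behaviour described in the points above.

```java
public class SortChooser {
    /** Hypothetical property name, chosen for this sketch only. */
    static final String LIMIT_PROPERTY = "sort.inMemoryLimitBytes";

    /** Assumed safety margin: by default use at most half of the maximum heap. */
    static final double DEFAULT_HEAP_FRACTION = 0.5;

    /** Returns the configured in-memory sort limit, or a fraction of the max heap. */
    static long inMemoryLimitBytes() {
        String configured = System.getProperty(LIMIT_PROPERTY);
        if (configured != null) {
            return Long.parseLong(configured.trim());
        }
        // maxMemory() is the most the JVM will try to use (roughly -Xmx).
        return (long) (Runtime.getRuntime().maxMemory() * DEFAULT_HEAP_FRACTION);
    }

    /** True if an input of the estimated size should be sorted in memory. */
    static boolean fitsInMemory(long estimatedInputBytes) {
        return estimatedInputBytes <= inMemoryLimitBytes();
    }

    public static void main(String[] args) {
        // The caller supplies its own estimate of the in-memory size of the input.
        long inputBytes = args.length > 0 ? Long.parseLong(args[0]) : 0L;
        System.out.println(fitsInMemory(inputBytes) ? "in-memory sort" : "external merge sort");
    }
}
```

It could be run as, for example, java -Dsort.inMemoryLimitBytes=100000000 SortChooser 250000000. The estimate of the input's in-memory size is still up to the caller, which is the hard part noted in point 1.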

Stephen C
+1 Good points. Regarding point 1, there is a possible (complicated) solution here: http://www.javaspecialists.eu/archive/Issue142.html. However, an estimate is enough in the current case. Points (2) and (3) are extremely important for performance.
Eyal Schneider
A: 

Using the methods of Runtime, as the others suggested, is fine, as long as you take some things into consideration:

1) freeMemory() is a lower bound on the actual available memory, because memory that is unreferenced and ready for GC is considered as used. Running System.gc() before the call may return a more accurate result.

2) totalMemory() can change - it only indicates the current total heap size, and the JVM can expand or shrink the heap at runtime depending on usage. You can use maxMemory() to get the actual maximum.
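
Putting the two points together, a common way to estimate the remaining headroom is maxMemory() minus the memory currently in use (totalMemory() - freeMemory()). This is only a sketch: the class and method names are made up, and the result is an approximation, since the "used" figure includes garbage that has not been collected yet.

```java
public class HeapHeadroom {
    /**
     * Rough estimate of how many more bytes the heap could hold before
     * reaching its maximum size. Note that maxMemory() returns
     * Long.MAX_VALUE when the JVM has no inherent limit.
     */
    static long estimateAvailableBytes() {
        Runtime rt = Runtime.getRuntime();
        // Currently used heap; includes unreferenced objects awaiting GC.
        long used = rt.totalMemory() - rt.freeMemory();
        return rt.maxMemory() - used;
    }

    public static void main(String[] args) {
        System.out.println("Estimated available: " + estimateAvailableBytes() + " bytes");
    }
}
```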

Eyal Schneider