Hi,

My application loads a data set of approx. 85MB to 100MB each time. The application's memory limit is set to 512MB, which is, in theory, more than enough.

However, I found that if, in a single run of the application, I open and close the data set 5 times, the total memory consumption increases steadily until I get an out-of-memory error:

  PID USER   PR NI VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
 6882 bguiz  20  0 679m 206m  19m S   30 13.7 0:30.22 java
 6882 bguiz  20  0 679m 259m  19m S    9 17.2 0:55.53 java
 6882 bguiz  20  0 679m 301m  19m S    9 20.0 1:20.04 java
 6882 bguiz  20  0 679m 357m  19m S   33 23.7 1:44.74 java
 6882 bguiz  20  0 679m 395m  19m S   80 26.2 2:10.31 java

Memory usage grew from ~14% to ~26%. It looks like a memory leak.

What happens is that the top-level data being loaded is used to populate collections such as maps and lists; the more detailed data is then used to create sub-objects of these top-level objects, which in turn create sub-sub-objects.

When the data set is closed, the application currently does attempt to clean up after itself by de-populating the various collections of objects and then explicitly calling System.gc();
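
For concreteness, the cleanup currently looks roughly like the following sketch (class, field and type names here are invented for illustration; they are not the real ones):

    // Hypothetical sketch of the current cleanup logic (invented names):
    private final Map<String, TopLevel> topLevelMap = new HashMap<String, TopLevel>();
    private final List<TopLevel> topLevelList = new ArrayList<TopLevel>();

    public void closeDataSet() {
        topLevelMap.clear();   // drop the top-level objects...
        topLevelList.clear();  // ...and, transitively, their sub-objects
        System.gc();           // only a hint; the VM may ignore it
    }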


Anyhow, this is the state of the application when I got to it (several years in the making before me), and I have been assigned this task.

What I need to do is find which sub-objects and sub-sub-objects are still referencing each other after the data set is unloaded, and fix them.
This could obviously be done manually, but it would be very tedious, so I felt a much better option would be to do it by memory profiling, something I haven't done before.

I have read some other SO questions asking which memory profiling tools to use, and I chose the one built into the NetBeans IDE, since it seemed to have good reviews and I am working in NetBeans anyway.

Has anyone undertaken a similar Java memory profiling task before, and with hindsight:

  • What specific advice would you give me?
  • What techniques did you find useful in tackling this problem?
  • What resources did you find useful in tackling this problem?


Edit: This application is a standard desktop application - not a web application.


Edit: Implemented solution

Basically, what worked for me was to use the NetBeans profiler in conjunction with jhat.

I found that the profiler built into the NetBeans IDE did a really good job of creating memory dumps at particular profiling points, and the tool was then able to filter and sort by class and drill down into the references for each instance, which was all really good.

However, it didn't provide me with a means to compare two heap dumps. I asked a follow-up question, and it looks like jhat (which comes as part of the JDK) gets that job done quite well.
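
Roughly, the comparison runs like this (the dump file names are placeholders; -baseline marks every object that also exists in the first dump as "old", so the report highlights only the instances that are new in the second dump):

    jhat -baseline before.hprof after.hprof

Then browse to http://localhost:7000 (jhat's default port) and look at the "new" instances.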

Thorbjørn Ravn Andersen, Dmitry and Jason Gritman: your input was really helpful; unfortunately I can only mark one as the correct answer, but all of you got +1 from me anyway.

A: 

The behaviour you are seeing is not necessarily a memory leak. Calling System.gc() is just a hint to the VM that the garbage collector should run (it does not have to), and as long as your heap has enough free space, the garbage collector usually does not run. So seeing the process size increase is not proof that the old data collections cannot actually be reclaimed by the garbage collector.

If you are not sure, I would recommend making a heap dump of the process (how depends on which Java VM you are using, but its documentation should tell you) and using an analysis tool on the dump to see whether more object instances than expected are held in the heap, and from where they are referenced (which would also explain where the memory leak is).
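
(On the Sun JDK, for example, jmap can produce such a dump; the PID and file name below are placeholders:)

    jmap -dump:live,format=b,file=heap.hprof 6882

The live suboption forces a full collection first, so only objects that are actually still reachable end up in the dump.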

As a first attempt, I would perhaps try running the program with the new G1 garbage collector, which has been available since Java 1.6.0_14. Under normal circumstances it is probably better at reclaiming unreachable instances earlier, and it also has the advantage of being able to return no-longer-needed memory to the operating system. Other Java garbage collectors have the problem that memory allocated from the OS is usually not returned until the process exits.
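
(In that release G1 is still experimental and has to be unlocked explicitly; something along these lines, reusing the question's heap settings:)

    java -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -Xmx512m -Xms128m foo.Bar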

jarnbjo
He said he is getting an OutOfMemoryError, which means that the GC can't free anything (it must try to release memory before it throws OOM)
Dmitry
You are right, I overlooked that detail. In that case, I would configure the VM to write a heap dump on OutOfMemoryError and examine the dump to find the remaining reference(s) to the no-longer-needed objects. There is no need to guess where the problem may be (as you suggested) when the VM's diagnostic tools offer functionality to find such bugs much more easily.
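(With the HotSpot VM that is a matter of two flags; the dump path below is a placeholder:)
    java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dumps foo.Bar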
jarnbjo
Well, using a profiler is not guessing, and it shows the data in a much more readable way :)
Dmitry
+1  A: 

The NetBeans profiler is probably the best of the free ones. It does its job well; there are no special hints for working with the profiler itself. With respect to your code, pay attention to HashMaps where you cache something (especially static ones). Very often they are the source of memory leaks. Try using WeakHashMap for caching (roughly, it holds only weak references to its keys, so entries can be collected once the keys are no longer referenced elsewhere).
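
A minimal sketch of the idea, with placeholder type parameters (beware of String-literal keys: literals are interned and never collected, which defeats the weak references):

    import java.util.Map;
    import java.util.WeakHashMap;

    public class WeakCache<K, V> {
        // Entries become collectable as soon as nothing else holds a
        // strong reference to their key.
        private final Map<K, V> map = new WeakHashMap<K, V>();

        public void put(K key, V value) { map.put(key, value); }
        public V get(K key) { return map.get(key); }
    }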

Dmitry
Regarding HashMaps: try to allocate them large enough, with a fairly tight load factor. Each time a HashMap has to increase its size, it can temporarily eat up extra memory during the rehash.
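For example (10,000 here stands in for however many entries you expect; Map/HashMap are the java.util classes):
    // A resize triggers when size > capacity * loadFactor, so a capacity
    // of 16384 at the default 0.75 load factor holds ~12k entries rehash-free.
    Map<String, Object> map = new HashMap<String, Object>(16384, 0.75f);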
jt
@Dmitry, yes indeed there is heavy use of HashMaps going on, and thanks for the heads-up about static ones. However, the things being loaded are not caches, and strong references are actually needed until the data set is closed. @jt Thanks for the pointer
bguiz
A: 

Are you altering the memory allocation when starting the application? For example:

java -Xmx512m -Xms128m foo.Bar

It is my understanding that an out-of-memory error can also occur when the JVM cannot expand the heap fast enough. Even though it has a ceiling of 512m (in the above example), if the JVM cannot grow the heap quickly enough beyond the initial 128m, an out-of-memory error could occur. Starting with a higher -Xms value could alleviate that, if that were the problem. Note that -Xms only sets the initial heap size, while -Xmx is a hard ceiling that the heap will not grow beyond.

jt
@jt Yes indeed, the application is started with `-Xmx512m -Xms128m`, which is what we believe is a reasonable amount of memory for this particular application. However, in this case the `Xms` value is certainly not the issue, as the out-of-memory error occurs only upon opening and closing the data set for the 5th time; that is why I think it's a memory leak.
bguiz
A: 

Look for static Collections (Map, Set, List) used as caches.
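
(The classic pattern is something like this invented example; nothing ever removes the entries, so the static map keeps every loaded data set strongly reachable for the life of the JVM:)

    import java.util.HashMap;
    import java.util.Map;

    public class Registry {
        // Static fields live as long as the class does: anything put here
        // stays strongly reachable until it is explicitly removed.
        private static final Map<String, Object> CACHE = new HashMap<String, Object>();

        public static void register(String id, Object data) {
            CACHE.put(id, data); // never cleared -> leak
        }
    }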

Jonathan Feinberg
+2  A: 

I wrote up an answer to another question about the techniques to go about finding memory leaks at http://stackoverflow.com/questions/1716597/java-memory-leak-detection-tools/1717260#1717260

If you follow my advice there, a tool like JProfiler can allow you to walk the reference graph of objects and view the deep size of those objects. This can help you find whatever object or objects are still holding onto the data.

I haven't worked with NetBeans, so I can't tell you how it stacks up against the other profilers I've used. If it doesn't look like it has that feature, you can easily obtain a trial version of JProfiler, which should last until you've found your leak.

Jason Gritman
@Jason, yeah your answer was one of those that I'd read prior to asking this question. I'll consider your suggestion of JProfiler if the NetBeans profiler isn't able to do things like "walking the heap"
bguiz
Your situation is a little different from my previous answer, in that you know what actions cause the leak, but you need to find where the offending object is. I'd still suggest a "form a hypothesis and test it" approach, trying to reset or dereference possible offenders to see if they are the culprit.
Jason Gritman
+2  A: 

Attach to your program with jvisualvm from the Java 6 JDK and see where your memory goes.
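
(It ships in the JDK's bin directory, so there is nothing extra to install; start it and pick your process under the Local node:)

    $JAVA_HOME/bin/jvisualvm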

Thorbjørn Ravn Andersen
Add the relevant filter so that the count/size of objects specific to your application's package can be viewed in VisualVM. That is definitely useful.
techzen
A: 

The Eclipse Memory Analyzer (MAT) is also a good standalone tool for analyzing heap dumps.
You have several options for creating a heap dump (e.g. on OutOfMemoryError).

Turismo