views:

99

answers:

9

I have a web app that serializes a Java bean into XML or JSON, depending on what the user requests.

I am facing a mind-bending problem: when I put even a little load on it, it quickly uses all of its allocated memory and reaches max capacity. I then see full GCs working really hard every 20-40 seconds.

It doesn't look like a memory leak, but I am not sure how to troubleshoot this.

The bean that is serialized to XML/JSON references other beans, which in turn reference others. I use json-lib and JAXB to serialize the beans.

The YourKit memory profiler is telling me that char[] is the most memory-consuming live object type.

Any insight is appreciated.

A: 

It's impossible to diagnose this without a lot more information (code and GC logs), but my guess would be that you're reading data in as large strings, then chopping out little bits with substring(). When you do that, the substring shares the parent string's underlying character array, and so keeps that whole array in memory for as long as the substring is alive. (This applies to JVMs before Java 7 update 6; later versions copy the characters in substring().) That means code like this:

// Stand-in for a large input: a string of one million characters.
String big = new String(new char[1000000]);
String small = big.substring(0, 1);
big = null;

Will still keep the huge string's character data in memory. If this is the case, then you can address it by forcing the small strings to use fresh, smaller, character arrays by constructing new instances:

small = new String(small);

But like I said, this is just a guess.

Tom Anderson
A: 

I'm not sure how much of it is in your code and how much might be in the tools you are using, but there are some key things to watch for.

One of the worst is constantly adding to strings in loops. A simple "hello"+"world" is no problem at all, because the compiler folds constant concatenations into a single string. If you concatenate in a loop, though, every iteration allocates a new string. Use StringBuilder where you can.

There are profilers for Java that should quickly point you to where the allocations are taking place. Experiment with a profiler for a while with your Java app running and you will probably be able to reduce your GCs to virtually nothing, unless the problem is inside your libraries. Even then, you may figure out some way to fix it.
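To illustrate the loop case, here is a minimal sketch. The class and method names are made up for the example and are not from the asker's code. Both methods produce the same result, but the first creates a fresh String on every pass while the second appends into a single buffer.

```java
import java.util.Arrays;
import java.util.List;

class ConcatDemo {
    // Creates a new String (and a new backing array) on every iteration.
    static String joinWithConcat(List<String> parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    // Appends into one growable buffer and builds the String once at the end.
    static String joinWithBuilder(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("<item>", "hello", "</item>");
        System.out.println(joinWithConcat(parts));
        System.out.println(joinWithBuilder(parts));
    }
}
```

With a handful of short strings the difference is negligible. It starts to matter when the loop runs thousands of times per request.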

Things you allocate and then free quickly don't require time in the GC phase--it's pretty much free. Be sure you aren't keeping Strings around longer than you need them. Bring them in, process them and return to your previous state before returning from your request handler.

Bill K
A: 

There are two possibilities: you've got a memory leak, or your webapp is just generating lots of garbage.

  • The brute-force way to tell if you've got a memory leak is to run it for a long time and see if it falls over with an OutOfMemoryError (OOME). Or turn on GC logging, and see if the average heap space still in use after each garbage collection trends upwards over time.

  • Whether or not you have a memory leak, you can probably improve performance (reduce the percentage of time spent in GC) by increasing the max heap size. The fact that your webapp is seeing lots of full GCs suggests to me that it needs more heap. (This is just a band-aid solution if you have a memory leak.)

  • If it turns out that you are not suffering from a memory leak, then you should take a look at why your application is generating so much garbage. It could be down to the way that you are doing the XML and JSON serialization.
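As a rough alternative to reading GC logs by hand, the post-collection heap usage can be sampled from inside the JVM with the standard java.lang.management API. This is only a sketch: the class name PostGcHeap and the one-second sampling interval are assumptions for illustration, and some memory pools do not report collection usage at all.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class PostGcHeap {
    // Sums the bytes still in use in each heap pool after that pool's most recent collection.
    static long usedAfterLastGc() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            // Returns null when the pool does not support this measurement.
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc != null) {
                total += afterGc.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        // Sample a few times; in a real app this would run on a background thread.
        for (int i = 0; i < 5; i++) {
            System.out.println("heap used after last GC: " + usedAfterLastGc() + " bytes");
            Thread.sleep(1000);
        }
    }
}
```

If that number keeps climbing under steady load, a leak is likely. If it stays flat while GCs remain frequent, the app is more likely just generating a lot of garbage.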

Stephen C
A: 

You should attach YourKit and record allocations (e.g., every 10th allocation, including all large ones). They have a step-by-step guide on diagnosing excessive GC: http://www.yourkit.com/docs/90/help/excessive_gc.jsp

Ron
A: 

To me that sounds like you are trying to serialize a recursive object (or at least a very deep, almost recursive one) with an encoder that is not prepared for it.
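One way to test that theory is to walk the object graph before serializing it and look for a reference that leads back to an object already on the current path. The sketch below assumes a hypothetical Bean class with a list of references. Real beans would need reflection or hand-written traversal, and a very deep graph could overflow the stack with this recursive version.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class CycleCheck {
    // Hypothetical bean: stands in for any object that references other beans.
    static class Bean {
        final String name;
        final List<Bean> refs = new ArrayList<Bean>();
        Bean(String name) { this.name = name; }
    }

    // True if following references from root leads back to a bean on the current path.
    static boolean hasCycle(Bean root) {
        // Identity-based sets, so equals()/hashCode() overrides on beans do not interfere.
        Set<Bean> onPath = Collections.newSetFromMap(new IdentityHashMap<Bean, Boolean>());
        Set<Bean> visited = Collections.newSetFromMap(new IdentityHashMap<Bean, Boolean>());
        return visit(root, onPath, visited);
    }

    private static boolean visit(Bean bean, Set<Bean> onPath, Set<Bean> visited) {
        if (onPath.contains(bean)) {
            return true;              // reached an ancestor again: a cycle
        }
        if (!visited.add(bean)) {
            return false;             // already fully explored via another path
        }
        onPath.add(bean);
        for (Bean ref : bean.refs) {
            if (visit(ref, onPath, visited)) {
                return true;
            }
        }
        onPath.remove(bean);
        return false;
    }

    public static void main(String[] args) {
        Bean order = new Bean("order");
        Bean customer = new Bean("customer");
        order.refs.add(customer);
        customer.refs.add(order);     // back-reference creates a cycle
        System.out.println("cycle found: " + hasCycle(order));
    }
}
```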

blabla999
A: 

Why do you think you have a problem? GC is a natural and normal thing to happen. We have customers that GC every second (for less than 100ms duration), and that's fine as long as memory keeps getting reclaimed.

GCing every 20-40 seconds isn't a problem IMO, as long as it doesn't take a large % of that 20-40s. Most major commercial JVMs aim to keep GC in the range of 5-10% of total time (so 1-4 seconds of that 20-40s). Posting more data in the form of GC logs might help, and tools like GCMV can help you visualize your GC profile and get recommendations.
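For a quick estimate of that percentage without parsing logs, the JVM's own management beans can be queried. This is a sketch, not a precise measurement: the class name GcOverhead is made up, getCollectionTime() is an approximate accumulated figure, and with concurrent collectors it may include time during which the application was not actually paused.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcOverhead {
    // Approximate total milliseconds spent in collections across all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // GC time as a percentage of the elapsed time.
    static double gcPercent(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / elapsedMillis;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC overhead since JVM start: %.2f%%%n",
                gcPercent(totalGcMillis(), uptime));
    }
}
```

As a worked case, 2 seconds of GC in a 40-second window is gcPercent(2000, 40000), which is 5%.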

Trent Gray-Donald
Good point, Trent. But frequent GC means more work, which is why I think it's a problem.
bushman
Let's make this concrete - what's your % time spent in GC? I think that's the metric you really want to focus on to determine if you have a problem.
Trent Gray-Donald
A: 

Java's native XML API is really "noisy" and generally wasteful in terms of resources, which means that if your requests and XML/JSON generation cycles are short-lived, the GC will have lots to clean up.

I debugged a very similar case and found this out the hard way. The only way I could improve the situation somewhat without major refactoring was to call GC explicitly, with the appropriate VM flags that turn System.gc() from a no-op call into a maybe-op call.

Esko
A: 

I would start by inspecting my running application to see what was being created on the heap. HPROF can collect this information for you, which you can then analyse using HAT.

Joel
A: 

To debug issues with memory allocations, InMemProfiler can be used at the command line. Collected object allocations can be tracked and collected objects can be split into buckets based on their lifetimes.

In trace mode this tool can be used to identify the source of memory allocations.

mchr