I've had this troubling experience with a Tomcat server, which runs:

  • our Hudson server;
  • a staging version of our web application, redeployed 5-8 times per day.

The problem is that we end up with continuous garbage collection, even though the old generation is nowhere near full. I've noticed that the survivor spaces are practically nonexistent, and the garbage collector output looks like this:

[GC 103688K->103688K(3140544K), 0.0226020 secs]
[Full GC 103688K->103677K(3140544K), 1.7742510 secs]
[GC 103677K->103677K(3140544K), 0.0228900 secs]
[Full GC 103677K->103677K(3140544K), 1.7771920 secs]
[GC 103677K->103677K(3143040K), 0.0216210 secs]
[Full GC 103677K->103677K(3143040K), 1.7717220 secs]
[GC 103679K->103677K(3143040K), 0.0219180 secs]
[Full GC 103677K->103677K(3143040K), 1.7685010 secs]
[GC 103677K->103677K(3145408K), 0.0189870 secs]
[Full GC 103677K->103676K(3145408K), 1.7735280 secs]

The heap information before restarting Tomcat is:

Attaching to process ID 10171, please wait...
Debugger attached successfully.              
Server compiler detected.                    
JVM version is 14.1-b02                      

using thread-local object allocation.
Parallel GC with 8 thread(s)         

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 3221225472 (3072.0MB)
   NewSize          = 2686976 (2.5625MB)   
   MaxNewSize       = 17592186044415 MB    
   OldSize          = 5439488 (5.1875MB)   
   NewRatio         = 2                    
   SurvivorRatio    = 8                    
   PermSize         = 21757952 (20.75MB)   
   MaxPermSize      = 268435456 (256.0MB)  

Heap Usage:
PS Young Generation
Eden Space:
   capacity = 1073479680 (1023.75MB)
   used     = 0 (0.0MB)
   free     = 1073479680 (1023.75MB)
   0.0% used
From Space:
   capacity = 131072 (0.125MB)
   used     = 0 (0.0MB)
   free     = 131072 (0.125MB)
   0.0% used
To Space:
   capacity = 131072 (0.125MB)
   used     = 0 (0.0MB)
   free     = 131072 (0.125MB)
   0.0% used
PS Old Generation
   capacity = 2147483648 (2048.0MB)
   used     = 106164824 (101.24666595458984MB)
   free     = 2041318824 (1946.7533340454102MB)
   4.943684861063957% used
PS Perm Generation
   capacity = 268435456 (256.0MB)
   used     = 268435272 (255.99982452392578MB)
   free     = 184 (1.7547607421875E-4MB)
   99.99993145465851% used

The relevant JVM flags passed to Tomcat are:

-verbose:gc -Dsun.rmi.dgc.client.gcInterval=0x7FFFFFFFFFFFFFFE -Xmx3g -XX:MaxPermSize=256m

Please note that the survivor spaces are sized at about 40 MB at startup.

Any ideas on how I can avoid this problem would be appreciated.


Updates:

The JVM version is

$ java -version
java version "1.6.0_15"
Java(TM) SE Runtime Environment (build 1.6.0_15-b03)
Java HotSpot(TM) 64-Bit Server VM (build 14.1-b02, mixed mode)

I'm going to look into bumping up the PermGen size to see if that helps; the sizing of the survivor spaces was probably unrelated.
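
If I do bump it, that would just mean adjusting the flag we already pass, along these lines (512m is only an illustrative value, not a recommendation):

-verbose:gc -Dsun.rmi.dgc.client.gcInterval=0x7FFFFFFFFFFFFFFE -Xmx3g -XX:MaxPermSize=512m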

+3  A: 

The key is probably the PS Perm Generation, which is at 99.999% used (only 184 bytes out of 256 MB free).

Usually I'd suggest giving it more perm gen, but you already gave it 256 MB, which should be plenty. My guess is that you have a memory leak in some code-generation library; perm gen is mostly used for the bytecode of loaded classes.
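
If you want to watch the perm gen fill up across redeploys before the collector starts thrashing, the stock JDK 6 tools should be enough; something along these lines, where the pid and the 5-second sampling interval are just placeholders (jstat's P column is the perm gen occupancy, and jmap -permstat lists the classloaders and the classes they hold):

$ jstat -gcutil <pid> 5000
$ jmap -permstat <pid>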

Aaron Digulla
You're right, it does max out the `PermGen` space. I hadn't noticed that before, since I expected it to throw an `OutOfMemoryError`, as it usually does. I'll give it a shot.
Robert Munteanu
As for having a code generation leak, I've (unfortunately) come to see this PermGen problem in any non-trivial web application. We're simply redeploying a Spring-based app.
Robert Munteanu
Running out of PermGen after lots of redeploys is a common issue, but you can still figure out what leaks the classes. Just use a profiler like you normally would, but check for objects that don't get cleaned up (resources, DB connections, etc.).
Aaron Digulla
I'll take a heap dump the next time this happens and see what turns up.
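Probably something like `jmap -dump:format=b,file=tomcat-perm.hprof <pid>` (a binary dump that most profilers can open; the file name is just a placeholder).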
Robert Munteanu
+2  A: 

I think this is not that uncommon for an application server that gets continuously deployed to. The perm gen space, which is full for you, is where classes go. Keep in mind that JSPs are compiled as Java classes, and when you change a JSP, a new class gets generated and loaded.

We have had this problem, and our solution is to have the app server restart occasionally.

This is what I'd do:

  1. Deploy Hudson to a separate server from your staging server
  2. Configure Hudson to restart your staging server from time to time (a sketch of what such a restart step might run follows this list). You can do this in one of two ways:
    1. Restart periodically (e.g., every night at midnight, regardless of whether there's build activity); or
    2. Have the web app deployment job trigger the server restart job. If you do this, make sure there's a really long quiet period for the restart job (we set ours to 2 hours), so that you don't get a server restart for every build (i.e., if two web app deployments happen within 2 hours, they'll only trigger one server restart).
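
A minimal sketch of what the restart job's build step might run, assuming a plain Tomcat installation managed through its standard scripts (the paths and the pause are placeholders, not our actual setup):

# Hudson "Execute shell" build step for the restart job (illustrative only)
$CATALINA_HOME/bin/shutdown.sh
sleep 30    # give the deployed webapps time to stop cleanly
$CATALINA_HOME/bin/startup.sh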
Jack Leow
I had this exact same problem, with a staging version of my webapp that Hudson deploys on each build, and used the same solution: just restart that Tomcat every hour, which is OK with us since it's only used for testing.
matt b
+1  A: 

The flag -XX:SurvivorRatio sets the ratio between Eden and the survivor spaces. According to the JDK 1.5 tuning doc, the default value is 32, which gives a 1:32 ratio; that's consistent with what you're seeing. It seems incredibly small to me, although I understand that only a very small number of objects are expected to make their way from Eden to a survivor space.

So, assuming that you have a lot of long-lived objects, you should decrease the survivor ratio. The risk is that you only have those long-lived objects during a startup phase, and so are limiting the Eden size. For a testing server, I doubt this is going to be an issue.

I'd probably also reduce the size of the Eden space by increasing -XX:NewRatio (the default is 3). My gut says that a hundred MB or so is sufficient for the young generation, and you'll just be increasing the cost of garbage collection by having such a large amount of space allocated (i.e., objects will live in Eden far too long). But that's just instinct, and it should definitely be validated for your environment.
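
Expressed as flags, those two suggestions might look like the line below; the specific values are purely illustrative and, as said above, need to be validated against your workload:

-Xmx3g -XX:MaxPermSize=256m -XX:NewRatio=4 -XX:SurvivorRatio=8 -verbose:gc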


And a semi-related comment, after reading other replies: if you're not seeing errors for running out of permgen space, don't spend your time fiddling with it. The permgen is managed separately from the rest of the heap.

kdgregory
That 14.1 is the HotSpot build, not the Java version. I'll update the question with the details.
Robert Munteanu
OK, I'll remove that section of the post; I thought it was strange and wanted to call it out.
kdgregory
+2  A: 

It's very easy to create ClassLoader leaks: all it takes is a single object loaded through the webapp's ClassLoader being referred to by an object that was not loaded by it. A constantly redeployed app will then quickly fill the PermGen space.
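
A contrived sketch of that pattern, with made-up class names: a registry class on a parent classloader's classpath (e.g. Tomcat's lib/) keeps a static reference to an object whose class came from the webapp's classloader, which pins that classloader, and every class it loaded, in PermGen across redeploys.

// In reality these two classes would live in separate jars and packages; they are
// shown together only to keep the sketch self-contained.

// Hypothetical library class, loaded by a *parent* classloader (e.g. Tomcat's lib/).
class LeakyRegistry {
    // Static collection that is never cleared when a webapp is undeployed.
    private static final java.util.List<Object> LISTENERS = new java.util.ArrayList<Object>();

    static void register(Object listener) {
        LISTENERS.add(listener);
    }
}

// Hypothetical webapp class, loaded by the webapp's classloader.
class MyAppListener {
    void init() {
        // The registry now references a MyAppListener instance, which references its
        // Class, which references the webapp classloader, which keeps every class
        // that loader ever loaded pinned in PermGen.
        LeakyRegistry.register(this);
    }
}

// On undeploy the old classloader can never be collected, so each redeploy leaks
// another full copy of the webapp's classes until the perm gen fills up.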

This article explains what to look out for, and a follow-up describes how to diagnose and fix the problem.

Michael Borgwardt