I've had this troubling experience with a Tomcat server, which runs:

  • our Hudson server;
  • a staging version of our web application, redeployed 5-8 times per day.

The problem is that we end up with continuous garbage collection, even though the old generation is nowhere near full. I've noticed that the survivor spaces are practically nonexistent, and the garbage collector output looks like this:

[GC 103688K->103688K(3140544K), 0.0226020 secs]
[Full GC 103688K->103677K(3140544K), 1.7742510 secs]
[GC 103677K->103677K(3140544K), 0.0228900 secs]
[Full GC 103677K->103677K(3140544K), 1.7771920 secs]
[GC 103677K->103677K(3143040K), 0.0216210 secs]
[Full GC 103677K->103677K(3143040K), 1.7717220 secs]
[GC 103679K->103677K(3143040K), 0.0219180 secs]
[Full GC 103677K->103677K(3143040K), 1.7685010 secs]
[GC 103677K->103677K(3145408K), 0.0189870 secs]
[Full GC 103677K->103676K(3145408K), 1.7735280 secs]

The heap information before restarting Tomcat is:

Attaching to process ID 10171, please wait...
Debugger attached successfully.              
Server compiler detected.                    
JVM version is 14.1-b02                      

using thread-local object allocation.
Parallel GC with 8 thread(s)         

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 3221225472 (3072.0MB)
   NewSize          = 2686976 (2.5625MB)   
   MaxNewSize       = 17592186044415 MB    
   OldSize          = 5439488 (5.1875MB)   
   NewRatio         = 2                    
   SurvivorRatio    = 8                    
   PermSize         = 21757952 (20.75MB)   
   MaxPermSize      = 268435456 (256.0MB)  

Heap Usage:
PS Young Generation
Eden Space:
   capacity = 1073479680 (1023.75MB)
   used     = 0 (0.0MB)
   free     = 1073479680 (1023.75MB)
   0.0% used
From Space:
   capacity = 131072 (0.125MB)
   used     = 0 (0.0MB)
   free     = 131072 (0.125MB)
   0.0% used
To Space:
   capacity = 131072 (0.125MB)
   used     = 0 (0.0MB)
   free     = 131072 (0.125MB)
   0.0% used
PS Old Generation
   capacity = 2147483648 (2048.0MB)
   used     = 106164824 (101.24666595458984MB)
   free     = 2041318824 (1946.7533340454102MB)
   4.943684861063957% used
PS Perm Generation
   capacity = 268435456 (256.0MB)
   used     = 268435272 (255.99982452392578MB)
   free     = 184 (1.7547607421875E-4MB)
   99.99993145465851% used

The relevant JVM flags passed to Tomcat are:

-verbose:gc -Dsun.rmi.dgc.client.gcInterval=0x7FFFFFFFFFFFFFFE -Xmx3g -XX:MaxPermSize=256m

Please note that the survivor spaces are sized at about 40 MB at startup.

Any ideas on how I can avoid this problem would be appreciated.


Updates:

The JVM version is

$ java -version
java version "1.6.0_15"
Java(TM) SE Runtime Environment (build 1.6.0_15-b03)
Java HotSpot(TM) 64-Bit Server VM (build 14.1-b02, mixed mode)

I'm going to look into bumping up the PermGen size to see if that helps; the sizing of the survivor spaces was probably unrelated.
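
If I do bump it, that would just mean adjusting the flag we already pass, along these lines (512m is only an illustrative value, not a recommendation):

-verbose:gc -Dsun.rmi.dgc.client.gcInterval=0x7FFFFFFFFFFFFFFE -Xmx3g -XX:MaxPermSize=512m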

+3  A: 

The key is probably the PS Perm Generation, which is at 99.999% used (only 184 bytes out of 256 MB free).

Usually I'd suggest giving it more perm gen, but you already gave it 256 MB, which should be plenty. My guess is that you have a memory leak in some code-generation library; perm gen is mostly used for the bytecode of loaded classes.
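
If you want to watch the perm gen fill up across redeploys before the collector starts thrashing, the stock JDK 6 tools should be enough; something along these lines, where the pid and the 5-second sampling interval are just placeholders (jstat's P column is the perm gen occupancy, and jmap -permstat lists the classloaders and the classes they hold):

$ jstat -gcutil <pid> 5000
$ jmap -permstat <pid>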

Aaron Digulla
You're right, it does max out the `PermGen` space. I hadn't noticed that before, since I expected it to throw an `OutOfMemoryError`, as it usually does. I'll give it a shot.
Robert Munteanu
As for having a code generation leak, I've (unfortunately) come to see this PermGen problem in any non-trivial web application. We're simply redeploying a Spring-based app.
Robert Munteanu
Running out of PermGen after lots of redeploys is a common issue, but you can still figure out what leaks the classes. Just use a profiler like you normally would, but check for objects that don't get cleaned up (resources, DB connections, etc.).
Aaron Digulla
I'll take a heap dump the next time this happens and see what turns up.
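Probably something like `jmap -dump:format=b,file=tomcat-perm.hprof <pid>` (a binary dump that most profilers can open; the file name is just a placeholder).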
Robert Munteanu
+2  A: 

I think this is not that uncommon for an application server that gets continuously deployed to. The perm gen space, which is full for you, is where classes go. Keep in mind that JSPs are compiled as Java classes, and when you change a JSP, a new class gets generated and loaded.

We have had this problem, and our solution is to have the app server restart occasionally.

This is what I'd do:

  1. Deploy Hudson to a separate server from your staging server
  2. Configure Hudson to restart your staging server from time to time (a sketch of what such a restart step might run follows this list). You can do this in one of two ways:
    1. Restart periodically (e.g., every night at midnight, regardless of whether there's build activity); or
    2. Have the web app deployment job trigger the server restart job. If you do this, make sure there's a really long quiet period for the restart job (we set ours to 2 hours), so that you don't get a server restart for every build (i.e., if two web app deployments happen within 2 hours, they'll only trigger one server restart).
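
A minimal sketch of what the restart job's build step might run, assuming a plain Tomcat installation managed through its standard scripts (the paths and the pause are placeholders, not our actual setup):

# Hudson "Execute shell" build step for the restart job (illustrative only)
$CATALINA_HOME/bin/shutdown.sh
sleep 30    # give the deployed webapps time to stop cleanly
$CATALINA_HOME/bin/startup.sh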
Jack Leow
I had this exact same problem, with a staging version of my webapp that Hudson deploys on each build, and used the same solution: just restart that Tomcat every hour, which is OK with us since it's only used for testing.
matt b
+1  A: 

The flag -XX:SurvivorRatio sets the ratio between Eden and the survivor spaces. According to the JDK 1.5 tuning doc, the default value is 32, which gives a 1:32 ratio; that's consistent with what you're seeing. It seems incredibly small to me, although I understand that only a very small number of objects are expected to make their way from Eden to a survivor space.

So, assuming that you have a lot of long-lived objects, you should decrease the survivor ratio. The risk is that you only have those long-lived objects during a startup phase, and so are limiting the Eden size. For a testing server, I doubt this is going to be an issue.

I'd probably also reduce the size of the Eden space by increasing -XX:NewRatio (the default is 3). My gut says that a hundred MB or so is sufficient for the young generation, and you'll just be increasing the cost of garbage collection by having such a large amount of space allocated (i.e., objects will live in Eden far too long). But that's just instinct, and it should definitely be validated for your environment.
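
Expressed as flags, those two suggestions might look like the line below; the specific values are purely illustrative and, as said above, need to be validated against your workload:

-Xmx3g -XX:MaxPermSize=256m -XX:NewRatio=4 -XX:SurvivorRatio=8 -verbose:gc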


And a semi-related comment, after reading other replies: if you're not seeing errors for running out of permgen space, don't spend your time fiddling with it. The permgen is managed separately from the rest of the heap.

kdgregory
That 14.1 is the HotSpot build, not the Java version. I'll update the question with the details.
Robert Munteanu
OK, I'll remove that section of the post; I thought it was strange and wanted to call it out.
kdgregory
+2  A: 

It's very easy to create ClassLoader leaks: all it takes is a single object loaded through the webapp's ClassLoader being referred to by an object that was not loaded by it. A constantly redeployed app will then quickly fill the PermGen space.
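
A contrived sketch of that pattern, with made-up class names: a registry class on a parent classloader's classpath (e.g. Tomcat's lib/) keeps a static reference to an object whose class came from the webapp's classloader, which pins that classloader, and every class it loaded, in PermGen across redeploys.

// In reality these two classes would live in separate jars and packages; they are
// shown together only to keep the sketch self-contained.

// Hypothetical library class, loaded by a *parent* classloader (e.g. Tomcat's lib/).
class LeakyRegistry {
    // Static collection that is never cleared when a webapp is undeployed.
    private static final java.util.List<Object> LISTENERS = new java.util.ArrayList<Object>();

    static void register(Object listener) {
        LISTENERS.add(listener);
    }
}

// Hypothetical webapp class, loaded by the webapp's classloader.
class MyAppListener {
    void init() {
        // The registry now references a MyAppListener instance, which references its
        // Class, which references the webapp classloader, which keeps every class
        // that loader ever loaded pinned in PermGen.
        LeakyRegistry.register(this);
    }
}

// On undeploy the old classloader can never be collected, so each redeploy leaks
// another full copy of the webapp's classes until the perm gen fills up.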

This article explains what to look out for, and a follow-up describes how to diagnose and fix the problem.

Michael Borgwardt