I have heard several people claim that you cannot scale the JVM heap size up. I've heard claims of the practical limit being 4 gigabytes (from an IBM consultant), 10 gigabytes, 32 gigabytes, and so on... I simply cannot believe any of those numbers and have been wondering about the issue for a while now.

So, I have a three-part question I hope someone with experience can answer:

  1. Given the following case how would you tune the heap and GC settings?
  2. Would there be noticeable hiccups (JVM pauses, etc.) that end users would notice?
  3. Should this really still work? I think it should.

The case:

  • 64 bit platform
  • 64 cores
  • 64 gigabytes of memory
  • The application server is client-facing (i.e. a JBoss/Tomcat web application server) - complete JVM pauses would probably be noticed by end users
  • Sun JVM, probably 1.5

To prove I am not asking you guys to do my homework, this is what I came up with:

  1. -XX:+UseConcMarkSweepGC -XX:+AggressiveOpts -XX:+UnlockDiagnosticVMOptions -XX:-EliminateZeroing -Xmn768m -Xmx55000m
  2. CMS should reduce the number of pauses, although it comes with some overhead. The other CMS settings seem to default automatically to the number of CPUs, so they look sane to me. The rest I added are extras that might help or hurt performance in general, and they should be tested.
  3. Definitely.
+2  A: 

Obviously the heap size is not unlimited, and the larger the heap, the more time your JVM will eventually spend on GC. Though I think it is possible to set the heap size quite high on a 64-bit JVM, I still think it's not really practical. The better advice here is to run several JVMs with the same parameters, i.e. a cluster of JBoss/Tomcat nodes running on the same physical machine; you will get better throughput.

EDIT: Also, your GC behavior depends on the composition of your heap. If you have a lot of short-lived objects and each request to the server creates many of them, your GC will collect a lot of garbage very often, and on a large heap this will result in longer pauses. If you have many long-lived objects (e.g. you cache most of your data in memory) and the number of short-lived objects is not that big, then a bigger heap size is fine.
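If profiling confirms the short-lived-object pattern, one way to act on it is to size the young generation so request garbage dies in minor collections instead of being promoted. A hedged sketch (all sizes and the application jar are placeholder assumptions, to be validated by measurement):

```shell
# Sketch only: sizes are placeholders and myapp.jar is hypothetical.
# A larger young generation lets per-request garbage die in minor
# collections instead of being promoted to the tenured generation.
java -Xms8g -Xmx8g \
     -Xmn2g \
     -XX:SurvivorRatio=8 \
     -verbose:gc -XX:+PrintGCDetails \
     -jar myapp.jar
```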

Superfilin
Yes, I believe that is what the IBM consultant I heard of was after. Instead of selling 20 days of installation work for one server, he saw a chance to sell 20 days times 5 to set up more JVMs... I feel that is a very crude solution, however. Also, the JVM should not spend disproportionately more time on GC as the heap grows, especially as there are enough CPUs to handle collection in parallel.
utteputtes
"... the larger is the heap size, the more your JVM will eventually spend on GC." In general, that is not true. With a modern JVM, if you use a larger heap (with the same set of live objects) you will actually spend less time garbage collecting.
Stephen C
@Stephen, I agree that's not true in general, but he is talking about a web application, which usually involves a lot of short-lived objects.
Superfilin
@utteputtes, I don't know what the IBM consultant had in mind, but a "copy-pasting" job does not take 5 times longer :).
Superfilin
@utteputtes, we used the trick of running several JVMs on 1.4, as it definitely hit a limit at 3-4 GB of heap, but that was a 32-bit JVM.
Superfilin
Actually, it is true that the JVM will spend more time on GC with a larger heap, because mark-sweep garbage collection time is driven by the number of *live* objects. A large heap is only needed if you have a large number of live objects, and therefore it will take longer to GC.
kdgregory
And, at least with the Hotspot JVM, short-lived objects will be efficiently collected in the young generation (and I'm not sure that web-apps have more short-lived objects than any other type of app).
kdgregory
@kdgregory - The time taken by a copying collector (not mark-and-sweep!) is determined by the number of live objects in the space you are collecting. The efficiency is determined by the ratio of live to non-live objects in that space. You can improve this ratio (hence the efficiency) by increasing the heap size, and hence the from/to space sizes.
Stephen C
I'm not sure that you're disagreeing with me here, but as far as I know the copying collector (which does require marking) is only used for the young generation. The tenured generation continues to use a mark-sweep-compact, and as I said in my second comment, the only reason that you need a large heap is because you get a lot of objects in the tenured generation. And if all you want to do is change the young generation size, setting maximum heap size is the absolute last thing that I'd do: there are a bunch of parameters to adjust generation sizes independently.
kdgregory
+6  A: 

I think it's going to be difficult for anybody to give you anything more than general advice, without having further knowledge of your application.

What I would suggest is that you use VisualGC (or the VisualGC plugin for VisualVM) to actually look at what the garbage collection is doing when your app is running. Once you have a greater understanding of how the GC is working alongside your application, it'll be far easier to tune it.
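For a quick look without a GUI, jstat (shipped with the Sun JDK) prints the same generation-by-generation picture on the command line; the pid below is a placeholder:

```shell
# Print per-generation utilization percentages every second
# (S0/S1 = survivor spaces, E = eden, O = old, P = permanent).
jstat -gcutil 12345 1s
```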

Chris R
+1 one should analyze his application needs
Superfilin
A: 

I have found that memory architecture plays a part at large memory sizes. Applications in general don't perform as well if they use more than one memory bank. The JVM appears to suffer as well, especially the GC, which has to sweep the whole heap.

If your application doesn't fit into one memory bank, it has to use memory that is not local to the processor running it, i.e. memory local to another processor.

On Linux you can run numactl --hardware to see the layout of processors and memory banks.
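Building on that, numactl can also pin a JVM to a single bank so its heap stays in processor-local memory; a hedged sketch (node number, heap size, and myapp.jar are placeholders):

```shell
# Show the machine's NUMA layout: nodes, CPUs per node, memory per node.
numactl --hardware

# Run one JVM per node, binding both its threads and its allocations
# to node 0 so the heap never crosses a memory bank.
numactl --cpunodebind=0 --membind=0 java -Xmx12g -jar myapp.jar
```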

Peter Lawrey
+4  A: 

#1. Given the following case how would you tune the heap and GC settings?

First, having 64 gigabytes of memory doesn't mean you have to use it all for one JVM; it rather means you can run many of them. Also, it is impossible to answer your question without access to your machine and application to measure and analyse things (knowing what your application is doing isn't enough). And no, I'm not asking for access to your environment :)

#2. Would there be noticeable hiccups (JVM pauses, etc.) that would be noticed by the end users?

The goal of tuning is to find a good compromise between the frequency and duration of (major) GCs. With a ~55 GB heap, GCs won't be frequent but will certainly take noticeable time (the bigger the heap, the longer a major GC takes). Using a parallel or concurrent garbage collector will help on multiprocessor systems but won't entirely solve this issue. Why do you need ~55 GB (this is mega ultra huge for a webapp, IMO)? That's my question. I'd rather run many clustered JVMs to handle load if required (at some point, the database will become the bottleneck anyway with a data-oriented application).
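To put numbers on that frequency-versus-duration trade-off, the 1.5-era Sun JVM can log every collection with its pause time; a sketch (the heap size is taken from the question, the gc.log path and jar are assumptions):

```shell
# Log each GC with a timestamp and its duration to gc.log, then grep
# for "Full GC" entries to see major-collection pause times.
java -Xmx55g -XX:+UseConcMarkSweepGC \
     -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -Xloggc:gc.log \
     -jar myapp.jar
```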

#3. Should this really still work? I think it should.

Hmm... not sure I get the question. What is "this"? Instantiating a JVM with a big heap? Yes, it should. Is it equivalent to running several JVMs? No, certainly not.

PS: 4G is the maximum theoretical heap limit for the 32-bit JVM running on a 64-bit operating system (see Why can't I get a larger heap with the 32-bit JVM?)

PPS: On 64-bit VMs, you have 64 bits of addressability to work with resulting in a maximum Java heap size limited only by the amount of physical memory and swap space your system provides. (see How large a heap can I create using a 64-bit VM?)

Pascal Thivent
The thing with running many JVMs instead of one big one is that it is not a free lunch either. One has to set up session affinity, session storage for failover, shared caches (for application-state failover), and in general administer more logical servers.
utteputtes
That's true. However, you don't have any failover with **one** instance (so failover doesn't seem that important in your case, and session persistence and other things seem irrelevant). You also don't get scalability by increasing the heap size of one JVM, so the load doesn't seem that high either, and I still can't say whether you really need more than a couple of instances.
Pascal Thivent
+2  A: 

As Chris Rice already wrote, I wouldn't expect any obvious problems with the GC for heap sizes up to 32-64 GB, although some aspect of your application logic may of course cause problems.

Not directly related to GC, but I would still recommend performing a realistic load test on your production system. I used to work on a project where we had a similar setup (a relatively large, clustered JBoss/Tomcat installation serving a public web application), and without exaggeration, JBoss does not behave well under high load or with a high number of concurrent calls if you are using EJBs. JBoss spends a lot of time in synchronized blocks when accessing and managing the EJB instance pools, and if you opt for a cluster, it will even wait for intra-cluster network communication inside these synchronized blocks. Be especially aware of poorly performing state replication if you are using SFSBs.

jarnbjo
A: 

Depending on your GC pause analysis, you may wish to enable incremental mode, whereby the long pause is broken up over a period of time.
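On the Sun JVM this means incremental CMS; a minimal sketch, assuming CMS is already the chosen collector (myapp.jar is a placeholder):

```shell
# Incremental CMS slices the concurrent work into short bursts between
# young-generation collections instead of one long background cycle.
java -XX:+UseConcMarkSweepGC \
     -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing \
     -jar myapp.jar
```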

Xepoch
+1  A: 

Just to add some more switches I would use by default: -Xms55g can help reduce the ramp-up time, because it frees the JVM from growing the heap incrementally from the initial size and also allows better initial sizing of internal memory areas.

Additionally, we have had good experiences with NewSize, which gives you a large young generation to get rid of short-lived garbage: -XX:NewSize=1g. Most webapps create a lot of short-lived garbage that never survives request processing, so you can even make it bigger. With -Xms55g, the VM reserves a large chunk already; maybe downsizing can help.

-Xincgc enables incremental garbage collection, returning the CPU to the user threads more often.

-XX:CMSInitiatingOccupancyFraction=70 - if you really fill all that memory, this starts CMS garbage collection earlier.

-XX:+CMSIncrementalMode puts CMS into incremental mode, returning the CPU to the user threads more often.

Attach to the process with jstat -gc -h10 <pid> 1s and watch the GC working.
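Pulling the above together into one hedged starting point (the values are this answer's suggestions, not measured results, and myapp.jar is a placeholder):

```shell
java -Xms55g -Xmx55g \
     -XX:NewSize=1g \
     -XX:+UseConcMarkSweepGC \
     -XX:CMSInitiatingOccupancyFraction=70 \
     -XX:+CMSIncrementalMode \
     -jar myapp.jar &

# Watch the collector at work: one sample per second, repeating the
# header every 10 lines ($! is the pid of the java process above).
jstat -gc -h10 $! 1s
```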

Will you really fill up the memory? I assume that 64 CPUs for request processing might even be able to work with less memory. What do you store in there?

ReneS