Background

I have a Spring Batch program that reads a file (the example file I am working with is ~4 GB), does a small amount of processing on it, and then writes it off to an Oracle database.

My program uses 1 thread to read the file, and 12 worker threads to do the processing and database pushing.

I am churning through a huge amount of young gen memory, which is causing my program to run slower than I think it should.

Setup

JDK 1.6.18
Spring Batch 2.1.x
4-core machine with 16 GB RAM

-Xmx12G 
-Xms12G 
-XX:NewRatio=1 
-XX:+UseParallelGC
-XX:+UseParallelOldGC

Problem

With these JVM params, I get somewhere around 5.x GB of memory for the Tenured Generation and around 5.x GB for the Young Generation.

In the course of processing this one file, my Tenured Generation is fine. It grows to a max of maybe 3 GB, and I never need to do a single full GC.

However, the Young Generation hits its max many times. It goes up to the 5 GB range, and then a parallel minor GC clears Young Gen down to about 500 MB used. Minor GCs are good and better than a full GC, but they still slow my program down a lot (I am pretty sure the app still freezes when a young gen collection occurs, because I see the database activity die off). I am spending well over 5% of my program time frozen for minor GCs, and this seems excessive. Over the course of processing this 4 GB file, I churn through 50-60 GB of young gen memory.

I don't see any obvious flaws in my program. I am trying to obey general OO principles and write clean Java code. I am trying not to create objects for no reason. I am using thread pools, and whenever possible I pass objects along instead of creating new ones. I am going to start profiling the application, but I was wondering if anyone had some good general rules of thumb or anti-patterns to avoid that lead to excessive memory churn. Is 50-60 GB of memory churn to process a 4 GB file the best I can do? Do I have to revert to JDK 1.2 tricks like object pooling? (Although Brian Goetz gave a presentation that included why object pooling is stupid and we don't need to do it anymore. I trust him a lot more than I trust myself... :) )

+2  A: 

I think a session with a memory profiler will shed a lot of light on the subject. It gives a nice overview of how many objects are created, and that is sometimes revealing on its own.

I am always amazed at how many strings are generated.

For domain objects, cross-referencing them is also revealing. If you suddenly see three times more derived objects than source objects, then something is going on there.

NetBeans has a nice one built in. I have used JProfiler in the past. I think if you bang on Eclipse long enough, you can get the same info from the TPTP tools.

Peter Tillemans
Does jvisualvm (usable in this case, since it's Java 6) help with identifying these problems?
Donal Fellows
Good ideas, I will try the NetBeans profiler and jvisualvm. I am an Eclipse guy but never had a ton of luck with TPTP.
bwawok
So... 90% of my total memory is in char[] in "oracle.sql.converter.toOracleStringWithReplacement". That narrows it down, but I am not sure how to narrow it further, or whether something like the flyweight pattern would reduce memory here.
bwawok
+1  A: 

In my opinion, the young generation should not be as big as the old generation, so that the minor garbage collections stay fast.

Do you have many objects that represent the same value? If you do, merge these duplicate objects using a simple HashMap:

import java.util.concurrent.ConcurrentHashMap;

public class MemorySavingUtils {

    private final ConcurrentHashMap<String, String> knownStrings = new ConcurrentHashMap<String, String>();

    // Return the canonical instance of s, storing s itself the first time it is seen.
    // (putIfAbsent returns null on the first insert, so fall back to s in that case.)
    public String unique(String s) {
        String existing = knownStrings.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    public void clear() {
        knownStrings.clear();
    }
}

With the Sun HotSpot VM, the native String.intern() is really slow for large numbers of Strings, which is why I suggest building your own String interner.

Using this method, strings from the old generation are reused and strings from the new generation can be garbage collected quickly.
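For illustration, a hypothetical call site (the class name and values below are invented for this example, not taken from the original program):

import java.util.concurrent.ConcurrentHashMap;

public class InternerExample {

    // One shared interner; every duplicate value returned from unique() points to the
    // first instance stored, so repeated field values stop churning the young gen.
    private static final MemorySavingUtils INTERNER = new MemorySavingUtils();

    public static void main(String[] args) {
        String a = INTERNER.unique(new String("STATUS_OK"));
        String b = INTERNER.unique(new String("STATUS_OK"));
        System.out.println(a == b);  // true: both calls return the first stored instance
    }
}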

Roland Illig
Only worthwhile if you've got repetition of strings, especially within a batch. Otherwise you're not helping. (And don't use `String.intern` at all unless you *know* it is useful in the specific case you're dealing with; interning is an optimization…)
Donal Fellows
1) I have tried a NewRatio of 2 (the default), as well as 4 and 6. None of it helped. My GCs were slightly faster, but happened more often. 10 GCs of 5 GB each seem to take just about as long as 100 GCs of 500 MB each (I think the bigger GCs may have benchmarked slightly faster).
bwawok
2) No strings should be duplicates, or at least not very many of them. I know a few parts of the file are one of 3 possible choices... I could specifically intern those. Not sure if this is a micro-optimization though. I am not worried about some churn, just about churning 10x the amount of my data set.
bwawok
+3  A: 

It would be really useful if you clarified your terms "young" and "tenured" generation, because Java 6 has a slightly different GC model: Eden, S0+S1, Old, Perm.

Have you experimented with the different garbage collection algorithms? How have "UseConcMarkSweepGC" or "UseParNewGC" performed?
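For reference, those collectors would be enabled with flags like the ones below, in place of the two parallel-collector flags from the Setup section:

-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC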

And don't forget that simply increasing the available space is NOT the solution, because a GC run will then take much longer; decrease the size to normal values ;)

Are you sure you have no memory leaks? In a producer-consumer pattern like the one you describe, data should rarely end up in the Old Gen, because those jobs are processed really fast and then "thrown away". Or is your work queue filling up?

You should definitely observe your program with a memory analyzer.

Tobias P.
I wouldn't use `UseConcMarkSweepGC` here; response time is not important for batch processing (see http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html#available_collectors.selecting). Anyway, I don't think the problem is the GC algorithm.
Pascal Thivent
I tried concurrent mark sweep, and I lost about 10% of my performance. I agree it's not great for batch processing.
bwawok
I was using the terms young and tenured to refer to new and old as returned to me by jmap -heap. I likely blew the terminology somewhere. I have 16 GB of RAM to use; if I can go from 2 GB of memory to 12 GB and get a 5-10% speedup, it is well worth it. Not sure I see a good reason to bring the memory down. I trade 10 slow GCs for 100 fast GCs... but spend the same time in GC. I think I need to reduce churn, not my newgen size, to increase my speed...
bwawok
As to the memory leak issue: could be, but I don't think that is what's causing my problem. I cache 1-2 GB of data before my batch process, so 3-3.5 GB sitting in old gen is not a problem for me. My work queue does fill up, but it is bounded with a java.util.concurrent.BlockingQueue, so I make sure no more than ~10% of the file is in memory at any given point in time.
bwawok
+1  A: 

Read a line from a file, store as a string and put in a list. When the list has 1000 of these strings, put it in a queue to be read by worker threads. Have said worker thread make a domain object, peel a bunch of values off the string to set the fields (int, long, java.util.Date, or String), and pass the domain object along to a default spring batch jdbc writer

If that's your program, why not set a smaller memory size, like 256 MB?
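For reference, a minimal sketch of the reader side of the pipeline quoted above, assuming a bounded queue of 1000-line chunks; the class name, queue capacity, and chunk size here are illustrative, not taken from the actual program:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// One reader thread groups lines into chunks of 1000 and hands them to the
// worker threads over a bounded queue, so only a small slice of the file is
// in memory at any time (the queue blocks the reader when workers fall behind).
public class ChunkingReader implements Runnable {

    private final BlockingQueue<List<String>> queue = new ArrayBlockingQueue<List<String>>(10);
    private final String path;

    public ChunkingReader(String path) {
        this.path = path;
    }

    public BlockingQueue<List<String>> getQueue() {
        return queue;
    }

    public void run() {
        try {
            BufferedReader in = new BufferedReader(new FileReader(path));
            try {
                List<String> chunk = new ArrayList<String>(1000);
                String line;
                while ((line = in.readLine()) != null) {
                    chunk.add(line);
                    if (chunk.size() == 1000) {
                        queue.put(chunk);
                        chunk = new ArrayList<String>(1000);
                    }
                }
                if (!chunk.isEmpty()) {
                    queue.put(chunk);
                }
            } finally {
                in.close();
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}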

irreputable
a) I precache a HashMap of data that is around 1-2 GB (hence the stuff that lives in old gen). b) I have lots of memory and 16 threads; this program has the entire server to run on, so I am not worried about "wasting" memory.
bwawok
Just because there is no other process running on that server doesn't mean your program should allocate all the memory. You should give it only as much memory as it needs, and a little extra for unexpected circumstances. That way, the garbage collector doesn't have to keep objects longer than necessary.
Roland Illig
People say that GC performs very badly on a heap beyond a couple of GB. I don't understand why - GC works on live objects only, so why does it matter how many dead objects there are - but that's what people say.
irreputable
+1  A: 

I'm guessing with a memory limit that high you must be reading the file entirely into memory before doing the processing. Could you consider using a java.io.RandomAccessFile instead?
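For illustration, a minimal sketch of reading a fixed-size window of the file at a given offset with java.io.RandomAccessFile instead of loading the whole file; the class and method names are made up for this example:

import java.io.RandomAccessFile;

public class WindowedFileReader {

    // Reads 'length' bytes starting at 'offset', so only one window of the
    // file is held in memory at a time.
    public static byte[] readWindow(String path, long offset, int length) throws Exception {
        RandomAccessFile raf = new RandomAccessFile(path, "r");
        try {
            byte[] buffer = new byte[length];
            raf.seek(offset);
            raf.readFully(buffer);
            return buffer;
        } finally {
            raf.close();
        }
    }
}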

Ceilingfish
Actually, I am not. In order to avoid "wasting" memory, I use a java.util.concurrent.BlockingQueue. I keep just enough of the file read to keep all the workers busy, but I never have more than about 10% of the file in memory at the same time. In theory I will scale to much bigger files, in the 10-30 GB range, and definitely cannot fit all of that in memory.
bwawok
+2  A: 

You need to profile your application to see what exactly is happening. I would also first try to use the ergonomics feature of the JVM, as recommended:

2. Ergonomics

A feature referred to here as ergonomics was introduced in J2SE 5.0. The goal of ergonomics is to provide good performance with little or no tuning of command line options by selecting the

  • garbage collector,
  • heap size,
  • and runtime compiler

at JVM startup, instead of using fixed defaults. This selection assumes that the class of the machine on which the application is run is a hint as to the characteristics of the application (i.e., large applications run on large machines). In addition to these selections is a simplified way of tuning garbage collection. With the parallel collector the user can specify goals for a maximum pause time and a desired throughput for an application. This is in contrast to specifying the size of the heap that is needed for good performance. This is intended to particularly improve the performance of large applications that use large heaps. The more general ergonomics is described in the document entitled “Ergonomics in the 5.0 Java Virtual Machine”. It is recommended that the ergonomics as presented in this latter document be tried before using the more detailed controls explained in this document.

Included in this document are the ergonomics features provided as part of the adaptive size policy for the parallel collector. This includes the options to specify goals for the performance of garbage collection and additional options to fine tune that performance.

See the more detailed section about Ergonomics in the Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning guide.
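For reference, the pause-time and throughput goals mentioned in the quote are expressed with flags such as these (the values below are placeholders, not a recommendation; GCTimeRatio=N asks the collector to keep GC at roughly 1/(1+N) of total time):

-XX:MaxGCPauseMillis=200
-XX:GCTimeRatio=19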

Pascal Thivent
Good idea, I will give ergonomics a chance and compare the results with what I have. However, I know that by default it starts with a very small heap and does gc, gc, grow heap, gc, gc, grow heap, gc, gc, grow heap... which is totally crappy. I think I shaved significant time off my run by starting -Xms and -Xmx at the desired size.
bwawok
@bwawok: I didn't mean to say "don't override `-Xms` and `-Xmx`"
Pascal Thivent
+3  A: 

I have a feeling that you are spending time and effort trying to optimize something that you should not bother with.

I am spending well over 5% of my program time frozen for minor GCs, and this seems excessive.

Flip that around. You are spending just under 95% of your program time doing useful work. Or, to put it another way: even if you managed to optimize the GC to run in ZERO time, the best improvement you could get is something over 5%.

If your application has hard timing requirements that are impacted by the pause times, you could consider using a low-pause collector. (Be aware that reducing pause times increases the overall GC overheads ...) However for a batch job, the GC pause times should not be relevant.

What probably matters most is the wall clock time for the overall batch job. And the (roughly) 95% of the time spent doing application-specific work is where you are likely to get more pay-off for your profiling / targeted optimization efforts. For example, have you looked at batching the updates that you send to the database?
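For illustration, a minimal sketch of batched JDBC inserts, assuming plain JDBC rather than the Spring Batch writer the program actually uses; the table and column names are invented for this example:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class BatchedInsertExample {

    // Adds one parameter set per row to a single INSERT statement, so the driver
    // can send the whole chunk at once instead of one statement per row.
    public static void writeChunk(Connection conn, List<String[]> rows) throws Exception {
        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO my_table (col_a, col_b, col_c) VALUES (?, ?, ?)");
        try {
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.setString(3, row[2]);
                ps.addBatch();
            }
            ps.executeBatch();
        } finally {
            ps.close();
        }
    }
}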

Stephen C
Spring Batch already does batching for me. I know this isn't the be-all and end-all that will make the program 100% faster... but the time spent in GC is over 5%, maybe even 6% or 7%. The wall clock time could be better, and that would help me...
bwawok