How big does a buffer need to be in Java before it's worth reusing?
Or, put another way: I can repeatedly allocate, use, and discard byte[] objects OR run a pool to keep and reuse them. I might allocate a lot of small buffers that get discarded often, or a few big ones that don't. At what size is it cheaper to pool them than to reallocate, and how do small allocations compare to big ones?
EDIT:
Ok, specific parameters. Say an Intel Core 2 Duo CPU, latest VM version for OS of choice. This question isn't as vague as it sounds... a little code and a graph could answer it.
EDIT2:
You've posted a lot of good general rules and discussions, but the question really asks for numbers. Post 'em (and code too)! Theory is great, but the proof is the numbers. It doesn't matter if results vary some from system to system, I'm just looking for a rough estimate (order of magnitude). Nobody seems to know if the performance difference will be a factor of 1.1, 2, 10, or 100+, and this is something that matters. It is important for any Java code working with big arrays -- networking, bioinformatics, etc.
Suggestions to get a good benchmark:
- Warm up code before running it in the benchmark. Methods should all be called at least 1000-10000 times to get full JIT optimization.
- Make sure benchmarked methods run for at least 1-10 seconds, and use System.nanoTime if possible, to get accurate timings.
- Run the benchmark on a system that is only running minimal applications.
- Run the benchmark 3-5 times and report all times, so we see how consistent it is.
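To make it concrete, here's a rough sketch of the kind of harness I have in mind: it times repeated `new byte[n]` allocation against reuse of a single pre-allocated buffer, for a few sizes. The sizes, iteration counts, and the touch-every-512-bytes loop body are placeholders I made up so the loops aren't dead code; it's a starting point, not an authoritative benchmark.

```java
public class AllocVsReuseBenchmark {

    static volatile long sink;   // prevents dead-code elimination of checksums

    // Touch the buffer so the JIT can't eliminate the work as dead code.
    private static long touch(byte[] buf) {
        long sum = 0;
        for (int i = 0; i < buf.length; i += 512) {
            buf[i] = (byte) i;
            sum += buf[i];
        }
        return sum;
    }

    // Allocate a fresh buffer every iteration.
    private static long timeAlloc(int size, int iterations) {
        long checksum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            checksum += touch(new byte[size]);
        }
        long elapsed = System.nanoTime() - start;
        sink = checksum;   // keep the result live
        return elapsed;
    }

    // Allocate once up front and reuse the same buffer.
    private static long timeReuse(int size, int iterations) {
        byte[] buf = new byte[size];
        long checksum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            checksum += touch(buf);
        }
        long elapsed = System.nanoTime() - start;
        sink = checksum;
        return elapsed;
    }

    public static void main(String[] args) {
        int[] sizes = { 8 * 1024, 32 * 1024, 128 * 1024 };
        int iterations = 100000;

        // Warm-up so both paths are fully JIT-compiled before timing.
        for (int size : sizes) {
            timeAlloc(size, 10000);
            timeReuse(size, 10000);
        }

        // Several timed runs so the consistency is visible.
        for (int run = 1; run <= 5; run++) {
            for (int size : sizes) {
                long alloc = timeAlloc(size, iterations);
                long reuse = timeReuse(size, iterations);
                System.out.printf("run %d, %4d kB: alloc %8.1f ms, reuse %8.1f ms, ratio %.2f%n",
                        run, size / 1024, alloc / 1e6, reuse / 1e6, (double) alloc / reuse);
            }
        }
    }
}
```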
I know this is a vague and somewhat demanding question. I will check this question regularly, and answers will consistently get comments and upvotes. Lazy answers will not (see below for criteria). If I don't have any answers that are thorough, I'll attach a bounty. I might anyway, to reward a really good answer with a little extra.
What I know (and don't need repeated):
- Java memory allocation and GC are fast and getting faster.
- Object pooling used to be a good optimization, but now it hurts performance most of the time.
- Object pooling is "not usually a good idea unless objects are expensive to create." Yadda yadda.
What I DON'T know:
- How fast should I expect memory allocations to run (MB/s) on a standard modern CPU?
- How does allocation size affect allocation rate?
- What's the break-even point for number/size of allocations vs. re-use in a pool?
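By "pool" I just mean something trivial along these lines, a sketch I made up to pin down what's being compared; the class name and the ArrayDeque backing aren't from any particular library:

```java
import java.util.ArrayDeque;

// Hands out buffers of one fixed size and takes them back for reuse,
// instead of reallocating a new byte[] every time.
public class ByteBufferPool {
    private final int bufferSize;
    private final ArrayDeque<byte[]> free = new ArrayDeque<byte[]>();

    public ByteBufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    /** Returns a pooled buffer if one is available, otherwise allocates a new one. */
    public byte[] acquire() {
        byte[] buf = free.pollFirst();
        return buf != null ? buf : new byte[bufferSize];
    }

    /** Returns a buffer to the pool for later reuse. */
    public void release(byte[] buf) {
        if (buf.length == bufferSize) {
            free.addFirst(buf);
        }
    }
}
```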
Routes to an ACCEPTED answer (the more the better):
- A recent whitepaper showing figures for allocation & GC on modern CPUs (recent as in last year or so, JVM 1.6 or later)
- Code for a concise and correct micro-benchmark I can run
- Explanation of how and why the allocations impact performance
- Real-world examples/anecdotes from testing this kind of optimization
The Context:
I'm working on a library adding LZF compression support to Java. This library extends the H2 DBMS LZF classes by adding additional compression levels (more compression) and compatibility with the byte streams from the C LZF library. One of the things I'm thinking about is whether or not it's worth trying to reuse the fixed-size buffers used to compress/decompress streams. The buffers may be ~8 kB or ~32 kB, and in the original version they're ~128 kB. Buffers may be allocated one or more times per stream. I'm trying to figure out how I want to handle buffers to get the best performance, with an eye toward potential multithreading in the future.
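For concreteness, the kind of reuse I'm considering looks roughly like this per-thread cache. The class name, the SoftReference wrapping, and the caller-supplied minimum size are placeholders of mine, not code from H2 or the C library; a ThreadLocal avoids any locking if the code goes multithreaded later.

```java
import java.lang.ref.SoftReference;

// Caches one buffer per thread so compress/decompress calls on the same
// thread can reuse it; SoftReference lets the GC reclaim it under pressure.
public final class BufferRecycler {
    private static final ThreadLocal<SoftReference<byte[]>> CACHE =
            new ThreadLocal<SoftReference<byte[]>>();

    /** Returns a cached buffer of at least minSize bytes, allocating if needed. */
    public static byte[] acquire(int minSize) {
        SoftReference<byte[]> ref = CACHE.get();
        byte[] buf = (ref == null) ? null : ref.get();
        if (buf == null || buf.length < minSize) {
            buf = new byte[minSize];
        } else {
            CACHE.set(null);   // hand it out; don't leave it shared
        }
        return buf;
    }

    /** Makes the buffer available for reuse by the current thread. */
    public static void release(byte[] buf) {
        CACHE.set(new SoftReference<byte[]>(buf));
    }
}
```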
Yes, the library WILL be released as open source if anyone is interested in using this.