views: 180
answers: 9
I'm working on a Sudoku solver at school and we're having a little performance contest. Right now, my algorithm is pretty fast on the first run (about 2.5 ms), but even faster when I solve the same puzzle 10,000 times (about 0.5 ms per run). Those timings, of course, depend on the puzzle being solved. I know the JVM does some optimization when a method is called multiple times, and I suspect that's what is happening here.

I don't think I can further optimize the algorithm itself (though I'll keep looking), so I was wondering if I could replicate some of the optimizations done by the JVM.

Note: compiling to native code is not an option.

Thanks!

Edit: All those VM options are nice, but they're not really "legal" in an algorithm contest, since everyone could use them and get the same performance boost. I'm looking for code optimizations.

+1  A: 

You might try reducing the available heap memory so the garbage collector warms up faster. It takes a while before memory management is running under "normal" conditions, and reducing the heap size can speed that up.
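For example, a fixed small heap can be requested on the command line. The flags below are standard HotSpot options, but the sizes are purely illustrative and the class name is a placeholder; tune by measurement:

```shell
# Pin the heap to a small fixed size so the GC settles into its
# steady state quickly. -Xms sets the initial heap, -Xmx the maximum;
# com.example.SudokuSolver is a placeholder class name.
java -Xms16m -Xmx16m com.example.SudokuSolver
```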

Peter Tillemans
I've tried setting the initial heap size using -Xms. The lowest I can go is 5 MB, and according to the Runtime class, my solver uses about 500 KB. It did not change the amount of time the first run takes.
Subb
It is especially -Xmx that matters. And indeed it will probably not affect the first run, but hopefully reaching terminal velocity no longer requires 10,000 runs.
Peter Tillemans
A: 

You can force garbage collection before "heavy" actions, like the main loop, by calling `System.gc();`
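A minimal sketch of that idea, with a dummy workload standing in for the solver's hot loop (the real solver is not shown here):

```java
public class GcBeforeBenchmark {
    // Stand-in for the solver's hot loop.
    static long work() {
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) sum += i;
        return sum;
    }

    public static void main(String[] args) {
        // Hint to the VM that now is a good moment to collect garbage,
        // so a GC pause is less likely to land inside the timed section.
        // Note: the VM is free to ignore this hint entirely.
        System.gc();

        long start = System.nanoTime();
        long result = work();
        long elapsedNs = System.nanoTime() - start;
        System.out.println("result=" + result + " elapsedNs=" + elapsedNs);
    }
}
```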

Little Bobby Tables
Note that this is only a hint, which the GC may freely disregard.
Péter Török
Calling System.gc() may also totally wreck your performance. At least in the current Sun JVM it triggers a full GC, and done often enough per second it will spend more time in the GC than anywhere else (and since the OP said a run takes about 2.5 ms, this would be such a case).
Durandal
+2  A: 

If you can have a "warmup" game to get the JIT to compile the game classes, you can add the options

-XX:CompileThreshold=20

to be sure that most hot spots are compiled on your first run. Subsequent runs will then need little or no additional compilation.
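A warmup can be as simple as running the hot path in a throwaway loop before the timed run. The solve() below is only a placeholder for the real solver:

```java
public class Warmup {
    // Placeholder for the contest's real solve() method.
    static int solve(int seed) {
        int x = seed;
        for (int i = 0; i < 1000; i++) x += i;
        return x;
    }

    public static void main(String[] args) {
        // Call the hot path enough times to cross the compile threshold
        // (e.g. with -XX:CompileThreshold=20), so the timed run below
        // executes already-compiled code.
        for (int i = 0; i < 100; i++) solve(i);

        long start = System.nanoTime();
        int result = solve(42);
        long elapsedNs = System.nanoTime() - start;
        System.out.println("warm run: " + elapsedNs + " ns, result=" + result);
    }
}
```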

See

mdma
Interesting link. Thanks!
Subb
Interesting behavior too. This actually increased the timing when running the solver once, since it takes time to compile those methods. I found a sweet spot around 500, which decreases the time by about 0.5 ms, but this is probably specific to this exact puzzle.
Subb
A: 

The JVM works with a JIT (Just-In-Time) compiler: it compiles Java bytecode to native machine code on the fly. It does a lot of sophisticated analysis to determine when and what to compile to native code. One of the things it looks at is code that runs more than once: if you run some piece of code 10,000 times, as you're doing, the JVM most likely decides that it's advantageous to compile it to native code, and then runs that native code every time the code is executed again.

As far as I know, there is no way you can control when the JVM does this. (edit: might not be true, as mdma points out in his answer - there are some advanced -XX options).

Sun has two versions of the JVM: the client JVM, which is tuned to start up fast but do less aggressive optimizations, and the server JVM, which has a slower startup time but which does more aggressive optimizations.

You could try running your program with the server JVM:

java -server com.mypackage.MyProgram

Note that, as far as I know, only the 64-bit version of Sun Java for Windows includes the server JVM.

Jesper
+3  A: 

There are a bunch of micro-optimizations that you can apply to your code. This JavaWorld article gives an overview. More ideas can be found on this page.

I wouldn't expect any large gains from them, though. If you can't optimize your algorithms any more, try to optimize your data structures for easy access and to exploit cache locality.

Durandal
Most of these optimization links are very out of date. The JavaWorld article is from 1997 - it's almost irrelevant given the present state of java static and dynamic compilation.
mdma
Yes, some are out of date. However, general optimizations like loop-invariant code motion will never be outdated, especially since Java relies on the JIT, which is more constrained in how much code analysis is practical compared to static compilers.
Durandal
A: 

You should inspect the bytecode generated by the compiler and check whether there are areas where it isn't able to optimize as well as it should. Otherwise it will be a hit-and-miss operation.

I do recommend double-checking your algorithm before searching too intensively for other optimizations ;)

ponzao
A: 

I know it's not what you're looking for, but I would dispute the methodology of the contest (assuming that's what it is). All major benchmarks (SPEC, etc..) clearly factor in a warmup period (often several minutes) to ensure that the workload itself is being measured, rather than temporary effects like JIT compilation, etc...

Also, note that something that runs in 2.5 ms is meaningless in a benchmark - many JVMs on OSes like Windows don't measure time that well using things like System.currentTimeMillis(). You're going to need to repeat it many times to be meaningful... I would suggest a total run in the 30 s to 60 s range to become reasonably statistically relevant.

As to the question as asked - you could go down that path, but it's simply not worth it. FYI your biggest wins will be inlining, devirtualization, and escape analysis. All remove significant path length and potentially memory bandwidth challenges.
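A common pattern is to time a large batch and report the average, rather than trusting a single 2.5 ms measurement. The workload below is only a stand-in for the solver:

```java
public class BatchTimer {
    // Stand-in workload; substitute the real solve() call here.
    static long work() {
        long sum = 0;
        for (int i = 0; i < 100_000; i++) sum += i;
        return sum;
    }

    public static void main(String[] args) {
        final int runs = 10_000;
        long checksum = 0;

        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            checksum += work(); // keep the result live so the JIT can't drop the loop
        }
        long totalNs = System.nanoTime() - start;

        // Report the average per run in microseconds.
        System.out.printf("avg per run: %.3f us (checksum=%d)%n",
                totalNs / (double) runs / 1_000.0, checksum);
    }
}
```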

Trent Gray-Donald
I'm using System.nanoTime(). Is it better?
Subb
Yes, nanoTime() is certainly better, but the point remains: 2.5 ms is too short. Imagine for a moment you get an ill-timed context switch on the box that preempts the worker thread... your numbers will be way off and not statistically stable. Or on a JVM that does async JIT compilation (e.g. the IBM JVM), you may end up with a race where the JIT sometimes does and sometimes does not deliver the perf boost in time...
Trent Gray-Donald
A: 

Well, some tricks you could try are:

1) Strength reduction, e.g. instead of dividing by 2, you can use a right shift

int i = 10 / 2;  // division by 2
i     = 10 >> 1; // same result (for non-negative values; the two differ for negative odd numbers)

2) Use primitive types instead of reference types, e.g. int over Integer

3) Prefer arrays to Collections

4) Use compound assignment operators, e.g. i += 2 instead of i = i + 2

5) Count loops down instead of up, e.g.

for (int i = 10; i >= 0; i--) {
    System.out.println(i);
}

(only useful if you can compare against zero)

6) Use the final keyword on instance variables if possible
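Tips 2, 3, 4 and 5 combined in one small sketch (purely illustrative; measure before assuming any of these help on a modern JIT):

```java
public class MicroOpts {
    static int sumWithMicroOpts() {
        int[] values = new int[10];     // tip 3: int[] instead of a List<Integer>

        // Tip 5: count down so the loop test compares against zero.
        for (int i = values.length - 1; i >= 0; i--) {
            values[i] = i;
        }

        int sum = 0;                    // tip 2: primitive int, not boxed Integer
        for (int v : values) {
            sum += v;                   // tip 4: compound assignment
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumWithMicroOpts()); // prints 45 (0 + 1 + ... + 9)
    }
}
```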

Helper Method
+1  A: 

tl;dr: no, most optimizations done by the JVM can't be expressed in Java byte code.

By its very nature, Java byte code defines your code at a very high level.

That is by design and is one of the reasons why the JVM can do all those optimizations: byte code describes the operations at a relatively high level and leaves the actual execution details to the JVM.

Due to that fact most optimizations can't be expressed in byte code itself.

For example, the JVM spec specifies that every array access that exceeds the bounds of the array must throw an ArrayIndexOutOfBoundsException (see VM Spec 2.5.14).

However, a smart JVM can optimize away many of those checks if it knows that they already happened (for example, when iterating over an array, the JVM can execute the bounds check once before the loop and skip it during each iteration).

This optimization simply can't be expressed with byte code, because the bounds-check is implicit and can't be avoided in the byte-codes representing array access.
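To illustrate: the loop below compiles to byte code carrying an implicit bounds check on every `a[i]` (in the `iaload` instruction); whether the per-iteration check is eliminated is entirely up to the JIT, and nothing in the source or byte code lets you drop it yourself. The mandated exception, by contrast, is always observable:

```java
public class BoundsCheck {
    static int sum(int[] a) {
        int sum = 0;
        // Each a[i] access carries an implicit bounds check in byte code.
        // A JIT that proves 0 <= i < a.length may hoist or drop it.
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(sum(data)); // prints 10

        // The spec-mandated check in action: out-of-range access always throws.
        try {
            System.out.println(data[4]);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught: index 4 is out of bounds");
        }
    }
}
```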

Joachim Sauer
Thanks. That actually answers the original question :)
Subb