From what I have read, javac (usually) seems to compile Java to not very optimised (if optimised at all) bytecode, leaving it to the JIT to optimise. Is this true? And if it is, has there been any exploration (possibly in alternative implementations) of getting the compiler to optimise the code so that the JIT has less work to do (is this possible)?
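
For illustration (just a rough sketch of my understanding, nothing official): as far as I can tell javac folds compile-time constants, but otherwise emits fairly literal bytecode and leaves the real optimisation work to the JIT.

    public class Example {
        // javac folds this compile-time constant to 8192 in the emitted bytecode
        static final int SIZE = 8 * 1024;

        static long sumOfSquares(int n) {
            // javac emits this loop essentially as written; unrolling,
            // strength reduction, inlining etc. are left to the JIT at runtime
            long sum = 0;
            for (int i = 0; i < n; i++) {
                sum += (long) i * i;
            }
            return sum;
        }

        public static void main(String[] args) {
            System.out.println(sumOfSquares(SIZE));
        }
    }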

Also, many people seem to have a dislike for native code generation (sometimes referred to as ahead-of-time compilation) for Java (and many other high-level memory-managed languages), for many reasons such as loss of portability (etc.), but also partially because (at least for languages that have a just-in-time compiler) the thinking goes that ahead-of-time compilation to machine code will miss the optimisations a JIT compiler can perform, and may therefore be slower in the long run.

This leads me to wonder whether anyone has ever tried to implement http://en.wikipedia.org/wiki/Profile-guided_optimization (compiling to a binary plus some instrumentation, then running the program and analysing the runtime information from the test run to generate a hopefully better-optimised binary for real-world usage) for Java or other memory-managed languages, and how this would compare to JIT-compiled code? Anyone have a clue?

+4  A: 

Personally, I think the big difference is not between JIT compiling and AOT compiling, but between class-compilation and whole-program optimization.

When you run javac, it only looks at a single .java file, compiling it into a single .class file. All the interface implementations and virtual methods and overrides are checked for validity but left unresolved (because it's impossible to know the true method invocation targets without analyzing the whole program).

The JVM uses "runtime loading and linking" to assemble all of your classes into a coherent program (and any class in your program can invoke specialized behavior to change the default loading/linking behavior).

But then, at runtime, the JVM can eliminate the vast majority of virtual method dispatches. It can inline all of your getters and setters, turning them into raw field accesses. And once those accesses are inlined, it can perform constant propagation to further optimize the code. (At runtime, there's no such thing as a private field.) And if there's only one thread running, the JVM can eliminate all synchronization primitives.
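
A small hypothetical sketch (plain Java, made-up names) of the kind of code this lets the JVM collapse:

    interface Shape {
        double area();
    }

    final class Circle implements Shape {
        private final double radius;

        Circle(double radius) { this.radius = radius; }

        double getRadius() { return radius; }   // trivial getter

        public double area() {
            // After inlining getRadius(), this is just a read of the private
            // field plus two multiplications.
            double r = getRadius();
            return Math.PI * r * r;
        }
    }

    class AreaDemo {
        static double totalArea(Shape[] shapes) {
            double total = 0;
            for (Shape s : shapes) {
                // If Circle is the only Shape ever loaded, the JVM can
                // devirtualize s.area(), inline it, and propagate constants
                // through it; the "virtual" dispatch effectively disappears.
                total += s.area();
            }
            return total;
        }
    }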

To make a long story short, there are a lot of optimizations that aren't possible without analyzing the whole program, and the best time for doing whole program analysis is at runtime.

benjismith
Agreed. Note that some AOT compilers (gcc is the only one I know of, but there may be others) do have this feature and call it "link-time optimization". It's relatively new, though, and it's still tied to guessing which functions will be called most often. For that, though, there's "profile-guided optimization", which the OP mentioned.
delnan
+1  A: 

The official Java HotSpot compiler does "adaptive optimisation" at runtime, which is essentially the same as the profile-guided optimisation you mentioned. This has been a feature of at least this particular Java implementation for a long time.
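
As a rough illustration (just a sketch, using HotSpot's -XX:+PrintCompilation diagnostic flag), you can watch this adaptive compilation kick in on a hot method:

    // Run with: java -XX:+PrintCompilation HotLoop
    // Once work() has run enough iterations, the compilation log shows it
    // being JIT-compiled (and possibly recompiled at a higher tier later).
    public class HotLoop {
        static long work(long n) {
            long acc = 0;
            for (long i = 0; i < n; i++) {
                acc += i ^ (i << 1);
            }
            return acc;
        }

        public static void main(String[] args) {
            long total = 0;
            for (int i = 0; i < 10_000; i++) {
                total += work(100_000);   // becomes "hot" and gets compiled
            }
            System.out.println(total);
        }
    }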

The trade-off of performing more static analysis or optimisation passes up-front at compile time is essentially the (ever-diminishing) returns you get from this extra effort versus the time it takes for the compiler to run. A compiler like MLton (for Standard ML) is a whole-program optimising compiler with a lot of static checks. It produces very good code, but becomes very, very slow on medium-to-large programs, even on a fast system.

So the Java approach seems to be to use JIT and adaptive optimisation as much as possible, with the initial compilation pass just producing an acceptable, valid binary. The absolute opposite end is an approach like that of MLKit, which does a lot of static inference of regions and memory behaviour.

Gian
+1  A: 

Profile-guided optimization has some caveats, one of them mentioned even in the Wiki article you linked. Its results are valid

  • for the given samples, representing how your code is actually used by the user or other code.
  • for the given platform (CPU, memory + other hardware, OS, whatever).
    From the performance point of view there are quite big differences even among platforms that are usually considered (more or less) the same (e.g. compare a single core, old Athlon with 512M with a 6 core Intel with 8G, running on Linux, but with very different kernel versions).
  • for the given JVM and its config.

If any of these change, then your profiling results (and the optimizations based on them) are not necessarily valid any more. Most likely some of the optimizations will still have a beneficial effect, but some of them may turn out to be suboptimal (or may even degrade performance).

As was mentioned, JIT JVMs do something very similar to profiling, but they do it on the fly. It's also called 'hotspot', because it constantly monitors the executed code, looks for hot spots that are executed frequently, and tries to optimize only those parts. At this point it is able to exploit more knowledge about the code (knowing its context, how it is used by other classes, etc.), so, as mentioned by you and the other answers, it can do better optimizations than a static compiler. It will continue monitoring and, if needed, will do another round of optimization later, this time trying even harder (looking for more, and more expensive, optimizations).
Working on real-life data (usage statistics + platform + config), it can avoid the caveats mentioned before.
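
A hypothetical sketch of why this on-the-fly approach can adapt where a fixed, ahead-of-time profile cannot (all class names here are made up):

    // While only FastImpl has been loaded, the JIT can treat doWork() as
    // effectively monomorphic and inline it. If SlowImpl shows up later,
    // HotSpot can deoptimise and re-optimise for the new call-site profile,
    // which a profile collected once, up front, cannot do.
    interface Worker {
        int doWork(int x);
    }

    class FastImpl implements Worker {
        public int doWork(int x) { return x + 1; }
    }

    class SlowImpl implements Worker {
        public int doWork(int x) { return x * x + 1; }
    }

    public class AdaptiveDemo {
        static int run(Worker w, int rounds) {
            int acc = 0;
            for (int i = 0; i < rounds; i++) {
                acc += w.doWork(i);     // call-site profile gathered here at runtime
            }
            return acc;
        }

        public static void main(String[] args) {
            System.out.println(run(new FastImpl(), 1_000_000));  // monomorphic phase
            System.out.println(run(new SlowImpl(), 1_000_000));  // triggers re-profiling
        }
    }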

The price of this is some additional time spent on "profiling" + JIT-ing. Most of the time it's spent quite well.

I guess a profile-guided optimizer could still compete with it (or even beat it), but only in some special cases, if you can avoid the caveats:

  • you are quite sure that your samples represent the real life scenario well and they won't change too much during execution.
  • you know your target platform quite precisely and can do the profiling on it.
  • and of course you know/control the JVM and its config.

This will happen rarely, and I guess in general the JIT will give you better results, but I have no evidence for it.

Another possibility for getting value from profile-guided optimization is if you target a JVM that can't do JIT optimization (I think most small devices have such a JVM).

BTW, one disadvantage mentioned in other answers would be quite easy to avoid: if static/profile-guided optimization is slow (which is probably the case), then do it only for releases (or RCs going to testers) or during nightly builds (where time does not matter so much).
I think the much bigger problem would be having good sample test cases. Creating and maintaining them is usually not easy and takes a lot of time, especially if you want to be able to execute them automatically, which would be quite essential in this case.

Sandor Murakozi