Problem: I have a method that compiles to over 8000 bytes of Java bytecode. HotSpot has a magic limit that makes the JIT not kick in for methods that exceed 8000 bytes. (Yes, it is reasonable to have a huge method. This is a tokenizer loop.) The method is in a library and I don't want to require users of the library to have to configure HotSpot to deactivate the magic limit.
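(For reference, the relevant HotSpot switch is -XX:-DontCompileHugeMethods, which disables the huge-method cutoff. Asking users to launch with something like the following is exactly what I want to avoid; the jar name is just a placeholder:)

java -XX:-DontCompileHugeMethods -jar myapp.jar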

Observation: Decompiling the bytecode shows that Eclipse Java Compiler generates a lot of pointless gotos. (javac is even worse.) That is, there are gotos that are only reachable from jumps. Obviously, the jump that jumps to the goto should instead jump directly where the goto jumps and the goto should be eliminated.

Question: Is there a bytecode optimizer for Java 5 class files that flattens pointless jump chains and then removes unnecessary gotos?

Edit: I mean patterns like:

8698:   goto 8548
8701:   goto 0

Obviously, the second goto can only be reached by a jump to 8701 which might as well be a direct jump to 0.

On a second investigation, this questionable pattern is more common:

4257:   if_icmpne 4263
4260:   goto 8704
4263:   aload_0

Here, obviously, one would like the compiler to reverse the "not equal" comparison to an "equal" comparison, jump straight to 8704, and eliminate the goto, i.e.:
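4257:   if_icmpeq 8704
4260:   aload_0

(A sketch; the later offsets would of course shift once the three-byte goto is removed.)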

A: 

One method compiling to over 8000 bytes? Does anybody understand that code? Is it testable? Try to split it up into multiple (private?) methods with meaningful names instead of hassling with the optimizer!

OK, maybe there are legitimate cases for large methods. But sorry, there are no hints of that in the question.

Arne Burmeister
Sounds like he's writing a parser -- it is very reasonable for a tokenizer loop to get big, and it's actually *more* readable to keep it together. (I've had folks force me to break up such a method just because it was over n lines, and the next month they complained because it was harder to follow...)
Scott Stanchfield
Yes, I understand the code. It also maps more closely to spec than any reformulation as recursive descent methods. Yes, it is testable with a large set of unit tests.
hsivonen
+1  A: 

I feel your pain. I had to write a parser once that had around 5 kloc of if(str.equals(...)) code. I broke it into several methods along the lines of parse1, parse2, etc. If parse1 didn't result in a parsed answer, parse2 was called, and so on. This isn't necessarily best practice, but it does do what you need it to. A sketch of the idea follows.
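(A minimal sketch of that split, with hypothetical keyword strings and a String result; each parseN returns null when it doesn't recognize the input, so the next stage gets a chance:)

class SplitParser {
    String parse(String str) {
        String result = parse1(str);
        if (result == null) {
            result = parse2(str); // parse1 passed; try the next stage
        }
        return result;
    }

    private String parse1(String str) {
        if (str.equals("if")) return "KEYWORD_IF";
        if (str.equals("else")) return "KEYWORD_ELSE";
        return null; // not handled here
    }

    private String parse2(String str) {
        if (str.equals("while")) return "KEYWORD_WHILE";
        return null; // unrecognized
    }
}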

KitsuneYMG
btw: in case you didn't happen to know, you've implemented the "Chain of Responsibility" pattern with the parse1, parse2 etc approach (not that that's a bad or good thing for this problem; just wanted to mention it...)
Scott Stanchfield
I used to have multiple methods instead of one huge switch in a loop. However, the huge switch structure fits the spec better and is faster when the JIT does kick in.
hsivonen
Well, you could always do a smaller switch() whose default case calls another method with a small switch, and so on until all of your large switch's cases are covered. This wouldn't be as nice as one large switch table, but it would work; see the sketch below. In my case I couldn't use switch since it doesn't work on Strings.
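(A minimal sketch of the cascade, with hypothetical case numbers and return values:)

class CascadedSwitch {
    int dispatch(int token) {
        switch (token) {
            case 0: return 10; // first slice of cases
            case 1: return 11;
            default: return dispatch2(token); // defer the rest
        }
    }

    int dispatch2(int token) {
        switch (token) {
            case 2: return 12; // next slice of cases
            default: throw new IllegalArgumentException("unknown token: " + token);
        }
    }
}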
KitsuneYMG
Yeah, I've considered having two different switch loops and shipping locals between the two. That's ugly though, and I want to keep the overall structure such that it's easy to mechanically compile the whole thing into gotoful C++ with even less switch. (Yes, I have unusual requirements. :-)
hsivonen
A: 

Does it make a difference if you don't compile with debug symbols (i.e. omit javac's -g flag, or pass -g:none)? That might bring the method down below the magic limit.
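(For example, using javac's standard -g:none option, which suppresses all debug info; plain javac without -g still emits line numbers and the source file name:)

javac -g:none Tokenizer.java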

skaffman
At least in the Eclipse Java Compiler this has no effect on the number of bytecodes in the compiled method. (I assume the debug table is separate.)
hsivonen
A: 

Would it be impossible to refactor the method into submethods? Modern JITs inline those calls anyway.

Thorbjørn Ravn Andersen
If you've not hand-written a parser before: sometimes the scanner (lexer/tokenizer) can get big, and readability can go way down when you split it up. Of course I haven't seen his code, but I've been there...
Scott Stanchfield
I've already split the actual tokenizer actions that can be split out without affecting control flow. The control structure itself is huge.
hsivonen
@scott, I've hand-written a parser before. If it is too complex (for any particular reason) it might be a reasonable time to consider a suitable parser generator.
Thorbjørn Ravn Andersen
@hsivonen, well, I believe it is time you show us some code then :)
Thorbjørn Ravn Andersen
@Thorbjørn, Sure - most of the time I prefer a parser generator; sometimes the parser grammar is very trivial and it's all about lexing, which often is very straightforward but can grow long quickly
Scott Stanchfield
@Thorbjørn Current rev in the middle of a refactoring: http://hsivonen.iki.fi/Tokenizer.java There used to be a per-code-unit read() method. I wanted to inline it manually in order to hoist crucial state variables from fields to locals. As far as I can tell, the only substantial piece eligible for extraction is the stuff in CHARACTER_REFERENCE_LOOP.
hsivonen
Based on the snippet, I can see why you have problems with this. It is basically a maze of gotos, all alike, with too many for(;;)'s and fall-throughs for my taste. I would suggest you rewrite it so there is no "break foo" or "continue foo", which will most likely require boolean flags (sketched below), which in turn go into the various for loops. When you are there, you should be able to break up the logic easily.
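(A minimal before/after sketch of that label-to-flag rewrite; the loop bounds and the work()/done() helpers are hypothetical:)

class FlagRewrite {
    // Before: the labeled continue ties the inner loop to the outer one.
    void labeled() {
        outer: for (int i = 0; i < 10; i++) {
            for (int j = 0; j < 10; j++) {
                if (i == j) continue outer; // skip the rest of the outer body
                work(i, j);
            }
            done(i);
        }
    }

    // After: a flag replaces the label, so the inner loop could be extracted.
    void flagged() {
        for (int i = 0; i < 10; i++) {
            boolean skipRest = false;
            for (int j = 0; j < 10; j++) {
                if (i == j) { skipRest = true; break; }
                work(i, j);
            }
            if (skipRest) continue;
            done(i);
        }
    }

    void work(int i, int j) {}
    void done(int i) {}
}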
Thorbjørn Ravn Andersen
A: 

If it's a tokenizer loop, would it be better to do it with a data-driven set of mappings and a bit of reflection as appropriate?

So you would store your token matches in a structure that maps them to data about that token's syntax and to methods implementing the associated functions. Lookup can be optimised over the structure and you avoid the big loop (see the sketch below).
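(A minimal sketch of the lookup structure, assuming a hypothetical TokenHandler callback type; the reflection and code-generation parts are left out:)

import java.util.HashMap;
import java.util.Map;

class TokenTable {
    interface TokenHandler {
        void handle(String token);
    }

    private final Map<String, TokenHandler> handlers =
            new HashMap<String, TokenHandler>();

    void register(String token, TokenHandler handler) {
        handlers.put(token, handler);
    }

    // One map lookup replaces the big hard-coded match sequence.
    boolean dispatch(String token) {
        TokenHandler handler = handlers.get(token);
        if (handler == null) {
            return false; // unknown token
        }
        handler.handle(token);
        return true;
    }
}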

That introduces the issue of keeping the data and the implementation in sync, but you could generate the data from your codebase with a doclet or possibly annotations.

Without knowing exactly what your big method does, we're limited to trying to optimise it the way that you're assuming is best (and which is apparently not possible anyway).

AndyT
No, ideally you want the state transitions to compile down to jumps in code—not data structure lookups.
hsivonen
Surely that rather depends on whether you can optimise the lookup - both through efficient searching of the match space and through caching of the results. A large, hard-coded series of token matches might be quite resistant to optimisation (which I presume is why you're looking for JIT help). In a previous project I used cached method references with quite good results. Changes in the 'source' only needed to invalidate the reference to prompt a fresh lookup, and normal execution could run at good speed, firing off the method calls without any performance hit from token matching.
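(A rough sketch of that caching scheme, with a hypothetical handler-method naming convention; the reflective lookup happens only on a cache miss or after invalidation:)

import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

class MethodCache {
    private final Map<String, Method> cache = new HashMap<String, Method>();
    private final Object target;

    MethodCache(Object target) {
        this.target = target;
    }

    Object invoke(String token) throws Exception {
        Method m = cache.get(token);
        if (m == null) {
            // Expensive reflective lookup only on a miss.
            m = target.getClass().getMethod("handle" + token);
            cache.put(token, m);
        }
        return m.invoke(target);
    }

    void invalidate(String token) {
        cache.remove(token); // prompts a fresh lookup next time
    }
}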
AndyT
A: 

Does your performance increase if you run a bytecode shrinker/obfuscator on your class? E.g. yGuard, ProGuard, ...

Maybe you can write a class-file postprocessor using ASM, because your use case is so specific; a sketch follows.
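(A minimal sketch of such a pass over ASM's tree API, assuming ASM's org.objectweb.asm.tree classes; a real postprocessor would also need the conditional-inversion rewrite and a dead-code pass to actually drop the orphaned gotos:)

import org.objectweb.asm.Opcodes;
import org.objectweb.asm.tree.AbstractInsnNode;
import org.objectweb.asm.tree.JumpInsnNode;
import org.objectweb.asm.tree.LabelNode;
import org.objectweb.asm.tree.MethodNode;

class JumpChainFlattener {
    // Redirect every jump whose target is an unconditional GOTO straight
    // to that GOTO's own target, repeating until nothing changes.
    static void flatten(MethodNode method) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (AbstractInsnNode insn = method.instructions.getFirst();
                    insn != null; insn = insn.getNext()) {
                if (!(insn instanceof JumpInsnNode)) {
                    continue;
                }
                JumpInsnNode jump = (JumpInsnNode) insn;
                // Skip labels, line numbers, and frames (pseudo-instructions
                // report opcode -1) to find the real instruction at the target.
                AbstractInsnNode target = jump.label;
                while (target != null && target.getOpcode() < 0) {
                    target = target.getNext();
                }
                if (target instanceof JumpInsnNode
                        && target.getOpcode() == Opcodes.GOTO) {
                    LabelNode finalTarget = ((JumpInsnNode) target).label;
                    if (jump.label != finalTarget) {
                        jump.label = finalTarget; // bypass the intermediate goto
                        changed = true;
                    }
                }
            }
        }
    }
}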

Even if you remove all the pointless gotos, does that bring you under the magic limit?

Ron
A: 

A list of bytecode libraries mentions BCEL and ASM, which I'd heard of before, along with many others doing various things.

Stephen Denne