I'm looking for a fail-safe way to round-trip between a JVM class file and a text representation and back again.

One strict requirement is that the resulting round-tripped JVM class file is exactly functionally equivalent to the original JVM class file as long as the text representation is left unchanged.

Furthermore, the text representation must be human-readable and editable. It should be possible to make small changes to the text representation (such as changing a text string or a class name) and have them reflected in the resulting class file.

The simplest solution would be to use a Java decompiler such as JAD to generate the text representation (which in this case would simply be the re-created Java source code) and then use javac to generate the byte-code again. However, given the state of the free Java decompilers, this approach does not work under all circumstances. It is rather easy to create obfuscated byte-code that does not survive a full class-file/java-source/class-file round-trip, in part because there simply isn't a 1:1 mapping between JVM byte-code and Java source code.

Is there a fail-safe way to achieve JVM class-file/text-representation/class-file round-tripping given the requirements above?

Update: Before answering - save time and effort by reading all the requirements above, and note specifically:

  • "Text-representation of JVM bytecode" does not necessarily mean "Java source-code".
A: 

Maybe I'm being overly simplistic, but have you considered using JAXB (or something along those lines) and just serializing all your classes to XML?

It's not going to work for all situations, but from your question it seems like this would be one feasible approach.

dovholuk
I don't think JAXB operates on bytecode.
knorv
Well, I suppose I've discovered one problem with stackoverflow... getting downvoted even though the answer supplied was entirely valid per the original post. Take a look at the original: http://stackoverflow.com/revisions/776058c1-6337-48b3-9448-891d0efdc2a0/view-source. NOWHERE in there was "JVM Bytecode" referenced. The original question asked for "round-trip between a Java class file and a text representation and back again." Also note that, given the original post, JAXB (depending on how one writes the class file) would create a "functionally equivalent" object.
dovholuk
"JVM bytecode" equals "Java class file", right? The edit was meant to clarify, not to change the question.
knorv
I think he would have said "source code" and not "text representation" if he meant source code.
erikkallen
A: 

I'm not sure if this relates to your actual requirement, but if you wish to make readable changes in source code, you must have the source code available, since most decompilers will fail in one way or another (considering Java 6 compatible classes, not Java 1.4).

You could take a look at the BeanShell interpreter for Java, since that would allow you to make changes in the source code, without worrying about the byte code decompilation step. You could "evaluate" your source at runtime, after suitable changes have been done in the text representation. It is true that this will come at a performance cost.

Vineet Reynolds
As pointed out in the question "text-representation of bytecode" is not necessarily "Java source-code".
knorv
Well, I didn't realize that since it wasn't very obvious. What is the format/structure of text representation? Is it defined and available for anyone to view?
Vineet Reynolds
The specific format/structure does not matter as long as it is human readable/editable and can be converted into byte-code again.
knorv
@knorv: That is just a very roundabout way of obtaining the source code from byte code (whether it is actual Java source or some other textual representation). The requirement of `human readable/editable` makes it so. If you didn't have that requirement, it'd be an easy problem to solve.
Chii
@Chii: The round-tripping part is the most important part of the requirement. It doesn't matter if the text representation is a bit messy.
knorv
@Chii: "Round-tripping part" is in this context defined as: being able to convert byte-code to text, edit the text and convert from text to byte-code again.
knorv
A: 

Jasmin and Kimera?

Tom Hawtin - tackline
Can Kimera be used to create a .j (Jasmin file) representation of a class file? Please elaborate.
knorv
A: 

No. There exists valid byte-code without a corresponding Java program.

The Soot project has a quite sophisticated decompiler (http://www.sable.mcgill.ca/dava/) which may be useful for byte code coming from a Java compiler. It is, however, not perfect.

Your best bet is still getting the source code for the class files.

Thorbjørn Ravn Andersen
As pointed out in the question "text-representation of bytecode" is not necessarily "Java source-code".
knorv
Then you need a java byte code disassembler/assembler... Java byte code is not much fun handcrafting.
Thorbjørn Ravn Andersen
Feel free to suggest a Java byte code disassembler/assembler which fulfills the stated requirements.
knorv
+1  A: 

This related question might show what you're looking for.

erikkallen
+4  A: 

The BCEL project provides a JasminVisitor which will convert class files into jasmin assembly.

This can be modified and then reassembled into class files. If no edits are made and the versions are kept compatible, the round trip should result in identical class files, except that line number mapping may be lost. If you require a bit-for-bit identical copy for the round trip case, you will likely need to alter the tool to carry over the aspects of the code which are pure metadata as well.
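
To check the "no edits" case yourself, something like the following could be used. It is only a minimal sketch, assuming the original and the reassembled class file are both on disk; the names `RoundTripCheck` and `firstDifference` are made up for illustration, and only JDK calls (Java 9 or later) are used, not BCEL or Jasmin. It reports the first byte offset at which the two files differ, so it tells you whether the round trip was bit-for-bit identical. It cannot tell you whether two differing files are still functionally equivalent.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of BCEL or Jasmin.
class RoundTripCheck {

    /**
     * Returns -1 if both arrays hold exactly the same bytes, otherwise the
     * index of the first differing byte. If one array is a prefix of the
     * other, the length of the shorter array is returned.
     */
    static int firstDifference(byte[] original, byte[] roundTripped) {
        return Arrays.mismatch(original, roundTripped);
    }

    // Usage (paths are supplied by the caller):
    //   java RoundTripCheck.java Original.class Reassembled.class
    public static void main(String[] args) throws Exception {
        byte[] original = Files.readAllBytes(Paths.get(args[0]));
        byte[] roundTripped = Files.readAllBytes(Paths.get(args[1]));

        int offset = firstDifference(original, roundTripped);
        if (offset < 0) {
            System.out.println("Class files are bit-for-bit identical.");
        } else {
            System.out.println("Class files first differ at byte offset " + offset);
        }
    }
}
```

A difference does not necessarily mean the round trip is broken: lost line number tables or a reordered constant pool change the bytes without changing behaviour.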

Jasmin is rather old and was not designed to make writing full-blown programs in assembly easy, but for modifying string constant tables and constants it should be more than adequate.
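
To get a feel for which strings are available for that kind of edit, here is a rough sketch that lists the `CONSTANT_Utf8` entries of a class file's constant pool. It assumes a well-formed class file and relies only on the JDK; the names `ConstantPoolStrings` and `utf8Constants` are hypothetical and have nothing to do with the BCEL or Jasmin APIs. Note that the Utf8 entries include class names, method names and descriptors as well as the contents of string literals.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BCEL or Jasmin.
class ConstantPoolStrings {

    /** Returns every CONSTANT_Utf8 entry of the constant pool, in pool order. */
    static List<String> utf8Constants(byte[] classBytes) {
        List<String> result = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("Not a class file (bad magic number)");
            }
            in.readUnsignedShort();             // minor_version
            in.readUnsignedShort();             // major_version
            int count = in.readUnsignedShort(); // constant_pool_count

            // Valid constant pool indices run from 1 to count - 1.
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1:  // Utf8: u2 length followed by modified UTF-8 bytes
                        result.add(in.readUTF());
                        break;
                    case 3: case 4:             // Integer, Float
                    case 9: case 10: case 11:   // Fieldref, Methodref, InterfaceMethodref
                    case 12: case 17: case 18:  // NameAndType, Dynamic, InvokeDynamic
                        in.readFully(new byte[4]);
                        break;
                    case 5: case 6:             // Long, Double occupy two pool slots
                        in.readFully(new byte[8]);
                        i++;
                        break;
                    case 7: case 8: case 16:    // Class, String, MethodType
                    case 19: case 20:           // Module, Package
                        in.readFully(new byte[2]);
                        break;
                    case 15:                    // MethodHandle
                        in.readFully(new byte[3]);
                        break;
                    default:
                        throw new IllegalArgumentException("Unknown constant pool tag " + tag);
                }
            }
        } catch (IOException e) {
            throw new IllegalArgumentException("Truncated class file", e);
        }
        return result;
    }

    // Usage: java ConstantPoolStrings.java Some.class
    public static void main(String[] args) throws IOException {
        for (String s : utf8Constants(Files.readAllBytes(Paths.get(args[0])))) {
            System.out.println(s);
        }
    }
}
```

This only reads the pool. Editing a string in place would also mean rewriting the length prefix of the entry, which is the kind of bookkeeping an assembler such as Jasmin takes care of when it rebuilds the class file from text.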

ShuggyCoUk