Hello,

I am taking a compilers course at college, and we must generate code for a language of our own design, targeting any platform we like. I think the simplest targets are the Java JVM or the .NET CLR. Any suggestions on which one to choose, and which APIs can help me with this task? I have already finished the semantic analysis; I just need to generate code for a given program.

Thank you

A: 

In .NET you can use the System.Reflection.Emit namespace to generate MSIL code.

See the MSDN documentation: http://msdn.microsoft.com/en-us/library/3y322t50.aspx

scptre
+1  A: 

Another option that I came across is a library called RunSharp, which generates MSIL code at runtime using Reflection.Emit, but through a friendlier API that reads more like C#. The latest version of the library can be found here: http://code.google.com/p/runsharp/

scptre
+3  A: 

From what I know, at a high level the two VMs are actually quite similar: both are classic stack-based machines with largely high-level operations (e.g. virtual method dispatch is an opcode). That said, the CLR lets you get down to the metal if you want, as it has raw data pointers with arithmetic, raw function pointers, unions, etc. It also has proper tailcalls. So, if the implementation of your language needs any of the above (e.g. the Scheme spec mandates tailcalls), or if it would benefit significantly from those features, then you would probably want to go the CLR way.

The other advantage of the CLR is that you get a stock API for emitting bytecode, System.Reflection.Emit. Although it is somewhat limited for full-fledged compiler scenarios, it is generally enough for a simple compiler.

With the JVM, the two main advantages you get are better portability and the fact that the bytecode itself is arguably simpler (because it has fewer features).

Pavel Minaev
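To make the "classic stack-based machine" point in the answer above concrete, here is a minimal sketch in Java. It is not BCEL, ASM, or Reflection.Emit code: the class `StackCodegenSketch`, the `Expr` interface, and the helpers `num`, `bin`, `compile`, and `run` are all hypothetical names made up for illustration. It walks a tiny expression tree in post-order, emits JVM-style mnemonics as plain strings, and then evaluates them on a small operand stack, which is roughly the shape of code generation for either VM.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical toy code generator; all names here are illustrative only.
class StackCodegenSketch {

    // An expression node knows how to append its own instructions.
    interface Expr {
        void emit(List<String> out);
    }

    // Integer literal: push a constant onto the operand stack.
    static Expr num(int value) {
        return out -> out.add("ldc " + value);
    }

    // Binary operation: emit left operand, right operand, then the operator.
    static Expr bin(char op, Expr left, Expr right) {
        return out -> {
            left.emit(out);
            right.emit(out);
            switch (op) {
                case '+': out.add("iadd"); break;
                case '-': out.add("isub"); break;
                case '*': out.add("imul"); break;
                default: throw new IllegalArgumentException("unknown operator: " + op);
            }
        };
    }

    // Flatten an expression tree into a linear instruction list.
    static List<String> compile(Expr e) {
        List<String> code = new ArrayList<>();
        e.emit(code);
        return code;
    }

    // Tiny interpreter that mimics how a stack VM would execute the code.
    static int run(List<String> code) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String insn : code) {
            if (insn.startsWith("ldc ")) {
                stack.push(Integer.parseInt(insn.substring(4)));
                continue;
            }
            int right = stack.pop(); // the right operand was pushed last
            int left = stack.pop();
            switch (insn) {
                case "iadd": stack.push(left + right); break;
                case "isub": stack.push(left - right); break;
                case "imul": stack.push(left * right); break;
                default: throw new IllegalStateException("unknown instruction: " + insn);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // 2 + 3 * 4
        Expr e = bin('+', num(2), bin('*', num(3), num(4)));
        List<String> code = compile(e);
        System.out.println(code);      // [ldc 2, ldc 3, ldc 4, imul, iadd]
        System.out.println(run(code)); // 14
    }
}
```

In a real compiler the strings would be replaced by calls into a bytecode library, but the post-order traversal stays the same.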
Nice, maybe I'll stay with Java since the language is very simple; it only has a few basic operations. Any suggestions for a Java API that can help me, or will I have to learn the internals of bytecode to generate it 'manually'?
Pedro
Apache BCEL (http://jakarta.apache.org/bcel/) seems to be a popular bytecode manipulation library. ASM (http://asm.ow2.org/) also looks interesting.
Pavel Minaev
On the other hand, note that you do not need to use the full set of CLR operations either; the basic ones are mostly equivalent to those in the JVM. The CLR also has a standardized textual syntax for its MSIL bytecode (http://en.wikipedia.org/wiki/MSIL), so another option is to output MSIL text and then have `ilasm` assemble it; this may be somewhat easier to debug than outputting binary bytecode directly.
Pavel Minaev
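The "output MSIL text and feed it to `ilasm`" idea from the comment above can be sketched as follows. This is only an illustration under assumptions: the class `IlTextEmitterSketch` and its `emit` method are hypothetical, the input is assumed to be an already-flattened postfix list of integer literals and `+`, `-`, `*` operators, and `.maxstack 8` is a fixed guess rather than a computed value. The emitted directives and opcodes (`ldc.i4`, `add`, `sub`, `mul`, `call`, `ret`) are standard MSIL, but the exact assembly references you need may differ by toolchain.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical emitter that prints textual MSIL for a postfix expression.
class IlTextEmitterSketch {

    // Builds a complete .il program that evaluates the expression and prints it.
    static String emit(String assemblyName, List<String> postfix) {
        StringBuilder il = new StringBuilder();
        il.append(".assembly extern mscorlib {}\n");
        il.append(".assembly ").append(assemblyName).append(" {}\n");
        il.append(".method static void Main() cil managed\n");
        il.append("{\n");
        il.append("    .entrypoint\n");
        il.append("    .maxstack 8\n"); // assumption: fixed bound, not computed
        for (String tok : postfix) {
            switch (tok) {
                case "+": il.append("    add\n"); break;
                case "-": il.append("    sub\n"); break;
                case "*": il.append("    mul\n"); break;
                default:
                    // Anything else must be an integer literal.
                    il.append("    ldc.i4 ").append(Integer.parseInt(tok)).append("\n");
            }
        }
        // The result is on top of the stack; print it and return.
        il.append("    call void [mscorlib]System.Console::WriteLine(int32)\n");
        il.append("    ret\n");
        il.append("}\n");
        return il.toString();
    }

    public static void main(String[] args) {
        // Postfix form of 2 + 3 * 4
        System.out.print(emit("Demo", Arrays.asList("2", "3", "4", "*", "+")));
    }
}
```

Saving the printed text to a file such as `demo.il` and running `ilasm` on it is the intended next step; because the intermediate file is readable, mistakes in the generated code are easier to spot than in a binary class file or assembly.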
Thanks, Pavel, I'll try Apache BCEL; it seems to fit my needs. I always see your constructive comments on Eric Lippert's blog. Keep up the good work :)
Pedro