While examining aspell to figure out how to write my own spell checker in Java, I wondered how much work it would be to convert aspell to some dialect of C which is close enough to Java that it's possible to compile both a C and a Java version from the same source.

Of course, there is UML which promises that you can "model" your problem "once" and then "generate code for any language" but that a) usually doesn't include algorithms (only dependencies and relations) and b) the resulting code is ... well ... "obviously written by a computer" ahem

So I was wondering: Does anyone know of, or has anyone used, a "meta language" which can be compiled to both C++ and Java from the same code base? What did you use? Can you have a set of rules which allows you to write a simple parser that closes the final gap and makes the plain C++ code compile with Java? Or would you go the other way around?

Definition "transpiler": A program which reads code in language A and converts it into code for language B. The difference between a compiler and a transpiler is that the compiler usually converts from a high to a low level language (C -> Assembler, Java -> Bytecode) while the transpiler converts between languages at (roughly) the same level. Examples: (C++ -> C, Pascal -> C). Think StarTrek(TM) translators.
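To make the definition concrete, here is a minimal sketch of a same-level translation: one small expression tree is emitted as both a C function and a Java method. The node classes (`Num`, `Var`, `BinOp`) and the `emit_*` helpers are made up for this illustration and are not taken from any existing tool. It only covers integer expressions, which happen to be spelled the same way in both languages.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str          # one of "+", "-", "*"
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Var, BinOp]


def emit_expr(e):
    """Emit an expression; this subset has identical syntax in C and Java."""
    if isinstance(e, Num):
        return str(e.value)
    if isinstance(e, Var):
        return e.name
    if isinstance(e, BinOp):
        return f"({emit_expr(e.left)} {e.op} {emit_expr(e.right)})"
    raise TypeError(f"unknown node: {e!r}")


def emit_c(name, params, body):
    """Emit a free-standing C function returning int."""
    args = ", ".join(f"int {p}" for p in params)
    return f"int {name}({args}) {{ return {emit_expr(body)}; }}"


def emit_java(name, params, body):
    """Emit a static Java method (it still needs an enclosing class)."""
    args = ", ".join(f"int {p}" for p in params)
    return f"static int {name}({args}) {{ return {emit_expr(body)}; }}"


# a * a + 1
body = BinOp("+", BinOp("*", Var("a"), Var("a")), Num(1))
print(emit_c("f", ["a"], body))
print(emit_java("f", ["a"], body))
```

The two backends differ in a single keyword here. The gap widens quickly once strings, memory management or containers enter the picture.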

+3  A: 

There have been compilers that compile to other high-level languages for a long time. The one I used the most was f2c (I had Fortran code I needed to convert to C so I could make a shared library to be called from a Lisp program). One thing is true of all of them: you don't want to fiddle with the generated code. It's not generally designed to be human-readable.

If you had a "meta language" that could compile into either C++ or Java, you'd get lousy C++ and lousy Java. While the languages share something of a common subset, and some features that can be more or less mapped onto each other, many of the more advanced features work in greatly different ways, and the idioms are very different.

You might find a C-to-Java translator, in which case you'd maintain the C code and compile into Java. You could port the code yourself. Depending on what you're doing, you might get away with compiling the C separately and using JNI to access it from Java. You are extremely unlikely to find a compiler that will take C and compile it to idiomatic Java.

David Thornley
I'd be happy with something that looks a lot like C or Java and can produce C and Java.
Aaron Digulla
A: 

Many of these tools are *-to-C compilers. These are useful because basically every platform already has a C compiler, and C is easy to emit in a compiler. The well-known example is Cfront, the C++-to-C compiler.

Note that these tools are true compilers: they parse and compile the entire source. It is only in the output phase that they emit C instead of assembly.

MSalters
+2  A: 

I think what you are looking for is a source-to-source compiler. While most of these emit C code, perhaps you could tailor them to emit some really limited subset of C so that you could write something simple to translate that to Java.

The tools that immediately come to mind for doing this kind of compilation are:

Both of these have front-ends for a number of languages including C/C++ as well as binary and bytecode formats. They also both have C backends. Both do all the parsing and generate a full AST from whatever they read in, so maybe you can get the transformation you want from playing with the AST. Sadly, I'm hard-pressed to think of a compiler that has a Java source code backend.
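As a rough illustration of the "really limited subset of C plus something simple" idea, the sketch below rewrites a tiny C subset into Java with a handful of regular expressions. The rule table and the function name `c_subset_to_java` are invented for this example. It assumes function definitions start in column 0, only `int`, `void` and `double` return types, no pointers, no `main`, and no keywords hiding inside string literals. Anything beyond that needs a real parser and an AST, as described above.

```python
import re

# Hypothetical rewrite rules for a deliberately tiny C subset.
RULES = [
    # Java has no preprocessor, so drop include lines.
    (re.compile(r"^\s*#include\b.*$", re.MULTILINE), ""),
    # const locals become final locals.
    (re.compile(r"\bconst\b"), "final"),
    (re.compile(r"\bNULL\b"), "null"),
    # puts("literal") appends a newline, as does println.
    (re.compile(r'\bputs\(("[^"]*")\)'), r"System.out.println(\1)"),
    # Top-level functions become static methods.
    (re.compile(r"^(int|void|double)\s+(\w+)\s*\(", re.MULTILINE),
     r"static \1 \2("),
]


def c_subset_to_java(c_source, class_name):
    """Apply the rules in order and wrap the result in a Java class."""
    body = c_source
    for pattern, replacement in RULES:
        body = pattern.sub(replacement, body)
    lines = [line for line in body.splitlines() if line.strip()]
    indented = "\n".join("    " + line for line in lines)
    return f"class {class_name} {{\n{indented}\n}}\n"


C_SOURCE = """\
#include <stdio.h>

int square(int x) {
    const int result = x * x;
    return result;
}

void greet() {
    puts("hello");
}
"""

print(c_subset_to_java(C_SOURCE, "Demo"))
```

Textual rules like these break as soon as the input leaves the subset, which is why keeping the emitted C as restricted as possible matters so much for this approach.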

tgamblin
A: 

Hmm, your definition of a transpiler does not differ from that of a 'compiler' ;-) "translating from language A to language B".

Basically, high level languages differ because you program to a different 'machine model' abstraction. For example, Java is slightly higher level due to its virtual machine model, which is more 'virtual' (managed code...). The languages encourage different programming styles and have different purposes.

Of course you can add a library to C++ for garbage collection and other things and then translate Java to that, but then you do not really have idiomatic C++. What GCC (with GCJ) does is translate Java+libgcj and C++ directly into an intermediate representation (so that you don't lose too much efficiency by going through a C or C++ intermediate step). Often the intermediate step can be the C language, used as a portable assembler (but that has disadvantages; see the C-- discussions).

As you said, generated code is not what you want. Tools like ROSE will help, but come on, you won't do template metaprogramming with that ;-) In fact it will give you yet another abstract machine to program to, which will roughly be a subset of the languages you want to generate code for. There are other tools that make you program to models in different languages, for example in the field of component programming (like Fractal).

There are also compilers from higher level languages to higher level languages. Source-to-source compilers compile to high level languages: often from language A to language A, in which case they are used mainly for optimization. Other compilers generate code in high level languages for domain-specific cases: e.g. pyjamas generates JavaScript from Python code, and Brook generates C++ and GPU shader code from the Brook (streaming) language.

But none of those is what you want. Programming languages are different, and the only way people have found to unify them is by compiling them to a common machine model. That's the idea behind Microsoft's CLR, even more than behind the JVM, because the CLR is really broad: you can do unmanaged code, i.e. compile C++ to it (efficiently, not by treating the memory as an array of bytes). LLVM is quite similar, but its intermediate representation is not target agnostic.

Conclusion: One Virtual Machine to rule them all...

Piotr Lesnicki
+1  A: 

Probably not what you wanted, but NestedVM lets you take C/C++/Fortran (or anything else GCC will compile) and convert it to Java.
It actually does this by compiling your code to a MIPS target, and then running the resulting MIPS on a VM written in Java.
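The general idea of running compiled machine code on a VM written in another language can be sketched in a few lines. This is only a toy and says nothing about how NestedVM is actually implemented: the tuple encoding, the `run` function and the `halt` pseudo-instruction are assumptions made for the example, branch targets are absolute instruction indexes rather than real MIPS offsets, and 32-bit wraparound and delay slots are ignored.

```python
def run(program, registers=None, max_steps=10_000):
    """Interpret a list of MIPS-like instruction tuples and return the registers."""
    regs = {"$zero": 0}
    regs.update(registers or {})
    pc = 0
    steps = 0
    while pc < len(program):
        steps += 1
        if steps > max_steps:
            raise RuntimeError("step limit exceeded")
        op, *args = program[pc]
        pc += 1
        if op == "addi":      # rt = rs + immediate
            rt, rs, imm = args
            regs[rt] = regs.get(rs, 0) + imm
        elif op == "add":     # rd = rs + rt
            rd, rs, rt = args
            regs[rd] = regs.get(rs, 0) + regs.get(rt, 0)
        elif op == "bne":     # jump to target if rs != rt
            rs, rt, target = args
            if regs.get(rs, 0) != regs.get(rt, 0):
                pc = target
        elif op == "halt":    # made-up instruction that stops the VM
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
        regs["$zero"] = 0     # $zero always reads as 0
    return regs


# Adds 5 + 4 + 3 + 2 + 1 into $t1.
SUM_TO_FIVE = [
    ("addi", "$t0", "$zero", 5),    # 0: counter = 5
    ("addi", "$t1", "$zero", 0),    # 1: total = 0
    ("add", "$t1", "$t1", "$t0"),   # 2: total += counter
    ("addi", "$t0", "$t0", -1),     # 3: counter -= 1
    ("bne", "$t0", "$zero", 2),     # 4: loop while counter != 0
    ("halt",),                      # 5: stop
]

print(run(SUM_TO_FIVE)["$t1"])
```

The cost of this approach is visible even in the toy: every guest instruction goes through a dispatch step in the host language, so speed depends heavily on how well that loop is optimized.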

Hasturkun
This is not exactly what I was looking for, but it's much closer than anything else. I really like the idea. Let's see if I can make it happen with aspell and what speed I'll get!
Aaron Digulla