Hello, I was wondering what the best way is to extend Java's syntax with new constructs.

I mean something like Groovy or other languages based on Java that keep backward compatibility.

  1. The most efficient way should be to generate .class files directly, rather than interpreting the new language, and let the JVM do the dirty work. How can I achieve this?
  2. Is there a way to let a partial compiler cooperate with the Java one? Otherwise any kind of extension will need a compiler that can compile normal Java as well as whatever syntactic sugar I want to add.
  3. A cool thing would be to let the Java compiler generate the code for normal Java syntax and include parts that are compiled independently (by generating opcodes) to handle the new syntax. For example, think about a new data type that should behave like a built-in one: it can be backed by a class and then used in compiled bytecode, while the other parts of the code are compiled as Java normally would be.

What kind of tools are available for this purpose?

EDIT: I'm quite comfortable with compilers and VMs, and I have already written a bunch of them, so I don't want an easy solution but the most efficient and functional one: not limited to extending simple capabilities, but able to extend as much as I want, starting from a layer well designed for this kind of work.

+4  A: 

Compilers work by parsing input files according to a grammar. So, if you want to use the Java compiler to handle your extensions, you need to update its grammar. Fortunately, JDK 1.6 is open-source, so you can do this: http://download.java.net/openjdk/jdk6/

I will warn you, however: unless you're familiar with grammars and code generation, it's going to be a long process.

kdgregory
+4  A: 

It's not strictly what you want, but you could do worse than look at designing a DSL in Scala to do what you want. Because it's Scala it'll run nicely alongside your existing Java code in the JVM.

Brian Agnew
+1 also http://www.ibm.com/developerworks/java/library/j-scala10248.html
skaffman
Ah. That looks like a very useful link
Brian Agnew
+6  A: 

Without giving this much thought, I'd consider writing a preprocessor for the Java language. You could feed it Java source that it would pass through untouched, and you could design it to interpret some elements of your choosing in whatever way you think would be appropriate.

Just for grins, you could run the C preprocessor over Java source to define constants and even small function-like macros, just to get a feel for what's involved.

Of course this also means that you'll have to find syntactically/semantically valid Java equivalents for whatever constructs you plan to add.

The good thing is, you'd still be letting the Java compiler do the heavy lifting; you'd just be messing with the source a bit. This may sound like a cowardly thing to do, but consider that the first C++ implementation, Cfront, worked in much the same spirit: it translated C++ into C and left the code generation to the C compiler.
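As a rough illustration of the preprocessor idea, here is a minimal sketch in Java. The `unless` keyword, the class name `UnlessPreprocessor` and the method `expand` are all made up for this example; nothing here comes from an existing tool. It rewrites `unless (cond)` into `if (!(cond))` and passes everything else through untouched. Being regex-based, it would also rewrite matches inside string literals and comments, and it cannot handle nested parentheses in the condition, so treat it as a toy.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical extension: "unless (cond)" is sugar for "if (!(cond))".
final class UnlessPreprocessor {
    // The keyword "unless" followed by a parenthesised condition
    // that contains no nested parentheses (a deliberate simplification).
    private static final Pattern UNLESS =
            Pattern.compile("\\bunless\\s*\\(([^()]*)\\)");

    // Returns plain Java source; text that does not match is copied unchanged.
    static String expand(String source) {
        Matcher m = UNLESS.matcher(source);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = "if (!(" + m.group(1) + "))";
            // quoteReplacement keeps '$' and '\' in the condition literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("unless (x > 0) { return; }"));
    }
}
```

The output of `expand` is ordinary Java source, so it can be handed to javac as-is. A more robust version would tokenize the input first so that strings and comments are skipped.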


UPDATE

Just for completeness' sake, the "formally correct" way to do this kind of thing is to write a new compiler. This breaks down into a so-called "front end" which analyzes your syntax and generates a parse tree from it, i.e. puts the code into a machine-manageable pre-digested form. Then there's a back end, which generates some kind of machine code (possibly JVM bytecode) from the preprocessed program.

For front ends you want a lexer for turning text into tokens and a parser for building the parse tree, symbol tables and whatnot from those tokens. Normally, you don't write these yourself: you write up a complete grammar in a Backus-Naur-style form, and a lexer/parser generator reads this specification and produces Java code to parse that language. One of the most popular generators these days is ANTLR. Google for it!
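To make the lexer/parser split concrete, here is a small hand-written sketch in Java of the kind of code a generator would normally produce for you. It is not ANTLR output and does not use ANTLR; the class name `TinyFrontEnd`, the toy arithmetic grammar and the S-expression form of the tree are all assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy grammar:  expr   := term (('+' | '-') term)*
//               term   := factor (('*' | '/') factor)*
//               factor := NUMBER | '(' expr ')'
final class TinyFrontEnd {
    // Lexer: turns text into number tokens and single-character tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < text.length() && Character.isDigit(text.charAt(i))) i++;
                tokens.add(text.substring(start, i));
            } else if ("+-*/()".indexOf(c) >= 0) {
                tokens.add(String.valueOf(c));
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
        }
        return tokens;
    }

    private final List<String> tokens;
    private int pos = 0;

    private TinyFrontEnd(List<String> tokens) { this.tokens = tokens; }

    // Parser entry point: returns the parse tree as a prefix-notation string.
    static String parse(String text) {
        TinyFrontEnd p = new TinyFrontEnd(tokenize(text));
        String tree = p.expr();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("Trailing input at token " + p.pos);
        }
        return tree;
    }

    // Current token, or "" at end of input.
    private String peek() { return pos < tokens.size() ? tokens.get(pos) : ""; }

    private String expr() {
        String left = term();
        while (peek().equals("+") || peek().equals("-")) {
            String op = tokens.get(pos++);
            left = "(" + op + " " + left + " " + term() + ")";
        }
        return left;
    }

    private String term() {
        String left = factor();
        while (peek().equals("*") || peek().equals("/")) {
            String op = tokens.get(pos++);
            left = "(" + op + " " + left + " " + factor() + ")";
        }
        return left;
    }

    private String factor() {
        String t = peek();
        if (t.equals("(")) {
            pos++;
            String inner = expr();
            if (!peek().equals(")")) throw new IllegalArgumentException("Expected ')'");
            pos++;
            return inner;
        }
        if (!t.isEmpty() && Character.isDigit(t.charAt(0))) {
            pos++;
            return t;
        }
        throw new IllegalArgumentException("Expected a number or '(' but found '" + t + "'");
    }
}
```

For instance, `TinyFrontEnd.parse("1 + 2 * 3")` yields `(+ 1 (* 2 3))`, which shows operator precedence falling out of the grammar's structure. A real front end would build tree node objects rather than strings, and a back end would then walk that tree to emit code.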

I'm not very knowledgeable on back ends, except to say that's an adventure I'd rather avoid. As far as I know, writing a compiler back end is hard. Hence my recommendation with the preprocessor.

Carl Smotricz
+1, IMO the easiest way to get started
ammoQ
Much easier than changing the compiler.
PiPeep
+1 nice answer Carl!
Pascal Thivent
+1 for possibly the simplest starting point
Brian Agnew
A: 

I may be missing something, but why don't you look at what the Groovy compiler does to convert a Groovy file into a .class file? I didn't check, but I'm pretty sure it involves an ANTLR lexer/parser.

Pascal Thivent
+4  A: 

You can take a look at Project Lombok. It uses annotations to make small changes to Java: since Java 6, annotation processors can hook into the compiler, and Lombok builds on that hook.

For instance (automatic getter and setter):

public class GetterSetterExample {
  private int age = 10;
  private String name;

  @Override public String toString() {
    return String.format("%s (age: %d)", name, age);
  }

  public int getAge() {
    return age;
  }

  public void setAge(int age) {
    this.age = age;
  }

  protected void setName(String name) {
    this.name = name;
  }
}

becomes

import lombok.AccessLevel;
import lombok.Getter;
import lombok.Setter;

public class GetterSetterExample {
    @Getter @Setter private int age = 10;
    @Setter(AccessLevel.PROTECTED) private String name;

    @Override public String toString() {
        return String.format("%s (age: %d)", name, age);
    }
}

and (using automatic cleanup):

import java.io.*;

public class CleanupExample {
  public static void main(String[] args) throws IOException {
    InputStream in = new FileInputStream(args[0]);
    try {
      OutputStream out = new FileOutputStream(args[1]);
      try {
        byte[] b = new byte[10000];
        while (true) {
          int r = in.read(b);
          if (r == -1) break;
          out.write(b, 0, r);
        }
      } finally {
        out.close();
      }
    } finally {
      in.close();
    }
  }
}

becomes

import lombok.Cleanup;
import java.io.*;

public class CleanupExample {
  public static void main(String[] args) throws IOException {
    @Cleanup InputStream in = new FileInputStream(args[0]);
    @Cleanup OutputStream out = new FileOutputStream(args[1]);
    byte[] b = new byte[10000];
    while (true) {
      int r = in.read(b);
      if (r == -1) break;
      out.write(b, 0, r);
    }
  }
}

The source is available, so you can see how this magic works.
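For context on the hook involved: the standard extension point added in Java 6 is the annotation processing API (`javax.annotation.processing`). Through the public API a processor can inspect annotated elements and generate new files, but not rewrite existing classes; as far as I know, Lombok goes beyond that by reaching into compiler internals. The sketch below shows only the standard part. The annotation name `com.example.Traced` and the class name `TracedProcessor` are hypothetical, and this is not how Lombok itself is written.

```java
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;

// "com.example.Traced" is a made-up annotation used only for illustration.
// Package-private here for brevity; a processor meant to be loaded by javac
// is typically a public class with a public no-argument constructor.
@SupportedAnnotationTypes("com.example.Traced")
class TracedProcessor extends AbstractProcessor {

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    // Called by the compiler once per processing round.
    @Override
    public boolean process(Set<? extends TypeElement> annotations,
                           RoundEnvironment roundEnv) {
        for (TypeElement annotation : annotations) {
            for (Element element : roundEnv.getElementsAnnotatedWith(annotation)) {
                // Report each annotated element as a compiler note.
                processingEnv.getMessager().printMessage(
                        Diagnostic.Kind.NOTE,
                        "found @" + annotation.getSimpleName()
                                + " on " + element.getSimpleName(),
                        element);
            }
        }
        return false; // leave the annotation available to other processors
    }
}
```

A processor like this is usually passed to javac with the `-processor` option or registered through a `META-INF/services/javax.annotation.processing.Processor` file. To generate code instead of just reporting, it would write new source files through `processingEnv.getFiler()`.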

Marcelo Morales