I am looking for a parser generator for Java. My language project is pretty simple and only contains a small set of tokens. The generator should do the following:

- Output pure, READABLE Java code so that I can modify it (this is why I wouldn't use ANTLR)
- Be a mature library that will run and work with at least Java 1.4

I have looked at the following, and they might work: JavaCC, JLex, Ragel?

A: 

For a language that simple, JFlex might suffice. It's similar to JLex but faster (which might also mean less readable, but I've not seen JLex's output).

It is a lexer, not a parser, but it is built to interface easily with CUP or BYacc/J. And again, for a simple language, it might be easier to just write your own parser (I've done this before).
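To give an idea of what "write your own parser" can look like, here is a minimal sketch of a hand-written recursive-descent parser. The grammar (integer arithmetic with parentheses) and the class name `TinyParser` are assumptions made up for illustration, not anything from the question. It avoids generics and other newer language features, so it should fit the Java 1.4 constraint.

```java
// Hypothetical example: a hand-written recursive-descent parser/evaluator.
// Assumed grammar (for illustration only):
//   expr   := term (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')'
class TinyParser {
    private final String input;
    private int pos = 0;

    TinyParser(String input) {
        this.input = input;
    }

    /** Parses and evaluates the whole input, rejecting trailing characters. */
    static int evaluate(String text) {
        TinyParser p = new TinyParser(text);
        int value = p.expr();
        p.skipSpaces();
        if (p.pos != p.input.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + p.pos);
        }
        return value;
    }

    // One method per grammar rule; each returns the value of what it parsed.
    private int expr() {
        int value = term();
        while (true) {
            if (accept('+')) {
                value += term();
            } else if (accept('-')) {
                value -= term();
            } else {
                return value;
            }
        }
    }

    private int term() {
        int value = factor();
        while (true) {
            if (accept('*')) {
                value *= factor();
            } else if (accept('/')) {
                value /= factor();
            } else {
                return value;
            }
        }
    }

    private int factor() {
        if (accept('(')) {
            int value = expr();
            if (!accept(')')) {
                throw new IllegalArgumentException("Expected ')' at position " + pos);
            }
            return value;
        }
        skipSpaces();
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at position " + pos);
        }
        return Integer.parseInt(input.substring(start, pos));
    }

    /** Consumes the given character if it comes next (after whitespace). */
    private boolean accept(char c) {
        skipSpaces();
        if (pos < input.length() && input.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    private void skipSpaces() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
            pos++;
        }
    }
}
```

Calling `TinyParser.evaluate("(1 + 2) * 3")` walks the three rule methods and returns the computed value. For a real language you would build tree nodes instead of computing integers, but the one-method-per-rule shape stays the same.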

Michael Myers
+2  A: 

Maybe you're looking for parser combinators instead of parser generators? See this paper and JParsec.
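For readers who haven't met parser combinators, the idea can be sketched in a few lines of plain Java. This is NOT JParsec's API; the names `Recognizer`, `Combinators`, `ch`, `seq`, `or`, and `many1` are hypothetical and chosen only for illustration. To keep it short, the sketch only recognizes input (it returns the position after a match) rather than building results.

```java
// Hypothetical sketch of the parser-combinator idea; not JParsec's API.
interface Recognizer {
    /** Returns the position after a match starting at pos, or -1 on failure. */
    int match(String input, int pos);
}

class Combinators {
    /** Matches exactly the given character. */
    static Recognizer ch(final char c) {
        return new Recognizer() {
            public int match(String input, int pos) {
                return (pos < input.length() && input.charAt(pos) == c) ? pos + 1 : -1;
            }
        };
    }

    /** Matches a, then b starting where a stopped. */
    static Recognizer seq(final Recognizer a, final Recognizer b) {
        return new Recognizer() {
            public int match(String input, int pos) {
                int mid = a.match(input, pos);
                return mid < 0 ? -1 : b.match(input, mid);
            }
        };
    }

    /** Tries a first and falls back to b at the same position. */
    static Recognizer or(final Recognizer a, final Recognizer b) {
        return new Recognizer() {
            public int match(String input, int pos) {
                int end = a.match(input, pos);
                return end >= 0 ? end : b.match(input, pos);
            }
        };
    }

    /** Matches m one or more times, greedily. */
    static Recognizer many1(final Recognizer m) {
        return new Recognizer() {
            public int match(String input, int pos) {
                int end = m.match(input, pos);
                if (end < 0) {
                    return -1;
                }
                while (true) {
                    int next = m.match(input, end);
                    // Stop on failure, or if m matched without consuming input.
                    if (next < 0 || next == end) {
                        return end;
                    }
                    end = next;
                }
            }
        };
    }

    /** True if m consumes the entire input. */
    static boolean matchesAll(Recognizer m, String input) {
        return m.match(input, 0) == input.length();
    }
}
```

The grammar is then ordinary Java code, for example `seq(ch('x'), seq(ch('='), many1(or(ch('0'), ch('1')))))` for inputs like `x=1011`. There is no separate grammar file and no generation step.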

It's a really bad idea to edit generated parser code; it's a lot easier to edit the grammar file and then recompile it. If you're doing it for educational purposes, though, ANTLR prides itself on generating pretty readable code for such a powerful parser generator.

Martijn
Having used JParsec, I don't think I'd ever want to go back to editing grammar files. Even without the more advanced features, it's worth being able to use your existing tools.
jamesh
A: 

We are using JavaCC for our (also rather small) language and are happy with it.

jhwist
+1  A: 

You should use the Rats! parser generator. This way, you don't have to separate the lexer and the parser, and extending your project later will be trivial. It's in Java, so you can process your AST in Java as well.

LB
+1  A: 

I had a good experience with SableCC.

It works differently from most generators, in that you're given an AST/Visitor model that you extend (via inheritance).

I can't comment on the "quality" of its code in terms of readability (it's been a while since I've used it), but it does have the quality that you don't have to read the code at all. Just the code in your subclass.

Will Hartung
+1  A: 

Maybe ANTLR will do it for you. It's a nice parser generator with a fine book available for documentation.

duffymo
+1  A: 

Take a look at SableCC. SableCC is an easy-to-use parser generator that accepts the grammar of your language as EBNF, without intermingling action code, and generates a Java parser that produces a syntax tree which can be traversed using a tree node visitor. SableCC is powerful, yet much simpler to use than ANTLR, JavaCC, yacc, etc. It also does not require a separate lexer.

Constructing your language processor amounts to extending a visitor class generated from your grammar and overriding the methods that are called when a syntactic construct is encountered. For every grammar rule XYZ, the visitor will have a pair of methods, inAXYZ(Node xyz) and outAXYZ(Node xyz), called when the traversal enters and leaves a node matched by that rule.
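The in/out visitor style described here can be sketched without SableCC itself. Everything below is a hand-written, hypothetical stand-in (`Node`, `ANumber`, `AAdd`, `TreeVisitor`, `Evaluator`); the classes SableCC really generates will differ in names and detail. The point is only to show that user code lives in a subclass and never touches the walker.

```java
// Hypothetical stand-ins for generated classes; real SableCC output differs.
abstract class Node {
    abstract void apply(TreeVisitor v);
}

class ANumber extends Node {
    final int value;
    ANumber(int value) { this.value = value; }
    void apply(TreeVisitor v) { v.caseANumber(this); }
}

class AAdd extends Node {
    final Node left;
    final Node right;
    AAdd(Node left, Node right) { this.left = left; this.right = right; }
    void apply(TreeVisitor v) { v.caseAAdd(this); }
}

/** Depth-first walker with empty in/out hooks for subclasses to override. */
class TreeVisitor {
    void inANumber(ANumber node) {}
    void outANumber(ANumber node) {}
    void inAAdd(AAdd node) {}
    void outAAdd(AAdd node) {}

    void caseANumber(ANumber node) {
        inANumber(node);
        outANumber(node);
    }

    void caseAAdd(AAdd node) {
        inAAdd(node);             // called on entering the node
        node.left.apply(this);    // children are visited in between
        node.right.apply(this);
        outAAdd(node);            // called on leaving the node
    }
}

/** User code: evaluates the tree by overriding only the hooks it needs. */
class Evaluator extends TreeVisitor {
    private final int[] stack = new int[64]; // fixed size keeps the sketch short
    private int top = 0;

    void outANumber(ANumber node) {
        stack[top++] = node.value;
    }

    void outAAdd(AAdd node) {
        int right = stack[--top];
        int left = stack[--top];
        stack[top++] = left + right;
    }

    int result() {
        return stack[top - 1];
    }
}
```

To use it, build or parse a tree, call `tree.apply(new Evaluator())` on it, and read `result()` afterwards. Adding another pass (a pretty-printer, a type checker) means writing another `TreeVisitor` subclass, with no change to the node or walker classes.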

Hans