I am working on a small text editor project and want to add basic syntax highlighting for a couple of languages (Java and XML, to name two). As a learning experience, I want to use one of the Java lexer/parser generators, popular or otherwise.

Which project do you recommend? ANTLR is probably the best known, but it seems pretty complex and heavy.

Here are the options that I know of:

  1. ANTLR
  2. Ragel (yes, it can generate Java source for processing input)
  3. Do it yourself (I guess I could write a simple tokenizer and highlight the source code).
+6  A: 

ANTLR or JavaCC would be the two I know. I'd recommend ANTLR first.

duffymo
Do you think it is too heavy or complicated? That is the only thing holding me back from using ANTLR. But it is popular and seems to be very stable.
Berlin Brown
If you're talking about parsing a language like Java, I would say it's just the right thing. There are Java grammars available to you, so it'll just be a matter of walking the AST and generating what you want from it.
duffymo
Know or know of? Recommending one over the other means you ought to have used both, don't you think?
mike g
I've used both. I recommend ANTLR.
duffymo
A: 

I don't think you need a lexer. All you need to do is read the file extension to detect the language, then look up that language's keywords in an XML file that lists them and highlight every occurrence.
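
A rough sketch of that idea in Java might look like the following. Everything here is hypothetical: the class name, the method names and the keyword lists are made up for illustration, and the keyword table is hard-coded instead of being loaded from an XML file so the example stays self-contained (it assumes Java 9+ for `Map.of`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these names come from an existing library.
class KeywordHighlighter {
    // Assumption: in a real editor this table would be read from an XML file.
    // The keyword lists are deliberately incomplete.
    static final Map<String, Set<String>> KEYWORDS_BY_EXTENSION = Map.of(
        "java", Set.of("class", "public", "static", "void", "return", "if", "else"),
        "sql", Set.of("select", "from", "where"));

    // Returns the text after the last dot, lower-cased, or "" if there is no dot.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Returns the [start, end) offsets of every whole word that is a keyword
    // for the language detected from the file name.
    static List<int[]> keywordSpans(String fileName, String text) {
        Set<String> keywords =
            KEYWORDS_BY_EXTENSION.getOrDefault(extensionOf(fileName), Set.of());
        List<int[]> spans = new ArrayList<>();
        Matcher m = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*").matcher(text);
        while (m.find()) {
            if (keywords.contains(m.group())) {
                spans.add(new int[] {m.start(), m.end()});
            }
        }
        return spans;
    }
}
```

Calling `KeywordHighlighter.keywordSpans("Foo.java", source)` gives the ranges the editor would color. Note that a pure keyword lookup like this also flags keywords that appear inside string literals and comments.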

Pooria
No, I am going to need at least a simple lexer for what I am going to end up doing. Plus, it gives me some flexibility depending on the language.
Berlin Brown
A: 

SableCC

Another interesting option (which I haven't tried yet) is Xtext, which uses ANTLR but also includes tools for creating Eclipse editors for your language.

ckarras
A: 

I've done it with JFlex before and was quite satisfied with it. But the language I was highlighting was simple enough that I didn't need a parser generator, so your mileage may vary.

Michael Myers
+1  A: 

ANTLR is the way to go. I would not build it by hand. You'll also find if you look around on the ANTLR web site that grammars are available for Java, XML, etc.

Alex Miller
A: 

JLex and CUP are decent lexer and parser generators, respectively. I'm currently using both to develop a simple scripting language for a project I'm working on.

Pete
+1  A: 

Another option would be Xtext. It will not only generate a parser for your grammar, but also a complete editor with syntax coloring, error markers, content assist and outline view.

Fabian Steeg
+2  A: 

ANTLR may seem complex and heavy but you don't need to use all of the functionality that it includes; it's nicely layered. I'm a big fan of using it to develop parsers. For starters, you can use the excellent ANTLRWorks to visualize and test the grammars that you are creating. It's really nice to be able to watch it capture tokens, build parse trees and step through the process.

For your text editor project, I would check out filter grammars, which might suit your needs nicely. For filter grammars you don't need to specify the entire lexical structure of your language, only the parts that you care about (i.e. need to highlight, color or index) and you can always add in more until you can handle a whole language.
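
To make the idea concrete without pulling in ANTLR, here is a minimal plain-Java sketch of the same "only match what you care about" approach. It is not an ANTLR filter grammar and does not use any ANTLR API; it just uses `java.util.regex` as a stand-in. The class name, token kinds and the short keyword list are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a filter-style scanner; not generated by ANTLR.
class FilterLexerSketch {
    // A highlighted region: its kind plus [start, end) offsets in the source.
    static final class Token {
        final String kind;
        final int start;
        final int end;

        Token(String kind, int start, int end) {
            this.kind = kind;
            this.start = start;
            this.end = end;
        }
    }

    // Only the constructs we want to color are described. Alternatives are
    // tried left to right, and scanning resumes after each match, so a
    // keyword inside a comment or string is never reported separately.
    static final Pattern INTERESTING = Pattern.compile(
        "(?<COMMENT>//[^\\n]*|/\\*.*?\\*/)"
        + "|(?<STRING>\"(?:\\\\.|[^\"\\\\\\n])*\")"
        + "|(?<KEYWORD>\\b(?:class|public|static|void|return|if|else)\\b)",
        Pattern.DOTALL);

    // Scans the source and silently skips everything that matches no rule.
    static List<Token> scan(String source) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = INTERESTING.matcher(source);
        while (m.find()) {
            String kind = m.group("COMMENT") != null ? "COMMENT"
                        : m.group("STRING") != null ? "STRING"
                        : "KEYWORD";
            tokens.add(new Token(kind, m.start(), m.end()));
        }
        return tokens;
    }
}
```

Adding support for another construct (numbers, annotations and so on) means adding one more named alternative, which mirrors the "add in more until you can handle a whole language" workflow described above.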

Cameron Pope