Which lexer/parser generator is the best (easiest to use, fastest) for C or C++? I'm using flex and bison right now, but bison only handles LALR(1) grammars. The language I'm parsing doesn't really need unlimited lookahead, but unlimited lookahead would make parsing a lot easier. Should I try ANTLR? Coco/R? Elkhound? Something else?
Hello, I have been using the GOLD parsing system (http://www.devincook.com/goldparser) with very good results. My project is small, a parsing system for NC files in C. But I think the tool can handle more complex projects as well.
The bad news is that most real computer languages aren't "LALR(1)", which means you have to resort to considerable hackery to make YACC parse real languages.
If you are designing a DSL, you can use any of the LALR parser generators without a lot of trouble, precisely because you can change the language's syntax to suit the tool. LL parser generators mostly work here too, for the same reason, but the lack of left recursion can be a real pain.
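To make the left-recursion point concrete: an LALR tool accepts a rule like `diff : diff '-' NUM | NUM` as written, whereas an LL or hand-written recursive-descent parser has to turn it into a loop, since a function that calls itself before consuming any input never terminates. Here is a minimal sketch in C, assuming a made-up toy grammar of unsigned integers separated by '-'. The names `parse_number` and `parse_diff` are hypothetical, and there is no whitespace or error handling.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical toy grammar, written left-recursively as an LALR tool accepts it:
 *     diff : diff '-' NUM | NUM ;
 * The same language after removing left recursion for an LL parser:
 *     diff : NUM ( '-' NUM )* ;
 */

/* Parse an unsigned decimal number at *p and advance *p past it.
 * Returns 0 without advancing if no digit is present. */
long parse_number(const char **p)
{
    long value = 0;
    while (isdigit((unsigned char)**p)) {
        value = value * 10 + (**p - '0');
        (*p)++;
    }
    return value;
}

/* Evaluate "a-b-c" left-associatively. The loop stands in for the
 * left-recursive rule: each iteration folds one more "- NUM" into the
 * running result, so "10-3-2" is treated as (10-3)-2. */
long parse_diff(const char *s)
{
    assert(s != NULL);
    const char *p = s;
    long result = parse_number(&p);
    while (*p == '-') {
        p++;                          /* consume '-' */
        result -= parse_number(&p);
    }
    return result;
}
```

Calling `parse_diff("10-3-2")` goes through the loop twice. The rewrite preserves left associativity here only because the accumulation happens inside the loop; a naive right-recursive rewrite would instead compute 10-(3-2).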
If you are uncompromising about the way you like your syntax, GLR parsers are the hands-down winners. We use them in the DMS Software Reengineering Toolkit and have built production-quality parsers for some 30+ languages, including C++, which has a folk theorem saying it's nearly impossible to parse. The folk theorem was started by people using LL and LALR parsers to try to handle C++. GLR does it easily.
There are a bunch of good answers to this question already in "What parser generator do you recommend".
I've done several flex/bison systems myself, but now I'd replace both with Lemon from SQLite, since it's a single tool, re-entrant and thread-safe, and has a streaming/pull-based model.
The latest bison claims to do unlimited lookahead by (in effect) doing several parses simultaneously. If you already have an investment in bison, it may be worth trying this out rather than switching to another package.
http://www.gnu.org/software/bison/manual/bison.html#GLR-Parsers
I have not used this feature myself, though.
As far as other systems go, I have used ANTLR. I did not particularly like it (the documentation was not very good, and one must manually factor one's grammar to cater for operator precedence), but it did work, and so many swear by it that it is certainly worth looking at.
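For anyone unsure what "manually factoring the grammar for operator precedence" looks like in practice: with an LL-style tool you write one rule per precedence level, where bison lets you declare precedence with `%left`/`%right` instead. The following is a minimal hand-written sketch in C of the factored form, not ANTLR output. It assumes a hypothetical toy grammar with only '+', '*', parentheses and unsigned integers, and does no whitespace or error handling; all function names are made up for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical toy grammar, factored by hand so that '*' binds tighter than '+':
 *     expr   : term   ( '+' term   )* ;
 *     term   : factor ( '*' factor )* ;
 *     factor : NUM | '(' expr ')' ;
 */

long parse_expr(const char **p);

/* factor : NUM | '(' expr ')' */
long parse_factor(const char **p)
{
    long value = 0;
    if (**p == '(') {
        (*p)++;                       /* consume '(' */
        value = parse_expr(p);
        if (**p == ')')
            (*p)++;                   /* consume ')' if present */
        return value;
    }
    while (isdigit((unsigned char)**p)) {
        value = value * 10 + (**p - '0');
        (*p)++;
    }
    return value;
}

/* term : factor ( '*' factor )*   (the higher-precedence level) */
long parse_term(const char **p)
{
    long value = parse_factor(p);
    while (**p == '*') {
        (*p)++;
        value *= parse_factor(p);
    }
    return value;
}

/* expr : term ( '+' term )*   (the lower-precedence level) */
long parse_expr(const char **p)
{
    long value = parse_term(p);
    while (**p == '+') {
        (*p)++;
        value += parse_term(p);
    }
    return value;
}

/* Convenience entry point: evaluate a whole string. */
long evaluate(const char *s)
{
    assert(s != NULL);
    const char *p = s;
    return parse_expr(&p);
}
```

Each extra precedence level costs one more rule (or function), which is the chore being complained about. With `evaluate("2+3*4")` the multiplication is grouped first because `parse_expr` only ever sees complete terms.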
ANTLR makes unlimited lookahead very easy with its 'backtrack' option. It might also meet your 'easiest to use, fastest' criteria, since it comes with ANTLRWorks, which lets you visualize and debug your grammar.
Another advantage is its built-in support for building ASTs, which bison lacks.
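As a rough illustration of what backtracking buys you (this is a hand-written sketch of the general idea, not what ANTLR generates): when two alternatives share an arbitrarily long prefix, the parser remembers its position, tries the first alternative, and rewinds to try the next one if that fails. The grammar and all names below (`match_names`, `match_call`, `match_assign`, `classify`) are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical grammar whose alternatives share an unbounded prefix, so no
 * fixed k tokens of lookahead can choose between them:
 *     stmt   : call | assign ;
 *     call   : names '(' ')' ;
 *     assign : names '=' names ;
 *     names  : LETTER ( '.' LETTER )* ;
 */

enum stmt_kind { STMT_ERROR, STMT_CALL, STMT_ASSIGN };

/* names : LETTER ( '.' LETTER )*   Returns 1 and advances *p on success. */
int match_names(const char **p)
{
    if (!isalpha((unsigned char)**p))
        return 0;
    (*p)++;
    while ((*p)[0] == '.' && isalpha((unsigned char)(*p)[1]))
        *p += 2;
    return 1;
}

/* call : names '(' ')' */
int match_call(const char **p)
{
    if (!match_names(p))
        return 0;
    if ((*p)[0] != '(')
        return 0;
    if ((*p)[1] != ')')
        return 0;
    *p += 2;
    return 1;
}

/* assign : names '=' names */
int match_assign(const char **p)
{
    if (!match_names(p))
        return 0;
    if (**p != '=')
        return 0;
    (*p)++;
    return match_names(p);
}

/* Try each alternative speculatively from the same starting position. */
enum stmt_kind classify(const char *s)
{
    assert(s != NULL);
    const char *p = s;
    if (match_call(&p) && *p == '\0')
        return STMT_CALL;
    p = s;                            /* backtrack: rewind to the start */
    if (match_assign(&p) && *p == '\0')
        return STMT_ASSIGN;
    return STMT_ERROR;
}
```

The price is that the shared prefix gets re-scanned for every alternative that fails, which is why backtracking is convenient but not free in terms of speed.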
With two books published, 'The Definitive ANTLR Reference' and 'Language Implementation Patterns', it is among the best-documented tools available. It also has a very active mailing list.