I'm evaluating using Coco/R vs. ANTLR for use in a C# project as part of what's essentially a scriptable mail-merge functionality. To parse the (simple) scripts, I'll need a parser.

I've focussed on Coco/R and ANTLR because both seem fairly mature and well-maintained and capable of generating decent C# parsers.

Neither seems trivial to use, however, and simplicity is something I'd appreciate - particularly maintainability by others.

Does anyone have any recommendations to make? What are the pros/cons of either for parsing a small language - or am I looking into the wrong things entirely? How well do these integrate into a typical continuous integration setup? What are the pitfalls?

Related: Well, many questions, such as 1, 2, 3, 4, 5.

+1  A: 

If you're simply merging data into a complicated template, consider Terence Parr's StringTemplate engine. He's the man behind ANTLR. StringTemplate may be better suited and easier to use than a full parser generator. It's a very feature-rich template engine.

There is a C# port available in the downloads.
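To illustrate the "merge data into a template without a parser generator" idea, here is a minimal sketch. It does not use StringTemplate itself; it swaps in Python's standard-library `string.Template` as a stand-in, and the template text and field names (`first_name`, `order_id`, and so on) are made up for illustration.

```python
from string import Template

# Hypothetical mail-merge template; the placeholder names are illustrative only.
LETTER = Template(
    "Dear $first_name $last_name,\n"
    "Your order $order_id ships on $ship_date."
)

def merge(template, records):
    """Render the template once per record.

    safe_substitute leaves unknown placeholders in place instead of
    raising KeyError, which makes missing fields easy to spot.
    """
    return [template.safe_substitute(record) for record in records]

records = [
    {"first_name": "Ada", "last_name": "Lovelace",
     "order_id": "A-17", "ship_date": "Monday"},
    # ship_date is deliberately missing from this record.
    {"first_name": "Alan", "last_name": "Turing", "order_id": "B-42"},
]

letters = merge(LETTER, records)
for letter in letters:
    print(letter)
    print("---")
```

If plain substitution like this covers the scripts you have in mind, a template engine is probably enough; conditionals, loops, or expressions in the scripts are where a richer engine or a real parser starts to pay off.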

Corbin March
I saw that - you wouldn't happen to have tried it? I'm a bit leery of using a potentially poorly tested port.
Eamon Nerbonne
@Eamon Nerbonne - I've used it in a proof-of-concept project without any issues but couldn't comment on how well it's tested. Good luck.
Corbin March
This answer may not really have covered my needs - but it's certainly a starting point - and that makes it the best answer to me :-).
Eamon Nerbonne
+1  A: 

Basically, Coco/R generates recursive descent parsers and only supports LL(1) grammars, whereas ANTLR (AFAIU) also generates recursive descent parsers but uses LL(*) lookahead, so it can handle more complex grammars. Coco/R parsers are much more lightweight and easier to understand and deploy, but sometimes it's a struggle getting the grammar into a form that Coco/R understands given its one-token look-ahead constraint - for many common programming language grammars (e.g. C++, SQL), it's not possible at all.
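To make the one-token look-ahead constraint concrete, here is a minimal hand-written sketch of a recursive descent parser for a toy arithmetic grammar. It is not output from Coco/R or ANTLR, and the `Parser` class and `parse` function are hypothetical names; the point is only that every decision is made by inspecting a single upcoming token.

```python
import re

class Parser:
    """Recursive descent parser that decides everything from one lookahead token."""

    def __init__(self, text):
        # Tokens are runs of ASCII digits or any single non-space character.
        self.tokens = re.findall(r"[0-9]+|\S", text) + ["<eof>"]
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def expect(self, token):
        if self.lookahead != token:
            raise SyntaxError(f"expected {token!r}, found {self.lookahead!r}")
        self.pos += 1

    def expr(self):
        # expr := term (('+' | '-') term)*
        value = self.term()
        while self.lookahead in ("+", "-"):
            op = self.lookahead
            self.pos += 1
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term := factor ('*' factor)*
        value = self.factor()
        while self.lookahead == "*":
            self.pos += 1
            value *= self.factor()
        return value

    def factor(self):
        # factor := NUMBER | '(' expr ')'
        tok = self.lookahead
        if tok[0] in "0123456789":
            self.pos += 1
            return int(tok)
        if tok == "(":
            self.pos += 1
            value = self.expr()
            self.expect(")")
            return value
        raise SyntaxError(f"unexpected token {tok!r}")

def parse(text):
    parser = Parser(text)
    value = parser.expr()
    parser.expect("<eof>")  # reject trailing input
    return value

result = parse("2 + 3 * (4 - 1)")
print(result)
```

Each grammar rule maps to one method, which is why parsers of this style are easy to read and debug. The struggle mentioned above shows up when two alternatives of a rule start with the same token: a single lookahead token can no longer pick between them, and the grammar has to be refactored.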

500 - Internal Server Error
+1  A: 

ANTLR is LL(*), which is as powerful as PEG, though usually much more efficient and flexible. LL(*) degenerates to LL(k) for k>=1 when arbitrary lookahead is not necessary.

Terence Parr
Is it possible to avoid the use of a scanner in ANTLR? I'm a little worried about maintainability thereof because the set of viable tokens may depend on the active grammar rules (i.e. kind of like conditional keywords such as 'from' in C#).
Eamon Nerbonne
Sure. Pass in any object that implements TokenStream.
Terence Parr
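The conditional-keyword concern raised in the comments above can be sketched as a token source that lets the parser say which words count as keywords at the current position. This is not ANTLR's TokenStream interface; `ContextualTokenSource`, `next_token`, and `parse_query` are hypothetical names, and the toy grammar (`from IDENT in IDENT select IDENT`) is an assumption made only for illustration.

```python
import re

IDENT_RE = re.compile(r"[A-Za-z_]\w*")

class ContextualTokenSource:
    """Hands out tokens one at a time; the caller decides which words are keywords."""

    def __init__(self, text):
        # Identifier-like words, or any other single non-space character.
        self.words = re.findall(r"[A-Za-z_]\w*|\S", text)
        self.pos = 0

    def next_token(self, active_keywords=frozenset()):
        if self.pos >= len(self.words):
            return ("EOF", "")
        word = self.words[self.pos]
        self.pos += 1
        if word in active_keywords:
            return ("KEYWORD", word)
        if IDENT_RE.fullmatch(word):
            return ("IDENT", word)
        return ("SYMBOL", word)

def parse_query(text):
    """Parse: 'from' IDENT 'in' IDENT 'select' IDENT (toy grammar)."""
    source = ContextualTokenSource(text)

    def keyword(word):
        # Only here is `word` treated as a keyword.
        kind, value = source.next_token(active_keywords={word})
        if kind != "KEYWORD":
            raise SyntaxError(f"expected keyword {word!r}, found {value!r}")

    def identifier():
        # No keywords are active, so even 'from' is a plain identifier.
        kind, value = source.next_token()
        if kind != "IDENT":
            raise SyntaxError(f"expected identifier, found {value!r}")
        return value

    keyword("from")
    variable = identifier()
    keyword("in")
    collection = identifier()
    keyword("select")
    result = identifier()
    return {"variable": variable, "collection": collection, "result": result}

query = parse_query("from from in items select from")
print(query)
```

The design choice here is that token classification is driven by the parser's state rather than fixed up front, so `from` can be a keyword in one position and an ordinary identifier in another.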