I want to write a translator between two languages, and after some reading on the Internet I've decided to go with ANTLR. I had to learn it from scratch, but besides some trouble with eliminating left recursion everything went fine until now.

However, today some guy told me to check out Happy, a Haskell-based parser generator. I have no Haskell knowledge, so I could use some advice on whether Happy is indeed better than ANTLR and whether it's worth learning.

Specifically what concerns me is that my translator needs to support macro substitution, which I have no idea yet how to do in ANTLR. Maybe in Happy this is easier to do?

Or if you think other parser generators are even better, I'd be glad to hear about them.

+3  A: 

People keep believing that if they just get a parser, they've got it made when building language tools. That's just wrong. Parsers get you to the foothills of the Himalayas; then you need to start climbing seriously.

If you want industrial-strength support for building language translators, see our DMS Software Reengineering Toolkit. DMS provides

  • Unicode-based lexers
  • full context-free parsers (left recursion? No problem! Arbitrary lookahead? No problem. Ambiguous grammars? No problem)
  • full front ends for C, C#, COBOL, Java, C++, JavaScript, ... (including full preprocessors for C and C++)
  • automatic construction of ASTs
  • support for building symbol tables with arbitrary scoping rules
  • attribute grammar evaluation, to build analyzers that leverage the tree structure
  • support for control and data flow analysis (as well as realization of this for full C, Java and COBOL)
  • source-to-source transformations using the syntax of the source AND the target language
  • AST to source code prettyprinting, to reproduce target language text
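
To make the "beyond parsing" stages in the list above concrete, here is a minimal sketch of parse, tree rewrite, and prettyprint. It uses Python's standard `ast` module (Python 3.9+ for `ast.unparse`), not DMS or ANTLR, and the rewrite rule (`e * 2` becomes `e + e`) and the names `DoubleToAdd` and `rewrite` are made up purely for illustration.

```python
import ast


class DoubleToAdd(ast.NodeTransformer):
    """Illustrative rule only: rewrite `name * 2` into `name + name`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite sub-expressions first
        is_times_two = (
            isinstance(node.op, ast.Mult)
            and isinstance(node.right, ast.Constant)
            and type(node.right.value) is int
            and node.right.value == 2
        )
        # Only duplicate plain names, so no side effects get repeated.
        if is_times_two and isinstance(node.left, ast.Name):
            copy = ast.Name(id=node.left.id, ctx=ast.Load())
            new_node = ast.BinOp(left=node.left, op=ast.Add(), right=copy)
            return ast.copy_location(new_node, node)
        return node


def rewrite(source):
    tree = ast.parse(source)          # parse: text -> AST
    tree = DoubleToAdd().visit(tree)  # transform: AST -> AST
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)          # prettyprint: AST -> text


print(rewrite("y = x * 2"))
```

Even this toy needs a safety condition (the left operand must be a plain name), which hints at why real translators lean on symbol tables and flow analysis rather than on the parser alone.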

Regarding the OP's request to handle macros: our C, COBOL and C++ front ends handle their respective language preprocessing by a) the traditional method of full expansion or b) non-expansion (where practical) to enable post-parsing transformation of the macros themselves. While DMS as a foundation doesn't specifically implement macro processing, it can support the construction and transformation of same.
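
As a rough sketch of option a), full expansion before parsing, the following handles only simple object-like macros (a name replaced by text). The macro table, the `expand` function, and the sample input are hypothetical, and the sketch ignores string literals, comments, and parameterized macros, all of which a real preprocessor has to deal with.

```python
import re

# Hypothetical macro table: name -> replacement text (object-like macros only).
MACROS = {"MAX_LEN": "128", "BUF_SIZE": "(MAX_LEN * 2)"}

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def expand(text, macros, _active=frozenset()):
    """Fully expand object-like macros in text before it reaches the parser."""

    def replace(match):
        name = match.group(0)
        if name in macros and name not in _active:
            # Rescan the replacement text, but never re-expand a macro
            # inside its own expansion (prevents infinite recursion).
            return expand(macros[name], macros, _active | {name})
        return name

    return IDENT.sub(replace, text)


print(expand("char buf[BUF_SIZE];", MACROS))
```

The expanded text is then what the parser sees. The trade-off is that the macros are gone from the tree, so they cannot be carried over into the target language, which is what option b) is for.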

As an example of a translator built with DMS, see the discussion of converting JOVIAL to C for the B-2 bomber. This is 100% translation for > 1 MSLOC of hard real-time code. [It may amuse you to know that we were never allowed to see the actual program being translated (top secret).] And yes, JOVIAL has a preprocessor, and yes, we translated most JOVIAL macros into equivalent C versions.

[Haskell is a cool programming language but it doesn't do anything like this. This isn't about language expressibility. It's about figuring out what to build, and spending 100 man-years building it.]

Ira Baxter
@Ira Baxter - it's a small world, you are within walking distance of me. :o
280Z28
Oops, hit the "up" button on "this is a great comment". You benefit from my hiccup. Find my email address from my user registration page and send me an introductory note; might be some fun conversation in here.
Ira Baxter
This is awesome. However I assume you can't find anything like this in the open source community.
Gabi
Other program transformation systems: TXL is free but I don't think open source. Stratego is probably both. Both have pretty strong parsing technology. Neither directly supports building symbol tables, doing attribute grammars or doing control/data flow analysis. Dunno about Unicode. YMMV.
Ira Baxter