views:

455

answers:

2

Hey out there,

I want to parse a programming language. I read a lot about formal languages and the Chomsky hierarchy and ANTLR. But I could not find information on how to relate the languages ANTLR v3 as an LL(*) recursive descent parser accepts to the chomsky hierarchy.

How do the Chomsky types mix with LL(*)? Any information (online, books, papers) are greatly appreciated.

(edit:) How do syntactic / semantic predicates and backtracking of ANTLR map into this?

+4  A: 

LL(*) is a subset of context-free languages. However, a different question is what antlr can parse, given predicates and backtracking, which extend its abilities.

Note that if we talk about LL(*), that means ANTLR v3, not 2.

jpalecek
+12  A: 

The Chomsky Hierarchy is basically:

  1. Regular languages
  2. Context-Free Grammars
  3. Context-Sensitive Grammars
  4. Recursively Enumerable (Turing-Complete) Grammars

LL Grammars (and parsers) are a subset of context-free grammars. They are used because regular languages are too weak for programming purposes and because a general context-free parser is O(n^3) which is too slow for parsing a program. Indeed, augmenting a parser with helper functions does make it stronger. The Wikipedia entry on LL parsers explains some of this.The Dragon Book is considered a leading textbook on compilers, and may explain further.

Yuval F