ansaurus

Question

ANTLR ambiguity in DeCaf - professor unsure where error is

Answer 1

Does it matter if my lexer rules appear before my parser rules in my grammar ...

No, that does not matter.

The problem is that inside your atom rule, ANTLR cannot make a choice between these three variants:

ID ( ...
ID [ ...
ID

without resorting to (possibly) backtracking. You could resolve it by using syntactic predicates, which look like this: (...)=> ... A syntactic predicate is nothing more than a "look ahead": if the look-ahead matches the upcoming tokens, the parser chooses that particular path.

Your current atom rule can be rewritten as follows:

atom 
  :  OPENPAR expr CLOSEPAR
  |  ID OPENPAR ((expr (COMMA)* )+)? CLOSEPAR 
  |  ID OPENBRACKET expr CLOSEBRACKET
  |  ID
  |  CALLOUT OPENPAR STRING (COMMA (calloutarg)+ COMMA)? CLOSEPAR
  |  constant
  ;

And with the predicates it will look like:

atom 
  :  OPENPAR expr CLOSEPAR
  |  (ID OPENPAR)=>     ID OPENPAR ((expr (COMMA)* )+)? CLOSEPAR 
  |  (ID OPENBRACKET)=> ID OPENBRACKET expr CLOSEBRACKET
  |  ID
  |  CALLOUT OPENPAR STRING (COMMA (calloutarg)+ COMMA)? CLOSEPAR
  |  constant
  ;

which should do the trick.
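
If you would rather not write the predicates by hand, the backtracking mentioned above can also be switched on for the whole grammar. A minimal sketch, assuming ANTLR 3 (the grammar name DeCaf is a guess, use whatever your .g file declares):

```
grammar DeCaf;  // assumed name, not taken from your grammar

options {
  backtrack = true;  // where LL(*) analysis fails, try the alternatives in order,
                     // as if each one were guarded by a syntactic predicate
  memoize   = true;  // cache the results of speculative parses
}
```

With this option the order of the alternatives decides which one wins, so the ID OPENPAR and ID OPENBRACKET alternatives should stay before the bare ID one. The hand-written predicates are the more targeted fix, since global backtracking can hide other ambiguities in the grammar.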

Note: do not use ANTLRWorks to generate or test the parser! It cannot handle predicates (well). Best do it on the command line.

Also see: https://wincent.com/wiki/ANTLR_predicates

EDIT

Let's label the six different "branches" from your atom rule from A to F:

atom                                                            // branch
  :  OPENPAR expr CLOSEPAR                                      //   A
  |  ID OPENBRACKET expr CLOSEBRACKET                           //   B
  |  ID OPENPAR ((expr COMMA*)+)? CLOSEPAR                      //   C
  |  ID                                                         //   D
  |  CALLOUT OPENPAR STRING (COMMA calloutarg+ COMMA)? CLOSEPAR //   E
  |  constant                                                   //   F
  ;

Now suppose the (future) parser has to handle input like this:

ID OPENPAR expr CLOSEPAR

ANTLR does not know how the parser should handle it. It could be parsed in two different ways:

branch D followed by branch A
branch C

That is the source of the ambiguity ANTLR is complaining about. If you were to comment out one of the branches A, C or D, the error would disappear.

Hope that helps.

Bart Kiers 2010-10-20 06:55:52

Thanks for the help! One question I did have: why wouldn't my left-factoring take care of the ID | ID [ | ID ( problem?

Nick 2010-10-20 15:48:41

@Nick, see the **EDIT** to my answer.

Bart Kiers 2010-10-20 18:03:12
