views:

43

answers:

1

As asked and answered in http://stackoverflow.com/questions/2999755/removing-left-recursion-in-antlr , I could remove the left recursion

E -> E + T|T
T -> T * F|F
F -> INT | ( E )

After left recursion removal, I get the following one

E -> TE'
E' -> null | + TE'
T -> FT'
T' -> null | * FT'

Then, how to make the tree construction with the modified grammar? With the input 1+2, I want to have a tree

^('+' ^(INT 1) ^(INT 2))
. Or similar.

grammar T;

options {
    output=AST;
    language=Python;
    ASTLabelType=CommonTree;
}

start : e -> e
   ;
e  : t ep -> ???
   ;
ep : 
   | '+' t ep -> ???
   ;
t : f tp -> ???
  ;
tp : 
  | '*' f tp -> ???
  ;
f : INT 
  | '(' e ')' -> e
  ;

INT :   '0'..'9'+ ;
WS: (' '|'\n'|'\r')+ {$channel=HIDDEN;} ;
+2  A: 

A bit of opinion: although it's sometimes possible to go from an LR grammar to an LL grammar, as you have done, the result isn't as idiomatic and may seem like a strange way to define your grammar to someone familiar with LL grammars.

For example, consider the following excerpt from above:

tp : 
  | '*' f tp -> ???

The above accepts a * followed by an f whose FIRST set will contain INT or (, the start of itself as its right recursive. Thus, you'll never see the start of the expression you want rooted at *, which will make it much more difficult than it needs to be to build the tree you want.

To make it easy to create that AST in ANTLR, you want to have both the operands and the operator.

add:
   INT '+'^ INT;

The caret, ^ makes the + the root of the tree and the two INTs become its children.

The example Bart K linked to is a great example of how I'd expect to see it done with an LL grammar ... and it scales to support operators of different precedence.

Kaleb Pederson