views:

126

answers:

1

Parsing binary sums / products are easy, but I'm having troubles defining a grammar that parses

a + b * c + d + e

as

sum(a, prod(b, c), d, e)

My initial (naive) attempt generated 61 shift / reduce conflicts.

I'm using java cup (but I suppose a solution for any other parser generator would be easily translated).

+1  A: 

The following ANTLR grammar:

parse
  :  exp EOF
  ;

exp
  :  add_exp
  ;

add_exp
  :  mul_exp ('+' mul_exp)* 
  ;

mul_exp
  :  atom ('*' atom)* 
  ;

atom
  :  Number
  |  '(' exp ')'
  ;

Number
  :  'a'..'z'
  ;

parses the input a + b * c + d + e as:

alt text

As you can see, the mul_exp is the furthest away in the tree and (using an appropriate "walk" through your tree) will be evaluated first.

and the input a + b * (c + d) + e is parsed as:

alt text

The images were generated with ANTLRWorks.

EDIT:

A tool like ANTLRWorks makes debugging a grammar a breeze! For example, if I click on the atom rule in the grammar above, the following is automatically generated and displayed at the bottom of the screen:

alt text

Of course, that rule isn't complex at all, but when you do get to work with more complex rules, it's pretty darn easy to visualize them like that.

HTH.

Bart Kiers
Beautiful answer. I can see right away why this works. Thanks!
aioobe
I didn't know about ANTLRWorks.. thanks for the link as well.
aioobe
@aioobe, good to hear that, and you're welcome.
Bart Kiers