I have syntax like

%(var)

and

%var

and (var)

My rules are something like

optExpr:
    | '%''('CommaLoop')'
    | '%' CommaLoop

CommaLoop:
    val | CommaLoop',' val

Expr:
    MoreRules
    | '(' val ')'

The problem is that it doesn't seem to be able to tell whether ) belongs to %(CommaLoop) or % (val), but it complains on the ) instead of the (. What the heck? Shouldn't it complain on (? And how should I fix the error? I think making %( a token is a good solution, but I want to be sure why %( isn't an error before doing this.

A: 

Right now, your explanation and your grammar don't seem to match. In your explanation, you show all three phrases as having 'var', but your grammar shows the ones starting with '%' as allowing a comma-separated list, while the one without allows only a single 'val'.

For the moment, I'll assume all three should allow a comma-separated list. In this case, I'd factor the grammar more like this:

optExpr: '%' aList

aList: CommaLoop
    | parenList

parenList: '(' CommaLoop ')'

CommaLoop: 
    | val 
    | CommaLoop ',' val

Expr: MoreRules
    | parenList

I've changed optExpr and Expr so neither can match an empty sequence -- my guess is you probably didn't intend that to start with. I've fleshed this out enough to run it through byacc; it produces no warnings or errors.

Jerry Coffin
A: 

This is due to the way LR parsing works. LR parsing is effectively bottom-up, grouping together tokens according to the RHS of your grammar rules, and replacing them with the LHS. When the parser 'shifts', it puts a token on the stack, but doesn't actually match a rule yet. Instead, it tracks partially matched rules via the current state. When it gets to a state that corresponds to the end of the rule, it can reduce, popping the symbols for the RHS off the stack and pushing back a single symbol denoting the LHS. So if there are conflicts, they don't show up until the parser gets to the end of some rule and can't decide whether to reduce (or what to reduce).
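As a rough illustration of the shift and reduce steps described above, here is a toy sketch in Python. It is not how bison/yacc is implemented: a real LR parser picks its actions from a state table and a lookahead token, whereas this hypothetical `parse` function simply reduces whenever a right-hand side matches the top of the stack. That shortcut happens to be enough for the two CommaLoop rules, with `val` treated as a plain token.

```python
# Toy shift-reduce loop. A real LR parser consults a state table and a
# lookahead token; this sketch just reduces whenever a right-hand side
# matches the top of the stack, which happens to suffice for this grammar.
RULES = [
    ("CommaLoop", ["CommaLoop", ",", "val"]),
    ("CommaLoop", ["val"]),
]


def parse(tokens):
    """Return the final stack and a trace of (action, stack snapshot) pairs."""
    stack, trace = [], []
    for tok in tokens:
        stack.append(tok)                      # shift: no rule is chosen yet
        trace.append(("shift", list(stack)))
        reduced = True
        while reduced:
            reduced = False
            for lhs, rhs in RULES:
                if stack[-len(rhs):] == rhs:   # a full RHS sits on top of the stack
                    del stack[-len(rhs):]      # pop the RHS symbols
                    stack.append(lhs)          # push the LHS in their place
                    trace.append(("reduce", list(stack)))
                    reduced = True
                    break
    return stack, trace


if __name__ == "__main__":
    for action, snapshot in parse(["val", ",", "val"])[1]:
        print(action, snapshot)
```

Note that each shift only pushes a token; a rule is committed to only at the reduce step, once its whole right-hand side is on the stack.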

In your example, after seeing % ( val, those three symbols are what will be on the stack (the top is at the right side here). When the lookahead is ), the parser can't decide whether it should pop the val and reduce via the rule CommaLoop: val, or shift the ) so it can then pop three symbols and reduce with the rule Expr: '(' val ')'.

I'm assuming here that you have some additional rules such as CommaLoop: Expr, otherwise your grammar doesn't actually match anything and bison/yacc will complain about unused non-terminals.
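The conflict described above can be reproduced with a small LR(0) item-set sketch in Python. Several things here are assumptions rather than the asker's real grammar: `val` is treated as a token, the rule CommaLoop: Expr is the guessed extra rule, and the empty alternative of optExpr is left out. The helper names (`closure`, `goto`, `state_after`, `actions`) are made up for this illustration. Being LR(0), the sketch ignores lookahead when listing reduces; in this grammar ) can follow CommaLoop, so the conflict should remain with a one-token lookahead as well.

```python
# Grammar as (lhs, rhs) pairs; an item is (rule index, dot position).
GRAMMAR = [
    ("optExpr", ("%", "(", "CommaLoop", ")")),
    ("optExpr", ("%", "CommaLoop")),
    ("CommaLoop", ("val",)),
    ("CommaLoop", ("CommaLoop", ",", "val")),
    ("CommaLoop", ("Expr",)),            # assumed rule, not in the question
    ("Expr", ("(", "val", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """Add every rule that could begin where a dot sits before a nonterminal."""
    items = set(items)
    work = list(items)
    while work:
        rule, dot = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in items:
                    items.add((i, 0))
                    work.append((i, 0))
    return frozenset(items)


def goto(state, symbol):
    """Advance the dot over `symbol` in every item that allows it."""
    moved = {(r, d + 1) for r, d in state
             if d < len(GRAMMAR[r][1]) and GRAMMAR[r][1][d] == symbol}
    return closure(moved)


def state_after(tokens):
    """State reached by shifting `tokens` (no reduces are needed on the way)."""
    state = closure({(0, 0), (1, 0)})    # both optExpr rules, dot at the start
    for tok in tokens:
        state = goto(state, tok)
    return state


def actions(state, lookahead):
    """List the LR(0) actions this state offers when `lookahead` comes next."""
    found = []
    for r, d in sorted(state):
        lhs, rhs = GRAMMAR[r]
        if d == len(rhs):                # dot at the end: rule is complete
            found.append(f"reduce {lhs}: {' '.join(rhs)}")
        elif rhs[d] == lookahead:        # dot before the lookahead token
            found.append(f"shift {lookahead} (in {lhs}: {' '.join(rhs)})")
    return found


if __name__ == "__main__":
    print(actions(state_after(["%"]), "("))
    print(actions(state_after(["%", "(", "val"]), ")"))
```

After % the only actions on ( are shifts, since both candidate rules agree on what to do, which is why nothing is reported there. After % ( val the state offers both a reduce and a shift on ), and that is where the conflict shows up.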

Chris Dodd