ansaurus

Question

Answer 1

+4 A:

The problem you're describing is the classic limitation of LR(0) parsers, that is, bottom-up parsers that don't look ahead at any symbols beyond the one they are currently parsing. The grammar you've described doesn't appear to be an LR(0) grammar, which is why you run into trouble when trying to parse it without lookahead. It does appear to be LR(1), however, so by looking one symbol ahead in the input you can determine whether to shift or reduce. In this case, an LR(1) parser with the 1 on the stack would look ahead, see that the next symbol is a +, and realize that it should reduce only as far as A, since A is the only thing it could reduce to that would still match a rule with + in the second position.
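To make the lookahead reasoning concrete, here is a minimal sketch in Python. The question's grammar isn't reproduced here, so the grammar below (`S -> A`, `A -> A + M | M`, `M -> M * num | num`) is an assumption, and so are all the names (`GRAMMAR`, `first_sets`, `follow_sets`). It computes FOLLOW sets, which is the SLR(1) simplification of LR(1) lookahead rather than the full LR(1) construction: a reduction to a nonterminal is only considered when the next token can legally follow that nonterminal.

```python
# Hypothetical grammar, assumed for illustration only:
#   S -> A
#   A -> A + M | M
#   M -> M * num | num
GRAMMAR = {
    "S": [["A"]],
    "A": [["A", "+", "M"], ["M"]],
    "M": [["M", "*", "num"], ["num"]],
}
START = "S"
END = "$"  # end-of-input marker


def first_sets(grammar):
    """FIRST sets, assuming the grammar has no empty productions."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                head = prod[0]
                # A terminal starts with itself; a nonterminal with its FIRST set.
                new = first[head] if head in grammar else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def follow_sets(grammar, start):
    """FOLLOW sets: the terminals that may appear right after each nonterminal."""
    first = first_sets(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                for i, sym in enumerate(prod):
                    if sym not in grammar:
                        continue  # only nonterminals have FOLLOW sets
                    if i + 1 < len(prod):
                        nxt = prod[i + 1]
                        new = first[nxt] if nxt in grammar else {nxt}
                    else:
                        # Last symbol of the rule inherits FOLLOW of the left-hand side.
                        new = follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FOLLOW = follow_sets(GRAMMAR, START)
for nonterminal in GRAMMAR:
    print(nonterminal, sorted(FOLLOW[nonterminal]))
```

Under these assumptions, `+` is in FOLLOW(A) but not in FOLLOW(S), which matches the reasoning above: with `+` as the next symbol, the parser can reduce up to A and no further.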

An interesting property of LR grammars is that for any grammar which is LR(k) for k>1, it is possible to construct an equivalent LR(1) grammar, that is, one that generates the same language. However, the same does not extend all the way down to LR(0): there are many grammars which cannot be converted to LR(0).

See here for more details on LR(k)-ness:

http://en.wikipedia.org/wiki/LR_parser

Amber 2010-04-13 03:50:24

If I parse 1+2*3, the stack ends up at A+M at one point, by my understanding. That could be reduced to A, but that would be incorrect here, as it would yield A*..., for which no rule exists. Does looking ahead by 1 symbol indicate that this reduction should not occur as well? I added more detail on this to the original post.

Joey Adams 2010-04-13 04:20:23

Yes, it does - because when you have `A+M` on the stack, and you look ahead to `*`, you see that you *must* have an `M` to the left of the `*`, so you know not to reduce if that would result in the top of the stack not being `M`.

Amber 2010-04-13 04:24:01
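The exchange above about `A+M` with `*` as the lookahead can be sketched as a tiny shift-reduce loop. This is a rough illustration, not how Yacc or Bison work internally: the grammar (`S -> A`, `A -> A + M | M`, `M -> M * num | num`) is assumed because the question's grammar isn't shown here, the FOLLOW sets are hard-coded for that grammar, and trying longer rules first is a crude stand-in for the states of a real LR automaton. The names `RULES`, `FOLLOW`, `tokenize`, and `parse` are all hypothetical.

```python
import re

# Hypothetical grammar, assumed for illustration only. Longer right-hand
# sides are listed first so that "A + M" is tried before "M" on its own.
RULES = [
    ("A", ["A", "+", "M"]),
    ("M", ["M", "*", "num"]),
    ("M", ["num"]),
    ("A", ["M"]),
    ("S", ["A"]),
]

# Terminals that may follow each nonterminal in the assumed grammar.
FOLLOW = {"S": {"$"}, "A": {"+", "$"}, "M": {"+", "*", "$"}}


def tokenize(text):
    """Turn '1+2*3' into ['num', '+', 'num', '*', 'num', '$']."""
    raw = re.findall(r"\d+|[+*]", text)
    return ["num" if tok.isdigit() else tok for tok in raw] + ["$"]


def parse(text):
    """Return the list of shift/reduce actions taken for the input."""
    tokens = tokenize(text)
    stack, pos, trace = [], 0, []
    while True:
        lookahead = tokens[pos]
        for lhs, rhs in RULES:
            # Reduce only if the rule matches the top of the stack AND the
            # lookahead is allowed to follow the resulting nonterminal.
            if stack[-len(rhs):] == rhs and lookahead in FOLLOW[lhs]:
                del stack[-len(rhs):]
                stack.append(lhs)
                trace.append(f"reduce {lhs} -> {' '.join(rhs)}")
                break
        else:
            # No reduction applies: accept, fail, or shift the lookahead.
            if stack == ["S"] and lookahead == "$":
                return trace
            if lookahead == "$":
                raise SyntaxError(f"cannot parse {text!r}; stack is {stack}")
            stack.append(lookahead)
            pos += 1
            trace.append(f"shift {lookahead}")


for action in parse("1+2*3"):
    print(action)
```

With `A + M` on the stack and `*` as the lookahead, the `A -> A + M` rule matches but is skipped because `*` is not in FOLLOW(A), so the loop shifts `*` instead. Once the input is exhausted, the same rule fires because `$` is in FOLLOW(A).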

Answer 2

+1 A:

I'm not exactly sure of the Yacc/Bison parsing algorithm and when it prefers shifting over reducing; however, I know that Bison generates parsers that use one token of lookahead (LALR(1) by default). This means that tokens aren't pushed onto the stack immediately. Rather, the next token waits as the lookahead until no more reductions can happen. Then, if shifting it makes sense, the parser applies that operation.

First of all, in your case, if you're evaluating 1 + 2, it will shift 1. It will reduce that token to an A because the '+' lookahead token indicates that it's the only valid course. Since there are no more reductions, it will shift the '+' token onto the stack and hold 2 as the lookahead. It will then shift the 2, reduce it to an M, and, since the expression is complete, reduce A + M to an A.

aduric 2010-04-13 04:11:51


Shift-reduce: when to stop reducing?
