views:

1183

answers:

2

I'm trying to parse a simple grammar using an LALR(1) parser generator (Bison, but the problem is not specific to that tool), and I'm hitting a shift-reduce conflict. The docs and other sources I've found about fixing these tend to say one or more of the following:

  • If the grammar is ambiguous (e.g. if-then-else ambiguity), change the language to fix the ambiguity.
  • If it's an operator precedence issue, specify precedence explicitly.
  • Accept the default resolution and tell the generator not to complain about it.

However, none of these seem to apply to my situation: the grammar is unambiguous so far as I can tell (though of course it's ambiguous with only one character of lookahead), it has only one operator, and the default resolution leads to parse errors on correctly-formed input. Are there any techniques for reworking the definition of a grammar to remove shift-reduce conflicts that don't fall into the above buckets?

For concreteness, here's the grammar in question:

%token LETTER

%%
%start input;
input:          /* empty */ | input input_elt;
input_elt:      rule | statement;
statement:      successor ';';
rule:           LETTER "->" successor ';';
successor:      /* empty */ | successor LETTER;
%%

The intent is to parse semicolon-separated lines of the form "[A-Za-z]+" or "[A-Za-z] -> [A-Za-z]+".

+1  A: 

Thanks for whittling down the grammar and posting it. Changing the successor rule to -

successor:      /* empty */ | LETTER successor;

...worked for me. ITYM the language looked unambiguous.

Matt McNaughton
That's a right-recursive rule - which has implications (primarily on the number of letters that you can find before the stack overflows). It does resolve the shift/reduce conflict, though.
Jonathan Leffler
+1  A: 

Using the Solaris version of yacc, I get:

1: shift/reduce conflict (shift 5, red'n 7) on LETTER
state 1
    $accept :  input_$end
    input :  input_input_elt
    successor : _    (7)

    $end  accept
    LETTER  shift 5
    .  reduce 7

    input_elt  goto 2
    rule  goto 3
    statement  goto 4
    successor  goto 6

So, the trouble is, as it very often is, the empty rule - specifically, the empty successor. It isn't completely clear whether you want to allow a semi-colon as a valid input - at the moment, it is. If you modified the successor rule to:

successor: LETTER | successor LETTER;

the shift/reduce conflict is eliminated.

Jonathan Leffler