views:

56

answers:

2

I have this grammar of a C# like language, and I want to make a parser for it, but when I put the grammar it tells me about Shift/Reduce conflicts. I tried to fix some but I can't seem to find another way to improve this grammar. Any help would be greatly appreciated :D Here's the grammar:

Program: Decl
       | Program Decl
       ;

Decl: VariableDecl
    | FunctionDecl
    | ClassDecl
    | InterfaceDecl
    ;

VariableDecl: Variable SEMICOLON
            ;

Variable: Type IDENTIFIER
        ;

Type: TOKINT
    | TOKDOUBLE
    | TOKBOOL
            | TOKSTRING
    | IDENTIFIER
    | Type BRACKETS
    ;

FunctionDecl: Type IDENTIFIER OPARENS Formals CPARENS StmtBlock
            | TOKVOID IDENTIFIER OPARENS Formals CPARENS StmtBlock
            ;

Formals: VariablePlus
       | /* epsilon */
       ;

VariablePlus: Variable
            | VariablePlus COMMA Variable
            ;

ClassDecl: TOKCLASS IDENTIFIER OptExtends OptImplements OBRACE ListaField CBRACE
         ;

OptExtends: TOKEXTENDS IDENTIFIER
          | /* epsilon */
          ;

OptImplements: TOKIMPLEMENTS ListaIdent
             | /* epsilon */
             ;

ListaIdent: ListaIdent COMMA IDENTIFIER
          | IDENTIFIER
          ;

ListaField: ListaField Field
          | /* epsilon */
          ;

Field: VariableDecl
     | FunctionDecl
     ;

InterfaceDecl: TOKINTERFACE IDENTIFIER OBRACE ListaProto CBRACE
             ;

ListaProto: ListaProto Prototype
          | /* epsilon */
          ;

Prototype: Type IDENTIFIER OPARENS Formals CPARENS SEMICOLON
         | TOKVOID IDENTIFIER OPARENS Formals CPARENS SEMICOLON
         ;

StmtBlock: OBRACE ListaOptG CBRACE
         ;

ListaOptG: /* epsilon */
         | VariableDecl ListaOptG
         | Stmt ListaOptG
         ;

Stmt: OptExpr SEMICOLON
    | IfStmt
    | WhileStmt
    | ForStmt
    | BreakStmt
    | ReturnStmt
    | PrintStmt
    | StmtBlock
    ;

OptExpr: Expr
       | /* epsilon */
       ;

IfStmt: TOKIF OPARENS Expr CPARENS Stmt OptElse
      ;

OptElse: TOKELSE Stmt
       | /* epsilon */
       ;

WhileStmt: TOKWHILE OPARENS Expr CPARENS Stmt
         ;

ForStmt: TOKFOR OPARENS OptExpr SEMICOLON Expr SEMICOLON OptExpr CPARENS Stmt
       ;

ReturnStmt: TOKRETURN OptExpr SEMICOLON
          ;

BreakStmt: TOKBREAK SEMICOLON
         ;

PrintStmt: TOKPRINT OPARENS ListaExprPlus CPARENS SEMICOLON
         ;

ListaExprPlus: Expr
             | ListaExprPlus COMMA Expr
             ;

Expr: LValue LOCATION Expr
    | Constant
    | LValue
    | TOKTHIS
    | Call
    | OPARENS Expr CPARENS
    | Expr PLUS Expr
    | Expr MINUS Expr
    | Expr TIMES Expr
    | Expr DIVIDED Expr
    | Expr MODULO Expr
    | MINUS Expr
    | Expr LESSTHAN Expr
    | Expr LESSEQUALTHAN Expr
    | Expr GREATERTHAN Expr
    | Expr GREATEREQUALTHAN Expr
    | Expr EQUALS Expr
    | Expr NOTEQUALS Expr
    | Expr AND Expr
    | Expr OR Expr
    | NOT Expr
    | TOKNEW OPARENS IDENTIFIER CPARENS
    | TOKNEWARRAY OPARENS Expr COMMA Type CPARENS
    | TOKREADINTEGER OPARENS CPARENS
    | TOKREADLINE OPARENS CPARENS
    | TOKMALLOC OPARENS Expr CPARENS
    ;

LValue: IDENTIFIER
      | Expr PERIOD IDENTIFIER
      | Expr OBRACKET Expr CBRACKET
      ;

Call: IDENTIFIER OPARENS Actuals CPARENS
    | Expr PERIOD IDENTIFIER OPARENS Actuals CPARENS
    | Expr PERIOD LibCall OPARENS Actuals CPARENS
    ;

LibCall: TOKGETBYTE OPARENS Expr CPARENS
       | TOKSETBYTE OPARENS Expr COMMA Expr CPARENS
       ;

Actuals: ListaExprPlus
       | /* epsilon */
       ;

Constant: INTCONSTANT
        | DOUBLECONSTANT
        | BOOLCONSTANT
        | STRINGCONSTANT
        | TOKNULL
        ;
+2  A: 

conflicts (shift/reduce or reduce/reduce) mean that your grammar is not LALR(1) so can't be handled by bison directly without help. There are a number of immediately obvious problems:

  • expression ambiguity -- there's no precedence in the grammar, so things like a + b * c are ambiguous. You can fix this by adding precedence rules, or by splitting the Expr rule into separate AdditiveExpr, MultiplicativeExpr, ConditionalExpr etc rules.

  • dangling else ambiguity -- if (a) if (b) x; else y; -- the else could be matched with either if. You can either ignore this if the default shift is correct (it usually is for this specific case, but ignoring errors is always dangerous) or split the Stmt rule

There are many books on grammars and parsing that will help with this.

Chris Dodd
+1  A: 

The old Bison version on my school's server says you have 241 shift/reduce conflicts. One is the dangling if/else statement. Putting "OptElse" does NOT solve it. You should just write out the IfStmt and an IfElseStmt and then use %nonassoc and %prec options in bison to fix it.

Your expressions are the issue of almost all of the other 240 conflicts. What you need to do is either force precedence rules (messy and a terrible idea) or break your arithmetic expressions into stuff like:

AddSubtractExpr: AddSubtractExpr PLUS MultDivExpr | ....
               ;


MultDivExpr: MultiDivExpr TIMES Factor | ....
           ;


Factor: Variable | LPAREN Expr RPAREN | call | ...
      ;

Since Bison produces a bottom up parser, something like this will give you correct order of operations. If you have a copy of the first edition of the Dragon Book, you should look at the grammar in Appendix A. I believe the 2nd edition also has similar rules for simple expressions.

Kizaru