views:

115

answers:

2

Im trying to model the EBNF expression

("declare" "namespace" ";")* ("declare" "variable" ";")*

I have built up the yacc (Im using MPPG) grammar, which seems to represent this, but it fails to match my test expression.

The test case i'm trying to match is

declare variable;

The Token stream from the lexer is

KW_Declare
KW_Variable
Separator

The grammar parse says there is a "Shift/Reduce conflict, state 6 on KW_Declare". I have attempted to solve this with "%left PrologHeaderList PrologBodyList", but neither solution works.

Program                     : Prolog;
Prolog                      : PrologHeaderList PrologBodyList;

PrologHeaderList            : /*EMPTY*/
                            | PrologHeaderList PrologHeader;
PrologHeader                : KW_Declare KW_Namespace Separator;

PrologBodyList              : /*EMPTY*/
                            | PrologBodyList PrologBody;
PrologBody                  : KW_Declare KW_Variable Separator;

KW_Declare KW_Namespace KW_Variable Separator are all tokens with values "declare", "naemsapce", "variable", ";".

+3  A: 

It's been a long time since I've used anything yacc-like, but here are a couple of suggestions that may or may not help.

It seems that you need a 2-token lookahead in this situation. The parser gets to the last PrologHeader, and it has to decide whether the next construct is a PrologHeader or a PrologBody, and it can't tell that from the KW_Declare. If there's a directive to increase lookahead in this situation, it will probably solve the problem.

You could also introduce context into your actions: rather than define PrologHeaderList and PrologBodyList, define PrologRuleList and have the actions throw an error if a header appears after a body. Ugly, but sometimes you have to do it: what appears simple in a grammar may not be simple in the generated parser.

A hackish approach might be to combine the tokens: rather than *KW_Declare* and *KW_Variable*, have your lexer recognize the space and use *KW_Declare_Variable*. Since both are keywords, you're not going to run into namespace collision problems.

kdgregory
This is the approach I adopted in the end. I figure I can do the validation in code.
Sprotty
A: 

The grammar at the top is regular so IIRC you can plot it out as a DFA (or a NDA and convert it to a DFA) and then convert the DFA to a grammar. It's bean a while so I'll leave the work as an exercise for the reader.

BCS