I am trying to build an ANTLR grammar that parses tagged sentences such as:

DT The NP cat VB ate DT a NP rat

and have the grammar:

fragment TOKEN  : (('A'..'Z') | ('a'..'z'))+;
fragment WS : (' ' | '\t')+;
WSX : WS;
DTTOK   : ('DT' WS TOKEN);
NPTOK   : ('NP' WS TOKEN);
nounPhrase:  (DTTOK WSX NPTOK);
chunker : nounPhrase {System.out.println("chunk found "+"("+$nounPhrase+")");};

The grammar generator reports the error "missing attribute access on rule scope: nounPhrase" for the last line.

[I am still new to ANTLR and although some grammars work it's still trial and error. I also frequently get an "OutOfMemory" error when running grammars as small as this - any help welcome.]

I am using ANTLRWorks 1.3 to generate the code and am running under Java 1.6.

A: 

Answering my own question after finding a better way:

WS  :  (' '|'\t')+;
TOKEN   : (('A'..'Z') | ('a'..'z'))+;
dttok   : 'DT' WS TOKEN;
nntok   : 'NN' WS TOKEN; 
nounPhrase :    (dttok WS nntok);
chunker :  nounPhrase ;

The problem was that I was muddling the lexer and the parser (this is apparently a very common mistake). Rules whose names start with an uppercase letter are lexer rules; rules whose names start with a lowercase letter are parser rules. This now seems to work. (NB: I have changed NP to NN.)
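
To make the lexer/parser split concrete, here is a small hand-written Java sketch of the two stages. It is not what ANTLR generates; the class and method names (LexerParserSketch, lex, isNounPhrase) are made up for illustration, and it assumes the tags DT and NN are the only literals. The first stage turns characters into token types (the job of the uppercase rules), and the second checks the order of those token types (the job of the lowercase rules).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hand-written illustration only; ANTLR generates far more elaborate classes.
class LexerParserSketch {

    static boolean isLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    static boolean isBlank(char c) {
        return c == ' ' || c == '\t';
    }

    // Lexer stage: characters in, token types out (WS and TOKEN in the grammar).
    static List<String> lex(String input) {
        List<String> types = new ArrayList<String>();
        int i = 0;
        while (i < input.length()) {
            int start = i;
            if (isBlank(input.charAt(i))) {
                while (i < input.length() && isBlank(input.charAt(i))) i++;
                types.add("WS");
            } else if (isLetter(input.charAt(i))) {
                while (i < input.length() && isLetter(input.charAt(i))) i++;
                String word = input.substring(start, i);
                // The tag literals get their own token types, like 'DT' and 'NN'.
                types.add(word.equals("DT") || word.equals("NN") ? word : "TOKEN");
            } else {
                throw new IllegalArgumentException("Unexpected character at index " + i);
            }
        }
        return types;
    }

    // Parser stage: nounPhrase : dttok WS nntok, with dttok and nntok written out.
    static boolean isNounPhrase(List<String> types) {
        return types.equals(Arrays.asList("DT", "WS", "TOKEN", "WS", "NN", "WS", "TOKEN"));
    }

    public static void main(String[] args) {
        List<String> types = lex("DT The NN cat");
        System.out.println(types);
        System.out.println("noun phrase? " + isNounPhrase(types));
    }
}
```

The parser stage never looks at individual characters, only at token types. That is why a parser rule such as nounPhrase cannot be built out of fragment lexer rules.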

peter.murray.rust
A: 

In the original grammar, why not include the attribute it is asking for? Most likely: chunker : nounPhrase {System.out.println("chunk found "+"("+$nounPhrase.text+")");};

Each of your rules (chunker being the one I can spot quickly) has attributes (extra information) associated with it. You can find a quick list of the attributes available for each type of rule at http://www.antlr.org/wiki/display/ANTLR3/Attribute+and+Dynamic+Scopes. It would be nice if that page described each attribute. For example, the start and stop attributes of a parser rule refer to tokens from your lexer, which lets you get back to the line number and position.

I think your chunker rule just needs a slight change: instead of $nounPhrase, use $nounPhrase.text. "text" is an attribute of your "nounPhrase" rule.
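
If it helps to picture what an attribute is, here is a rough Java sketch. It is a hand-written analogy, not ANTLR's generated code: Tok, RuleMatch and sample are hypothetical names, and the token positions are typed in by hand for the input "DT The NN cat". The idea is that a rule reference like $nounPhrase stands for a bundle of values (start, stop, text and so on), so you have to say which one you want.

```java
import java.util.Arrays;
import java.util.List;

// Hand-written analogy only; these are not ANTLR's runtime classes.
class AttributeSketch {

    // Hypothetical stand-in for a lexer token: its text and where it was found.
    static class Tok {
        final String text;
        final int line;
        final int column;

        Tok(String text, int line, int column) {
            this.text = text;
            this.line = line;
            this.column = column;
        }
    }

    // Hypothetical stand-in for what a rule reference such as $nounPhrase carries.
    static class RuleMatch {
        final List<Tok> tokens; // every token the rule matched, in order
        final Tok start;        // first token the rule matched
        final Tok stop;         // last token the rule matched

        RuleMatch(List<Tok> tokens) {
            this.tokens = tokens;
            this.start = tokens.get(0);
            this.stop = tokens.get(tokens.size() - 1);
        }

        // Plays the role of the "text" attribute: the matched input as one string.
        String text() {
            StringBuilder sb = new StringBuilder();
            for (Tok t : tokens) {
                sb.append(t.text);
            }
            return sb.toString();
        }
    }

    // Tokens for "DT The NN cat", with positions filled in by hand.
    static RuleMatch sample() {
        return new RuleMatch(Arrays.asList(
            new Tok("DT", 1, 0), new Tok(" ", 1, 2), new Tok("The", 1, 3),
            new Tok(" ", 1, 6), new Tok("NN", 1, 7), new Tok(" ", 1, 9),
            new Tok("cat", 1, 10)));
    }

    public static void main(String[] args) {
        RuleMatch nounPhrase = sample();
        System.out.println("chunk found (" + nounPhrase.text() + ")");
        System.out.println("starts at line " + nounPhrase.start.line
            + ", column " + nounPhrase.start.column);
    }
}
```

In this picture, printing nounPhrase on its own would be ambiguous, which is roughly what the "missing attribute access" message is complaining about. Asking for nounPhrase.text() picks out one specific value.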

You might want to do a little other formatting as well: usually the parser rules (which start with a lowercase letter) appear before the lexer rules (which start with an uppercase letter).

PS. When I type in the box the chunker rule is starting on a new line but in my original answer it didn't start on a new line.

WayneH
Could you elucidate further, please? I am not sure what an attribute is.
peter.murray.rust