Hey. I'm new to ANTLR. The ANTLRWorks wizard wrote the following code for me:

grammar test;

ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
    ;

INT :   '0'..'9'+
    ;

FLOAT
    :   ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
    |   '.' ('0'..'9')+ EXPONENT?
    |   ('0'..'9')+ EXPONENT
    ;

COMMENT
    :   '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
    |   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
    ;

WS  :   ( ' '
        | '\t'
        | '\r'
        | '\n'
        ) {$channel=HIDDEN;}
    ;

STRING
    :  '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
    ;
CHAR:  '\'' ( ESC_SEQ | ~('\''|'\\') ) '\''
    ;

fragment
EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;

fragment
HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;

fragment
ESC_SEQ
    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
    |   UNICODE_ESC
    |   OCTAL_ESC
    ;

fragment
OCTAL_ESC
    :   '\\' ('0'..'3') ('0'..'7') ('0'..'7')
    |   '\\' ('0'..'7') ('0'..'7')
    |   '\\' ('0'..'7')
    ;

fragment
UNICODE_ESC
    :   '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
    ;

When I debug it, I get the following error:

[22:45:49] error(100): C:\Documents and Settings\user\Desktop\test.g:0:0: syntax error: codegen: <AST>:0:0: unexpected end of subtree

Can someone explain what the error is, where it is, and how I can fix it?

Thanks.

+1  A: 

Disclaimer: I don't know anything about ANTLR wizard.

A Google search turns up this quote:

Usually "unexpected end of subtree" means you forgot to make something a root in the parser.

This makes sense to me IF your file is supposed to specify a grammar and not just rules for lexical analysis. The first line of your file is "grammar test" so presumably this is a grammar.

A grammar lets you reduce a series of terminal symbols to a single nonterminal symbol. So for example, a very simple grammar representing fully parenthesized expressions would look like this:

P : E
E : (X)
  | E E
  | (E)
X : 'x'

Here, P is the root because all sentences eventually reduce to a P. If a sentence cannot be reduced to a P, it does not match this grammar. So you need to find a root for your grammar, and all other productions should appear only in the context of the root (i.e., via a direct or indirect derivation).

danben
Note that since ANTLR produces LL(*) parsers, it can't cope with the left-recursive grammar you posted. http://www.antlr.org/wiki/display/ANTLR3/Left-Recursion+Removal
Bart Kiers
Ah, ok - with ANTLR ending in "LR" I just assumed.
danben
:) true, the name suggests otherwise. ANTLR stands for "ANother Tool for **L**anguage **R**ecognition".
Bart Kiers
+4  A: 

In ANTLR, every rule whose name starts with an uppercase letter is a lexer rule. The ones whose names start with a lowercase letter are parser rules. As you can see, you only have lexer rules, and that's your problem: you need at least one parser rule. If you add the following rule:

parse
  :  ID
  |  INT
  |  // ...
  ;

the error will disappear when generating source files for your lexer/parser.
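
For illustration, a trimmed-down combined grammar with such a rule might look like the sketch below. The rule names parse and atom are arbitrary choices here (any name starting with a lowercase letter would make them parser rules), only three of the lexer rules from the question are kept for brevity, and ANTLR 3 syntax is assumed, as in the question:

```
grammar test;

// Parser rules: names start with a lowercase letter.
// 'parse' is a hypothetical entry point that accepts any
// sequence of identifiers and integers up to the end of input.
parse
  :  atom* EOF
  ;

atom
  :  ID
  |  INT
  ;

// Lexer rules: names start with an uppercase letter.
// These three are copied from the grammar in the question.
ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
    ;

INT :   '0'..'9'+
    ;

WS  :   ( ' '
        | '\t'
        | '\r'
        | '\n'
        ) {$channel=HIDDEN;}
    ;
```

The remaining lexer rules from the question (FLOAT, COMMENT, STRING, and so on) can stay as they are; what matters for this error is that at least one parser rule is present.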

Bart Kiers