Hello again, Stackoverflow.

Continuing my journey into ANTLR (my previous questions may provide additional clues about what I'm trying to achieve: Q1 - How do I make a tree parser and Q2 - Solving LL recursion problem), I've hit yet another roadblock I cannot fathom.

Basically (I believe) the expression rule in my grammar needs to decide whether or not to create a new root node, depending on the number of datatypes it has matched. I have put together an example to best describe what I mean:

Given the following input:

ComplexFunction(id="Test" args:[1, 25 + 9 + 8, true, [1,2,3]])

I get this tree:

http://img25.imageshack.us/img25/2273/treeka.png

For reference: the first element in the "args" array has been parsed correctly, whereas the 2nd element, '25 + 9 + 8', has not. It appears to only match the last 2 parts of the expression (9 + 8).

I'm trying to get the 2nd element of the array to be an EXPRESSION node with the 3 children 25, 9, and 8.

I'm honestly stuck and need your help (Again). Thanks for your time :)

For reference, here is my grammar:

grammar Test;

options {output=AST;ASTLabelType=CommonTree;}
tokens {FUNCTION; NAME; ATTRIBUTES; ATTRIBUTE; VALUE; CHILDREN; EXPRESSION;}

program  : function ;
function :  ID (OPEN_BRACKET (attribute (COMMA? attribute)*)? CLOSE_BRACKET)? (OPEN_BRACE function* CLOSE_BRACE)? SEMICOLON? -> ^(FUNCTION ^(NAME ID) ^(ATTRIBUTES attribute*) ^(CHILDREN function*)) ;

attribute : ID (COLON | EQUALS) expression -> ^(ATTRIBUTE ^(NAME ID) ^(VALUE expression));

expression : datatype (PLUS datatype)* -> datatype ^(EXPRESSION datatype+)?;

datatype : ID  ->  ^(STRING["ID"] ID)
   | NUMBER -> ^(STRING["NUMBER"] NUMBER)
   |  STRING  -> ^(STRING["STRING"] STRING)
   |   BOOLEAN ->  ^(STRING["BOOLEAN"] BOOLEAN)
   |   array -> ^(STRING["ARRAY"] array)
   |   lookup  ->  ^(STRING["LOOKUP"] lookup) ;

array  :  OPEN_BOX (expression (COMMA expression)*)? CLOSE_BOX -> expression* ;

lookup  : OPEN_BRACE (ID (PERIOD ID)*) CLOSE_BRACE -> ID* ;

NUMBER
 : ('+' | '-')? (INTEGER | FLOAT)
 ;

STRING
    :  '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
    ;

BOOLEAN
 : 'true' | 'TRUE' | 'false' | 'FALSE'
 ;

ID  : (LETTER|'_') (LETTER | INTEGER |'_')*
    ;

COMMENT
    :   '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
    |   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
    ;

WHITESPACE : (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;} ;

COLON : ':' ;
SEMICOLON : ';' ;

COMMA : ',' ;
PERIOD  :  '.' ;
PLUS : '+' ;
EQUALS : '=' ; 

OPEN_BRACKET : '(' ;
CLOSE_BRACKET : ')' ;

OPEN_BRACE : '{' ; 
CLOSE_BRACE : '}' ;

OPEN_BOX : '[' ;
CLOSE_BOX : ']' ;

fragment
LETTER
 : 'a'..'z' | 'A'..'Z' 
 ;

fragment
INTEGER
 : '0'..'9'+
 ;

fragment
FLOAT
 : INTEGER+ '.' INTEGER*
 ;

fragment
ESC_SEQ
    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
    ;
A: 

Haha! I think I got it! If anyone else has a similar problem, take a look at my new grammar:

grammar Test;

options {output=AST;ASTLabelType=CommonTree;}
tokens {FUNCTION; ATTRIBUTES; ATTRIBUTE; VALUE; CHILDREN; EXPRESSION;}

@parser::members { int dataTypeCount = 0; }

program     :   function ;
function    :   ID (OPEN_BRACKET (attribute (COMMA? attribute)*)? CLOSE_BRACKET)? (OPEN_BRACE function* CLOSE_BRACE)? SEMICOLON? -> ^(FUNCTION ^(ID["ID"] ID) ^(ATTRIBUTES attribute*) ^(CHILDREN function*)) ;

attribute   :   ID (COLON | EQUALS) expression -> ^(ATTRIBUTE ^(ID["ID"] ID) ^(VALUE expression));

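// Count the operands while matching: a lone operand passes through unwrapped,
// two or more become children of a single EXPRESSION root.
expression  :   datatype {dataTypeCount = 1;} (PLUS datatype {dataTypeCount++;})*
                -> {dataTypeCount == 1}? datatype*
                -> ^(EXPRESSION datatype*) ;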

datatype    :   ID      ->  ^(STRING["ID"] ID)
            |   NUMBER  ->  ^(STRING["NUMBER"] NUMBER)
            |   STRING  ->  ^(STRING["STRING"] STRING)
            |   BOOLEAN ->  ^(STRING["BOOLEAN"] BOOLEAN)
            |   array   ->  ^(STRING["ARRAY"] array)
            |   lookup  ->  ^(STRING["LOOKUP"] lookup) ;

array       :   OPEN_BOX (expression (COMMA expression)*)? CLOSE_BOX -> expression* ;

lookup      :   OPEN_BRACE (ID (PERIOD ID)*) CLOSE_BRACE -> ID* ;

NUMBER
    :   ('+' | '-')? (INTEGER | FLOAT)
    ;

STRING
    :  '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
    ;

BOOLEAN
    :   'true' | 'TRUE' | 'false' | 'FALSE'
    ;

ID  :   (LETTER|'_') (LETTER | INTEGER |'_')*
    ;

COMMENT
    :   '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
    |   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
    ;

WHITESPACE  :   (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;} ;

COLON   :   ':' ;
SEMICOLON   :   ';' ;

COMMA   :   ',' ;
PERIOD  :   '.' ;
PLUS    :   '+' ;
EQUALS  :   '=' ;   

OPEN_BRACKET    :   '(' ;
CLOSE_BRACKET   :   ')' ;

OPEN_BRACE  :   '{' ;   
CLOSE_BRACE :   '}' ;

OPEN_BOX    :   '[' ;
CLOSE_BOX   :   ']' ;

fragment
LETTER
    :   'a'..'z' | 'A'..'Z' 
    ;

fragment
INTEGER
    :   '0'..'9'+
    ;

fragment
FLOAT
    :   INTEGER+ '.' INTEGER*
    ;

fragment
ESC_SEQ
    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
    ;
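
For anyone wondering why the original rewrite lost the 25: as far as I understand ANTLR 3 rewrites, every reference to `datatype` on the right of `->` pulls the next matched element from the same stream. So in `datatype ^(EXPRESSION datatype+)?` the bare `datatype` takes the first operand and the `datatype+` inside the subtree only sees what is left, which would explain an EXPRESSION node holding just 9 and 8. Below is a toy Python model of the two rewrites. It is not ANTLR code; the function names and the tuple-based tree shape are made up purely for illustration.

```python
# Toy model (not ANTLR itself) of how the two rewrite rules hand out the
# matched `datatype` elements. A tree is a plain tuple: (label, children).

def rewrite_original(operands):
    """Models: -> datatype ^(EXPRESSION datatype+)?"""
    stream = iter(operands)
    result = [next(stream)]      # the bare `datatype` takes the first element
    rest = list(stream)          # `datatype+` only sees what is left over
    if rest:                     # the trailing `?` drops the subtree when empty
        result.append(("EXPRESSION", rest))
    return result


def rewrite_fixed(operands):
    """Models: -> {dataTypeCount == 1}? datatype*   else   -> ^(EXPRESSION datatype*)"""
    if len(operands) == 1:       # predicate holds: emit the lone operand as-is
        return list(operands)
    return [("EXPRESSION", list(operands))]


print(rewrite_original([25, 9, 8]))  # [25, ('EXPRESSION', [9, 8])]
print(rewrite_fixed([25, 9, 8]))     # [('EXPRESSION', [25, 9, 8])]
print(rewrite_fixed([1]))            # [1]
```

The predicate-gated version never mixes a bare `datatype` with a subtree in the same rewrite, so all operands end up under one root.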
Richie_W