I have strings like this:

`(val1, val2, val3)`

And I have an ANTLR grammar to parse them:

grammar TEST;

tokens {
 ORB = '(';
 CRB = ')';
 COMA = ',';
}

@members{

}
/*Parser rule*/
mainRule 
    :    ORB WORD (COMA WORD)* CRB;

/*Lexer rule*/

WORD    :    ('a'..'z'|'A'..'Z'|'0'..'9')+;

WS      :   ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+ { $channel = HIDDEN; };

Now I need to map all the WORD tokens to Java values. How can I bind each value when the target tokens are enclosed in parentheses?

Many thanks!

A: 

You can get the textual value of a matched token with the .text property. Like this:

 String s = $WORD.text;

Depending on what your overall grammar is supposed to do, it may be proper to add the $WORD.text string to an internal list, pass it to some other function, or to turn it into the return value from mainRule. For example, if you want mainRule to give you back the list of parsed strings, you can write the following:

mainRule returns [List strings] @init { $strings = new Vector(); }:
    ORB 
    WORD { $strings.add($WORD.text); } 
    ( COMMA WORD { $strings.add($WORD.text); } )*
    CRB
    ;
JSBangs
Not sure if you forgot the `@` before `init`, or if it's ANTLR v2 syntax. In ANTLR v3, you should put an `@` in front of it.
Bart Kiers
Thanks! I didn't know about the @init keyword. Your answer is very helpful to me. P.S. There are some errors in your example; this is the corrected code: mainRule2 returns [List strings] @init { $strings = new Vector(); }: ORB val0=WORD { $strings.add($val0.text); } ( COMA val1=WORD { $strings.add($val1.text); } )* CRB;
glebreutov
A: 

Sorry, could you elaborate a bit on what you are trying to do? As I understand it, you want to bind each word to a Java variable, correct?

words+=WORD (COMA words+=WORD)* {$words}

Here you define a label words (which is actually a list) and add every occurrence of WORD to that label using the += syntax. You can then refer to this label as $words, as shown.

Have a look at the ANTLR documentation and search for labels. If you want to do something sophisticated with your parser, I recommend Terence Parr's book on ANTLR. It has a very good introductory chapter on the general topic of parsing and is the best reference for ANTLR.

HTH

er4z0r
Thanks! This is exactly what I meant! Sorry for the poorly formulated question; English is not my native language.
glebreutov
You're welcome. If the answer helps you, don't forget to mark it. I don't think SO-flair points are traded on eBay (yet) ;-)
er4z0r
A:

Pretty much the same as JS Bangs' answer, but here's a complete SSCCE you can compile and run. It also shows how you can "label" your tokens and access them to put them in the List that mainRule returns. Also note that init needs an @ sign in front of it (at least ANTLR v3 expects it).

grammar Test;

@parser::members {
  public static void main(String[] args) throws Exception {
    String text = "(a, bb ,  ccc )";
    ANTLRStringStream in = new ANTLRStringStream(text);
    TestLexer lexer = new TestLexer(in);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    System.out.println(new TestParser(tokens).mainRule());
  }
}

mainRule returns [List<String> words]
@init{$words = new ArrayList<String>();}
  :  '(' w=WORD {$words.add($w.text);} (',' w=WORD {$words.add($w.text);} )* ')'
  ;

WORD
  :  ('a'..'z'|'A'..'Z'|'0'..'9')+
  ;

WS
  :  ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+ { $channel = HIDDEN; }
  ;

And then:

bart@hades:~/Temp$ java -cp antlr-3.2.jar org.antlr.Tool Test.g
bart@hades:~/Temp$ javac -cp antlr-3.2.jar *.java
bart@hades:~/Temp$ java -cp .:antlr-3.2.jar TestParser
[a, bb, ccc]
bart@hades:~/Temp$ 

On Windows the commands above are pretty much the same, only run your TestParser like this:

java -cp .;antlr-3.2.jar TestParser

(there's a semi-colon instead of a regular colon)
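
If you don't have the ANTLR jar at hand, the effect of the labeled tokens and the @init action can be mimicked in plain Java. The following is only a hand-written sketch of the same rule, not the code ANTLR generates; the class name WordListSketch and its helper methods are made up for illustration. Like mainRule, it does not check for end of input after the closing parenthesis.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written stand-in for the generated parser.
class WordListSketch {
    private final String input;
    private int pos = 0;

    private WordListSketch(String input) {
        this.input = input;
    }

    // mainRule : '(' WORD (',' WORD)* ')'
    public static List<String> parse(String input) {
        WordListSketch p = new WordListSketch(input);
        List<String> words = new ArrayList<>();  // same role as the @init action
        p.expect('(');
        words.add(p.word());                     // like {$words.add($w.text);}
        while (p.peek() == ',') {
            p.pos++;                             // consume ','
            words.add(p.word());
        }
        p.expect(')');
        return words;
    }

    // Skips whitespace (the WS rule) and returns the next char, or '\0' at end of input.
    private char peek() {
        while (pos < input.length() && " \t\r\n\f".indexOf(input.charAt(pos)) >= 0) {
            pos++;
        }
        return pos < input.length() ? input.charAt(pos) : '\0';
    }

    // Consumes the given character or fails.
    private void expect(char c) {
        if (peek() != c) {
            throw new IllegalArgumentException("expected '" + c + "' at " + pos);
        }
        pos++;
    }

    // WORD : ('a'..'z'|'A'..'Z'|'0'..'9')+
    private String word() {
        peek();  // skip leading whitespace
        int start = pos;
        while (pos < input.length() && isWordChar(input.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("expected WORD at " + pos);
        }
        return input.substring(start, pos);
    }

    private static boolean isWordChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    public static void main(String[] args) {
        System.out.println(parse("(a, bb ,  ccc )"));  // prints [a, bb, ccc]
    }
}
```

For anything beyond a flat list like this, the generated parser is the better choice; the sketch is only meant to show where each word ends up.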

Bart Kiers
Thanks! Your answer is very useful to me!
glebreutov
You're welcome.
Bart Kiers