ansaurus

Question

Answer 1

A:

You can do it using regular expressions.

You can define tokens and a rule as follows:

TOKEN :
{
< #DIGIT : [ "0"-"9" ] >
| < #ALPHABET: ["a" - "z"] >
| < #CAPSALPHABET: ["A" - "Z"] >
| < WORD: ( <DIGIT> | <ALPHABET> | <CAPSALPHABET>)+ >
}

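String comment() :
{
  Token token;
  StringBuilder text = new StringBuilder();
}
{
  // A token variable can only be assigned from a single token, so the
  // assignment goes inside the loop rather than on the ( ... )+ group.
  ( token=<WORD> { text.append(token.image); } )+
  {
    return text.toString();
  }
}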

athena 2010-10-05 15:55:39

Yes, that was what I was thinking at first, but here is the problem. Taking your grammar as a reference: if I add the TOKEN `<USELESS_SPEECH> : 'bla'`, the parser will return `<USELESS_SPEECH>` instead of `<WORD>` (depending on which one was defined first), and the compiler will warn: "careful, the word "bla" will be matched by token `<USELESS_SPEECH>`".

BlackLabrador 2010-10-06 00:26:47

@BlackLabrador: I don't fully understand your problem. Can you give more details about your expected output and the output your current parser gives? You want to identify a comment, right? Can you also tell me what output the parser gave with the grammar from my answer?

athena 2010-10-06 14:59:23

Answer 2

A:

I think the usual procedure here is to use lexical states with MORE and either SKIP or SPECIAL_TOKEN. You can see an example of this in the way comments are handled by the Java grammar that comes with the JavaCC source distribution.
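
As a rough sketch of that approach (not the exact text of the Java grammar, and assuming comments are delimited by `/*` and `*/`, which the question may or may not use), the lexical-state setup could look something like this. The state name `IN_COMMENT` and the token name `COMMENT` are made up for illustration.

```javacc
/* Whitespace is discarded outright. */
SKIP :
{
  " " | "\t" | "\r" | "\n"
}

/* Seeing the opening delimiter starts accumulating text and
   switches the token manager to the (hypothetical) IN_COMMENT state. */
MORE :
{
  "/*" : IN_COMMENT
}

/* Inside the comment, every single character is appended to the
   text accumulated so far; no token is produced yet. */
<IN_COMMENT>
MORE :
{
  < ~[] >
}

/* The closing delimiter is longer than the one-character match above,
   so it wins. Everything accumulated since the opening delimiter becomes
   one COMMENT special token, and lexing returns to DEFAULT. */
<IN_COMMENT>
SPECIAL_TOKEN :
{
  < COMMENT : "*/" > : DEFAULT
}
```

Because tokens such as `<WORD>` are only declared in the `DEFAULT` state, they are not tried while the token manager is in `IN_COMMENT`, so keywords inside a comment should not clash with them. With `SPECIAL_TOKEN` the comment text stays reachable through the `specialToken` field of the next regular token; replacing `SPECIAL_TOKEN` with `SKIP` in the last block would throw the comment away instead.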

tomcopeland 2010-10-07 10:21:03


Parser in JavaCC and SKIP instruction
