Hi,

I'm using JavaCC to build a complex parser. At one point, I would like to skip every character until a particular token in my grammar appears. Take the following input as an example:

/* bla bla bla bla bla bla bla bla */

I would like to define something like this in the grammar:

<OPEN_COMMENT> SKIP ~[] until <CLOSE_COMMENT>

This should hold even if "bla" is a regular token.

Thanks for your help

A: 

You can do it using regular expressions.

You can define tokens and a rule as follows:

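TOKEN :
{
  < #DIGIT : ["0"-"9"] >
| < #ALPHABET : ["a"-"z"] >
| < #CAPSALPHABET : ["A"-"Z"] >
| < WORD : ( <DIGIT> | <ALPHABET> | <CAPSALPHABET> )+ >
}

// Concatenates the images of one or more consecutive WORD tokens.
String comment() :
{
  Token token;
  StringBuilder text = new StringBuilder();
}
{
  ( token = <WORD> { text.append(token.image); } )+
  {
    return text.toString();
  }
}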
athena
Yes, that was what I was thinking at first, but there is a problem. Taking your grammar as a reference: if I add the TOKEN `<USELESS_SPEECH> : 'bla'`, the parser will return `<USELESS_SPEECH>` instead of `<WORD>` (depending on which was defined first), and the compiler will warn: "careful, the word "bla" will be matched by token `<USELESS_SPEECH>`".
BlackLabrador
@BlackLabrador: I am not able to understand your problem clearly. Can you give more details about your expected output and the output your current parser gives? You want to identify a comment, right? And can you also tell me what was the output of the parser using the grammar in my answer?
athena
A: 

I think the usual procedure here is to use lexical states with MORE and either SKIP or SPECIAL_TOKEN. You can see an example of this in the way comments are handled by the Java grammar that comes with the JavaCC source distribution.
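
A minimal sketch of that pattern, loosely modelled on the multi-line comment handling in the Java grammar. The state name IN_COMMENT and the token name CLOSE_COMMENT are made up for illustration, and I'm assuming the rest of your tokens live in the DEFAULT state:

```javacc
// On "/*", start accumulating text and switch to the comment state.
MORE :
{
  "/*" : IN_COMMENT
}

// Inside the comment, "*/" ends it. The longest match wins, so "*/"
// (two characters) is preferred over the single-character ~[] below.
// SKIP discards the whole accumulated text; use SPECIAL_TOKEN instead
// to keep it attached to the next regular token.
<IN_COMMENT>
SKIP :
{
  < CLOSE_COMMENT : "*/" > : DEFAULT
}

// Any other character is simply appended to the text matched so far.
<IN_COMMENT>
MORE :
{
  < ~[] >
}
```

Because tokens such as `<WORD>` or `<USELESS_SPEECH>` are only defined in the DEFAULT state, they are not matched while the lexer is in IN_COMMENT, so "bla" inside the comment should not conflict with a regular token.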

tomcopeland