What's the difference between this grammar:

...
if_statement : 'if' condition 'then' statement 'else' statement 'end_if';
... 

and this:

...
if_statement : IF condition THEN statement ELSE statement END_IF;
...

IF : 'if';
THEN: 'then';
ELSE: 'else';
END_IF: 'end_if';
....

?

If there is a difference, does it affect performance? Thanks.

+1  A: 

The only difference is that in your first production rule, the keyword tokens are defined implicitly. There is no run-time performance implication for tokens defined implicitly vs. explicitly.

Will
+2  A: 

In addition to Will's answer: it's best to define your lexer tokens explicitly (in your lexer grammar). If you mix literal tokens into your parser grammar, it's not always clear in which order the lexer will try them. When you define them explicitly, they take priority in the order they appear in the lexer grammar (from top to bottom).
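To make the ordering point concrete, here is a minimal sketch of the grammar from the question with every token declared explicitly. It assumes ANTLR 4 syntax (the `-> skip` command is ANTLR 4 only); the grammar name `IfDemo` and the `condition`, `statement`, `ID`, `SEMI` and `WS` rules are placeholders added for illustration and are not part of the original question.

```antlr
grammar IfDemo;   // hypothetical grammar name

// Parser rules refer to tokens by name only; no quoted literals here.
if_statement : IF condition THEN statement ELSE statement END_IF ;
condition    : ID ;          // placeholder rule, assumed for the example
statement    : ID SEMI ;     // placeholder rule, assumed for the example

// Lexer rules, read top to bottom. The input "if" matches both IF and ID
// with the same length, so the rule listed first (IF) wins.
IF     : 'if' ;
THEN   : 'then' ;
ELSE   : 'else' ;
END_IF : 'end_if' ;
SEMI   : ';' ;

// ID comes after the keywords; a longer input such as "iffy" still
// becomes an ID because the longest match is preferred.
ID     : ('a'..'z' | 'A'..'Z' | '_')+ ;
WS     : (' ' | '\t' | '\r' | '\n')+ -> skip ;
```

If `ID` were moved above the keyword rules, the keywords would most likely be tokenized as `ID`, which is the kind of surprise that explicit, ordered lexer rules make easy to spot.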

Bart Kiers
A: 

The biggest difference is one that may not matter to you. If your lexer rules are in the lexer grammar, then you can use inheritance to have multiple lexers share a common set of lexical rules.

If you just use strings in your parser rules, then you cannot do this. If you never plan to reuse your lexer grammar, then this advantage doesn't matter.
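As a rough sketch of what sharing lexical rules can look like, assuming the inheritance meant here is ANTLR's grammar `import` and assuming ANTLR 4 syntax: the file and grammar names (`CommonLexer`, `IfDemo`) and the `ID`, `WS`, `condition` and `statement` rules are hypothetical and only there to make the example complete.

```antlr
// File CommonLexer.g4 (hypothetical name): lexer rules shared by several grammars.
lexer grammar CommonLexer;

IF     : 'if' ;
THEN   : 'then' ;
ELSE   : 'else' ;
END_IF : 'end_if' ;

// Kept in the same file, after the keywords, so the keywords take priority.
ID     : [a-zA-Z_]+ ;
WS     : [ \t\r\n]+ -> skip ;
```

```antlr
// File IfDemo.g4 (hypothetical name): reuses the shared lexer rules.
grammar IfDemo;
import CommonLexer;

if_statement : IF condition THEN statement ELSE statement END_IF ;
condition    : ID ;   // placeholder rule, assumed for the example
statement    : ID ;   // placeholder rule, assumed for the example
```

Note that `ID` lives in the shared lexer grammar rather than in `IfDemo`: rules in the importing grammar take precedence over imported ones, so an `ID` rule declared in `IfDemo` could end up shadowing the imported keywords.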

Additionally, I (and, I'm guessing, most ANTLR veterans) am more accustomed to finding the lexer rules in the actual lexer grammar rather than mixed in with the parser grammar, so one could argue that readability improves when the rules are put in the lexer.

Once the ANTLR parser has been generated, neither approach has any runtime performance impact.

chollida