tags:

views:

454

answers:

3

I've recently started to play around with grammatical analyzers using javacc and one of the fields is the options one...I have a code like the folowing:

options
{
  LOOKAHEAD=1;
}
PARSER_BEGIN(Calculator)

public class Calculator
{
 ...
}
PARSER_END(Calculator)

What exactly does it mean the LOOKAHEAD option? Thanks

+1  A: 

See http://en.wikipedia.org/wiki/Lookahead#Lookahead_in_parsing

Normally, the parser only looks at the next token to determine what production rule to apply. However, in some cases that is not enough to make the choice. For example, given two production rules:

p0: foo -> identifier "=" expr
p1: bar -> identifier "(" arglist ")"

If the next token is of type identifier then the parser cannot determine if it should use the foo or bar production. JavaCC will then give an error, saying it needs to use more look-ahead. Changing the look-ahead to 2 means the parser is allowed to look at the next two tokens, which in this case is sufficient to choose between the productions.

As Steve pointed out, this is in the javacc docs: https://javacc.dev.java.net/doc/lookahead.html

Hans W
A: 

The LOOKAHEAD value tells the generated parser how many unprocessed (i.e., future) tokens to use to decide what state to transition to. In a tightly-constrained language, only one lookahead token is necessary. The more ambiguous a language is, the more lookahead tokens are needed to determine which state transition to make.

I think this is covered in the javacc(1) tutorial.

Steve Emmerson
A: 

JavaCC create recursive decent parsers. This type of parsers work by looking at the next symbol to decide what rule to choose. By default they look at only the next symbol (lookahead=1). But you can configure the parser to look not only on the next but look at the next N symbols. If you set lookahead to 2 the generated parser will look at the next two symbols to decide what rule to choose. This way you can define your grammer mor natural (but at the cost of performance, the bigger the lookahead, the more the parser willl have to do).

If you set the general lookahead to a bigger number your parser will be slower for all inputs (for non trivial grammars). You can use lookahead local if you want to let the parser with lookahead=1 by default and use a bigger lookahead only on specific situtations.

http://www.engr.mun.ca/~theo/JavaCC-FAQ/javacc-faq-moz.htm#tth_sEc4.5

For example a parser with lookahead=1 can't decide which of the rules (1 or 2) to take, with lookahead=2 it can:

void rule0() : {} { 
  <ID> rule1() 
| <ID> rule2()
}

You can change the difinition of the grammar to get the same result but use lookahead=1:

void rule0() : {} { 
  <ID> ( rule1() | rule2() )
}
Arne