views:

60

answers:

2

Hi all,

I really have some troubles to cauculate the lookahead when building the LR(1) item sets, i had tried some lecture notes form different sites, but still... My example is

S -> E + S | E
E -> num | ( S )

The item set is

I0:
S’ -> . S       $
S -> . E + S    $
S -> . E        $
E -> . num      +,$
E -> . ( S )    +,$

I1:
S ->E .+ S      $
S ->E .         $

The first item in set I0

S’ -> . S     $

is initialization.

The second item in set I0

S -> . E + S     $

means there is nothing on stack, we expect to read E+S, then reduce iff the token after E+S is $.

The third item in set I0

S -> . E        $

means that we expect to read E and reduce iff the token after E is $.

Then i am confused about the fouth item in set I0,

E -> . num      +,$

I have no ideas why there are + and $ tokens.

and if anyone can explain this for me in plain English please. For each configuration [A –> u•Bv, a] in I, for each production B –> w in G', and for each terminal b in First(va) such that [B –> •w, b] is not in I: add [B –> •w, b] to I.

Thanks!!!

A: 

For

E -> . num      +,$
E -> . ( S )    +,$

the +,$ indicate that only these tokens can follow a number or a closing parenthesis. Think about it: The grammar does noty allow adjacent num's or ()'s, they must either be at the end of the sentence or followed by a +.

As for translation request, it is a fancy way of saying how to calculate the set of tokens that can follow a given token. The +,$ above are an example. They are the only legal tokens that can follow num and ).

Richard Pennington
Thanks Rachard,do you mean that for itemE ->.num +,$ and item E ->.(S) +,$The lookahead tokens are actually FOLLOW(E).if it is, then why the set of items in I1 is only the subset of FOLLOW(S)?Thanks again.
A: 

I think i figured it out. i am using the algorithm of

for set I0:
Begin with [S' -> .S, $]
Match [A -> α.Bβ, a]
Then add in [B -> .γ, b]
Where terminal b is FIRST(βa)

for set I1...In
Compute GOTO(I0,X)
Add in X productions and LOOKAHEAD token

In the example

S -> E + S 
S -> E
E -> num 
E -> ( S )

Firstly,

S’ -> . S       $

we try to match it to [A -> α.Bβ, a], That is A =S', α = ε, B = S , β = ε , a = $ and FIRST(βa) = {$} Add in [B -> .γ, b], which are

S -> . E + S    $                     ...1
S -> . E        $                     ...2

in I0.

Then, we need to add in productions for E as 1 and 2. In this case, our [A -> α.Bβ, a] are 1 and 2. Thus, FIRST(βa) = { + , $ }, and we have

E -> . num      +,$
E -> . ( S )    +,$

Now, we compute GOTO(I0, X) For X = E we move dot one position and found no productions need to be added. So we just add in second component $ from

S -> . E + S    $
S -> . E        $

which gives us I1

S ->E .+ S      $
S ->E .         $

and so on...

So, is this the correct and efficient way when building LR(1) item sets?