I'm writing a generator of LL(1) parsers whose input is a CoCo/R language specification. I already have a scanner generator for that input. Suppose I have the following specification:

COMPILER 1

CHARACTERS

digit="0123456789".

TOKENS
number = digit{digit}. 
decnumber = digit{digit}"."digit{digit}.

PRODUCTIONS

Expression = Term{"+"Term|"-"Term}.      
Term = Factor{"*"Factor|"/"Factor}.       
Factor = ["-"](Number|"("Expression")").
Number = (number|decnumber).

END 1.

So, if the parser generated from this grammar receives the input "1+1", it should be accepted, i.e. a parse tree should be found.

My question is: the character "+" was never defined as a token, but it appears in the production for the nonterminal "Expression". How should my generated scanner recognize it? As it stands, it would not recognize it as a token.

Is this a valid input, then? Or should I add this terminal to TOKENS and give the scanner an error routine that skips anything it does not recognize?

How do language specifications usually handle this?

A: 

Anything on the RHS of a grammar rule (that's not part of the grammar notation itself) must be either a nonterminal symbol or a terminal symbol (synonymous with "token"). So yes, your operators have to be tokens. Looking at the CoCo/R documentation, it seems that it accepts literal strings as terminal symbols in the PRODUCTIONS section, so your input is valid as written and you may not have to add anything to TOKENS. The parser generator should treat those literals as tokens itself.
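
One way a generator could do this is sketched below, as an illustration only and not as a description of how CoCo/R is actually implemented. It collects every quoted literal from the PRODUCTIONS text at generator time and adds each one to the scanner's token table next to the named tokens. The function names (`collect_literals`, `build_scanner`) and the regular expressions standing in for `number` and `decnumber` are assumptions made for the example. The sketch also relies on listing `decnumber` before `number` instead of implementing a true longest-match rule.

```python
import re

# The PRODUCTIONS section from the question, as plain text.
PRODUCTIONS = '''
Expression = Term{"+"Term|"-"Term}.
Term = Factor{"*"Factor|"/"Factor}.
Factor = ["-"](Number|"("Expression")").
Number = (number|decnumber).
'''

# Assumption: hand-translated regexes for the TOKENS section.
# decnumber comes first so "2.5" is not split into number "2" plus junk.
NAMED_TOKENS = [("decnumber", r"\d+\.\d+"), ("number", r"\d+")]


def collect_literals(productions):
    """Return each distinct quoted literal, in order of first appearance."""
    seen = []
    for lit in re.findall(r'"([^"]*)"', productions):
        if lit not in seen:
            seen.append(lit)
    return seen


def build_scanner(named_tokens, literals):
    """Build a scan(text) function from named tokens plus literal tokens."""
    kinds = {}   # regex group name -> token kind reported to the parser
    parts = []
    for i, (name, pattern) in enumerate(named_tokens):
        kinds[f"t{i}"] = name
        parts.append(f"(?P<t{i}>{pattern})")
    # Longer literals first, so a literal like "<=" would win over "<".
    for i, lit in enumerate(sorted(literals, key=len, reverse=True)):
        kinds[f"l{i}"] = f'"{lit}"'
        parts.append(f"(?P<l{i}>{re.escape(lit)})")
    master = re.compile("|".join(parts))

    def scan(text):
        pos, tokens = 0, []
        while pos < len(text):
            if text[pos].isspace():      # skip whitespace between tokens
                pos += 1
                continue
            m = master.match(text, pos)
            if m is None:
                raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
            tokens.append((kinds[m.lastgroup], m.group()))
            pos = m.end()
        return tokens

    return scan


literals = collect_literals(PRODUCTIONS)
scan = build_scanner(NAMED_TOKENS, literals)
print(scan("1+1"))
```

With this arrangement "+" becomes an anonymous token whose kind is the literal itself, so the generated parser can compare against it exactly as it would against `number`. A character that appears neither in TOKENS nor as a literal in PRODUCTIONS is still reported as a scanner error.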

Jim Lewis
Do you mean making them tokens in the input specification, or having the generator create them in the background at run time? Which is more common in a parser generator? Thank you.
kmels
@kmels: Well, one way or another, the terminals (tokens) must be defined at parser-generator time, not at run time. This can probably be done at either the scanner level or the parser level. Looking at the CoCo/R documentation, it seems that writing terminals as literal strings in the PRODUCTIONS section, as you have it, is supported. This is also true of YACC and tools derived from it, so that may well be the most common approach.
Jim Lewis
Generator time is what I meant indeed. Thank you, Jim.
kmels