I want whitespace, including newlines, to be treated as ordinary whitespace. But I also want to be able to tell when I have hit a newline so I can handle special cases. When I attempt this I get:

cannot distinguish between: Whitespace MyNewLine

Is there a way around this? My grammar is:

! ------------------------------------------------- Sets

{WS}           = {Whitespace} - {CR} - {LF}
{ID Head}      = {Letter} + [_]
{ID Tail}      = {Alphanumeric} + [_]
{String Chars} = {Printable} + {HT} - ["\]

! ------------------------------------------------- Terminals

! The following defines the Whitespace terminal using the {WS}
! set - which excludes the carriage return and line feed 
! characters

Whitespace    = {WS}+ | {CR}{LF} | {CR} | {LF}
!NewLine       = {CR}{LF} | {CR} | {LF}
MyNewLine      = {CR}{LF} | {CR} | {LF}
+3  A: 

I think the grammar is ambiguous in the sense that both Whitespace and MyNewLine match newline characters. Since it throws a wobbly doing it your way, I suggest detecting whitespace and newlines separately and deciding what to do with each newline on a case-by-case basis.
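As a rough illustration of tokenizing the two separately, here is a minimal Python sketch (not GOLD itself). The token names WS, NEWLINE and ID and the tokenize helper are made up for the example. The patterns are ASCII approximations of the sets in the question, with the newline alternatives taken out of the whitespace pattern so the two can never match the same input.

```python
import re

# Hypothetical token names. WS is whitespace without CR/LF,
# NEWLINE is CR LF, a lone CR, or a lone LF, so the two never overlap.
TOKEN_RE = re.compile(r"""
    (?P<NEWLINE>\r\n|\r|\n)
  | (?P<WS>[^\S\r\n]+)
  | (?P<ID>[A-Za-z_][A-Za-z0-9_]*)
""", re.VERBOSE)


def tokenize(text):
    """Yield (kind, text) pairs; raise on input no pattern accepts."""
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        yield m.lastgroup, m.group()
        pos = m.end()
```

The caller can then look at each NEWLINE token and decide, case by case, whether to act on it or skip it like any other whitespace.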

I am not too experienced in this area, but that's what I remember from my Theory of Computation and Compiler Design classes.

I hope this helps.

batbrat
+3  A: 

They are ambiguous because both definitions contain the same alternatives: {CR}{LF} | {CR} | {LF}.

Given the input {CR}{LF}, the tokenizer has no way to tell which terminal it should match.

A table-driven parser isn't really designed to handle "special cases" directly. If you want to ignore newlines in some contexts but attribute meaning to them in others, then you'll have to handle that in your reductions (i.e. tokenize the newlines separately, and discard them in your reductions), but that will get ugly.
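To give a feel for what "tokenize the newlines separately and discard them later" can look like, here is a small hand-written Python stand-in. It is not a table-driven parser. It assumes a token stream of (kind, text) pairs with hypothetical kinds WS, NEWLINE, LPAREN and RPAREN, and an invented rule that a newline ends a statement only outside parentheses.

```python
def split_statements(tokens):
    """Group (kind, text) tokens into statements.

    A NEWLINE ends a statement at nesting depth 0 and is dropped
    inside parentheses; WS tokens are always dropped.
    """
    statements, current, depth = [], [], 0
    for kind, text in tokens:
        if kind == "WS":
            continue
        if kind == "NEWLINE":
            # Meaningful only at the top level; otherwise discarded.
            if depth == 0 and current:
                statements.append(current)
                current = []
            continue
        if kind == "LPAREN":
            depth += 1
        elif kind == "RPAREN":
            depth -= 1
        current.append(text)
    if current:
        statements.append(current)
    return statements
```

In a real grammar the same decision ends up spread across many rules, each of which has to allow an optional newline, which is where the ugliness comes from.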

A (potentially) better solution is to use tokenizer states (possibly controlled from the parser) to change how newline input is tokenized. It's hard to say without fully understanding your grammar. Plus, it's been a few years since I've messed with this stuff.
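Roughly what I mean by tokenizer states, as a Python sketch rather than anything a particular parser generator provides. The Lexer class, the newlines_significant flag and the token kinds are all hypothetical names for illustration.

```python
import re

NEWLINE_RE = re.compile(r"\r\n|\r|\n")
WS_RE = re.compile(r"[^\S\r\n]+")   # whitespace other than CR/LF
WORD_RE = re.compile(r"\S+")        # any run of non-whitespace


class Lexer:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        # State the parser may flip between calls to next_token().
        self.newlines_significant = True

    def next_token(self):
        """Return (kind, text), or None at end of input."""
        if self.pos >= len(self.text):
            return None
        rules = (("NEWLINE", NEWLINE_RE), ("WS", WS_RE), ("WORD", WORD_RE))
        for kind, pattern in rules:
            m = pattern.match(self.text, self.pos)
            if m:
                self.pos = m.end()
                # In the "insignificant" state a newline is reported
                # as plain whitespace.
                if kind == "NEWLINE" and not self.newlines_significant:
                    kind = "WS"
                return kind, m.group()
```

The parser sets newlines_significant before asking for the next token, so the same {CR}{LF} input comes back as NEWLINE in one context and as WS in another, and the two terminals never compete for it.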

Brannon