Hi,

I'm doing Cay Horstmann's combinator parser exercises, and I wonder what the best way is to distinguish between strings that represent numbers and strings that represent variables in a match statement:

def factor: Parser[ExprTree] = (wholeNumber | "(" ~ expr ~ ")" | ident) ^^ {
    case a: wholeNumber  => Number(a.toInt)
    case a: String => Variable(a)
}

The second line there, "case a: wholeNumber" is not legal. I thought about a regexp, but haven't found a way to get it to work with "case".

+5  A: 

I would split it up a bit and push the case analysis into the |. This is one of the advantages of combinators and really LL(*) parsing in general:

def factor: Parser[ExprTree] = ( wholeNumber ^^ { Number(_.toInt) }
                               | "(" ~> expr <~ ")" 
                               | ident ^^ { Variable(_) } )

I apologize if you're not familiar with the underscore syntax. Basically, each underscore stands for the next parameter of an anonymous function built from the enclosing expression. Thus { Variable(_) } is equivalent to { x => Variable(x) }.
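A small standalone sketch of the placeholder expansion, in case it helps. The ExprTree, Number and Variable definitions here are hypothetical stand-ins for the ones in the exercise, and nothing in it depends on the parser library. Note that the underscore expands at the innermost enclosing expression, so a nested use such as Number(_.toInt) means Number(x => x.toInt) rather than x => Number(x.toInt).

```scala
// Hypothetical stand-ins for the exercise's tree classes.
sealed trait ExprTree
case class Number(value: Int) extends ExprTree
case class Variable(name: String) extends ExprTree

object PlaceholderDemo {
  // These two definitions mean the same thing:
  // `Variable(_)` expands to `x => Variable(x)`.
  val mkVar1: String => ExprTree = Variable(_)
  val mkVar2: String => ExprTree = x => Variable(x)

  // `Number(_.toInt)` would expand to `Number(x => x.toInt)`,
  // i.e. it tries to pass a function to Number, which does not compile.
  // Naming the parameter explicitly avoids the problem.
  val mkNum: String => ExprTree = s => Number(s.toInt)
}
```

Calling PlaceholderDemo.mkVar1("x") and PlaceholderDemo.mkVar2("x") should both give Variable("x").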

Another bit of syntax magic here is the ~> and <~ operators in place of ~. Both still parse the terms on either side in sequence, but ~> keeps only the result of its right-hand parser and <~ keeps only the result of its left-hand parser. Thus "(" ~> expr <~ ")" matches exactly the same input as "(" ~ expr ~ ")", but its result is just the result of expr, so no extra case analysis is needed to retrieve the inner value.
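Here is a minimal self-contained sketch of the whole thing, under a few assumptions: the scala-parser-combinators library is on the classpath, the parser extends JavaTokenParsers (which provides wholeNumber and ident), and expr is just a placeholder that delegates to factor, since the real expr from the exercise is not shown here. The tree classes and the parseFactor helper are hypothetical. The functions after ^^ use named parameters instead of nested underscores.

```scala
import scala.util.parsing.combinator.JavaTokenParsers

// Hypothetical stand-ins for the exercise's tree classes.
sealed trait ExprTree
case class Number(value: Int) extends ExprTree
case class Variable(name: String) extends ExprTree

object FactorParser extends JavaTokenParsers {
  // Placeholder only: the real expr would also handle operators.
  def expr: Parser[ExprTree] = factor

  // Each alternative builds its own tree node, so no match on the
  // combined result is needed afterwards.
  def factor: Parser[ExprTree] =
    ( wholeNumber ^^ { s => Number(s.toInt) }
    | "(" ~> expr <~ ")"              // parens are parsed, only expr's result is kept
    | ident ^^ { s => Variable(s) } )

  // Hypothetical helper: parse a complete input string as a factor.
  def parseFactor(input: String): ParseResult[ExprTree] =
    parseAll(factor, input)
}
```

With this, FactorParser.parseFactor("(42)") should succeed with Number(42), and FactorParser.parseFactor("x") with Variable("x").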

Daniel Spiewak
Excellent! Had to change {Number(_.toInt)} to {x:String => Number(x)} since I got "error: missing parameter type for expanded function", then it worked like a charm. Still curious if there is a case class way of solving it though.
Lars Westergren
Well, actually case just defines a partial function. It lets you do pattern matching on the input, which is really why it's useful. I could just as easily have written my answer using partial functions (case) instead, it just wasn't necessary. :-) (it would have been except for ~> and <~)
Daniel Spiewak
If you mean matching on the whole term though, then I think the answer is "no, there is no way to do it". Unless the `wholeNumber` method returns a Parser with a different component type than String, there's really no way to differentiate it from `ident` or even "(" ~> expr <~ ")".
Daniel Spiewak
Just realized that I wasn't particularly clear... Function literals written with case behave just like plain-old functions here; they simply allow pattern matching on the input. Thus: { x => x.toInt } is the same as { case x => x.toInt }.
Daniel Spiewak
I have solved the original problem thanks to Daniel, but I was still curious about using pattern matching with Regexps, and from reading the book and googling it seems there is no way to do that.
Lars Westergren
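On the regexp question, one approach that might work after all: a scala.util.matching.Regex can be used as an extractor in a case clause, where the capture groups become the bound variables and the whole string has to match. The sketch below is only an illustration on plain strings. The tree classes, the classify function and the simplified Ident pattern are assumptions, not part of the exercise. In the original factor, the parenthesized alternative produces a ~ value rather than a String, so it would still need its own case.

```scala
// Hypothetical stand-ins for the exercise's tree classes.
sealed trait ExprTree
case class Number(value: Int) extends ExprTree
case class Variable(name: String) extends ExprTree

object RegexMatchDemo {
  // Each regex has one capture group, so each extractor binds one name.
  val WholeNumber = """(-?\d+)""".r
  val Ident = """([A-Za-z_]\w*)""".r // simplified identifier pattern (assumption)

  // Decide what a token is by trying the regex extractors in order.
  def classify(token: String): Option[ExprTree] = token match {
    case WholeNumber(digits) => Some(Number(digits.toInt))
    case Ident(name)         => Some(Variable(name))
    case _                   => None
  }
}
```

For example, RegexMatchDemo.classify("42") should give Some(Number(42)), classify("x1") should give Some(Variable("x1")), and classify("(") should give None.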