ansaurus

Question

Answer 1

A:

This won't help you understand where you're going wrong, but I'd suggest looking into using sepBy1 to parse types separated by -> symbols. This will give you a list of parsed types, which you can then turn back into function types afterward.
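A minimal sketch of the sepBy1 idea, assuming Parsec 3 module names. The names atomP, typeP and parseType are illustrative and do not come from the question, and rendering the result as a parenthesised string is only one way to "turn the list back into function types". Note that each operand starts with a literal (an identifier character or an opening parenthesis), so typeP never calls itself without consuming input first.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Identifier characters: letters, digits, '.' and '`'.
identP :: Parser String
identP = many1 (digit <|> letter <|> char '.' <|> char '`')

-- One operand of "->": a parenthesised type or a bare identifier.
atomP :: Parser String
atomP = between (char '(') (char ')') typeP <|> identP

-- One or more operands separated by "->", folded to the right
-- (assumption: arrows are right associative).
typeP :: Parser String
typeP = foldr1 arrow <$> (atomP `sepBy1` string "->")
  where
    arrow a b = "(" ++ a ++ " -> " ++ b ++ ")"

-- Hypothetical entry point: parse a whole string as a type.
parseType :: String -> Either ParseError String
parseType = parse (typeP <* eof) "F# type syntax"
```

With these definitions, parseType "int->int->int" should give Right "(int -> (int -> int))" and parseType "(int->int)->int" should give Right "((int -> int) -> int)".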

kvb 2010-03-08 19:55:20

Yeah, I figured I would do that eventually, but since sepBy1 probably involves the same recursions I'd have to write manually, I thought I'd start with the simpler grammar.

Nathan Sanders 2010-03-09 15:47:48

@Nathan - Yes, even if you use sepBy1, you'll still need to use something like Christopher's approach to break the recursion.

kvb 2010-03-09 16:09:34

Answer 2

+3 A:

I think the problem is that arrows are right associative. That is an assumption on my part, since I don't know F#, and I'm not sure how precise the linked grammar is supposed to be, as I'm not well versed in different grammars. But if we can assume arrows are right associative, the problem gets easier.

So with that assumption we can trivially do:

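import Text.ParserCombinators.Parsec

-- An identifier: one or more letters, digits, dots or backticks.
identP :: Parser String
identP = many1 (digit <|> letter <|> char '.' <|> char '`')

-- try lets the parser back up to identP when no arrow follows.
typeP :: Parser String
typeP = try arrowP <|> identP

arrowP :: Parser String
arrowP = do
  i <- identP
  string "->"
  t <- typeP
  return $ "(" ++ i ++ " -> " ++ t ++ ")"

run :: String -> Either ParseError String
run = flip parse "F# type syntax" $ do
        t <- typeP
        eof
        return t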

So:

Haskell> run "int"
Right "int"
Haskell> run "int->int"
Right "(int -> int)"
Haskell> run "int->int->int->int"
Right "(int -> (int -> (int -> int)))"

Expanding further, what might be confusing you is that the grammar says type -> type, which means you could have an arrow on the left side. That's fine, but it needs to be in parentheses. Seeing the following in action may help; it helped me.

typeP = try arrowP <|> parens typeP <|> identP

arrowP = do
 i <- parens typeP <|> identP
 string "->"
 t <- typeP
 return $ "(" ++ i ++ " -> " ++ t ++ ")"

parens p  = between (char '(') (char ')') p

Now we can write arrows on the left or the right side of an arrow:

Haskell> run "int->int->int"
Right "(int -> (int -> int))"
Haskell> run "(int->int)->int"
Right "((int -> int) -> int)"

Christopher Done 2010-03-08 19:58:39

Good explanation. As you note, the root of the problem is that you need to break the cycle where typeP can descend into arrowP, which itself can descend into typeP. I think that your `parens` example is particularly illuminating.

kvb 2010-03-08 20:41:07

So Parsec grammars have basically the same non-compositional problem that LR(1) grammars do, in that you have to plan your entire grammar such that the left edge of every rule eventually rewrites to an unambiguous literal. Oh well, I guess I should have known better than to assume Parsec was magic.

Nathan Sanders 2010-03-09 15:46:22

Answer 3

+2 A:

I think you should factor the left recursion out of the grammar. Instead of

type ::= identifier | type -> type 
identifier ::= [A-Za-z0-9.`]+

you get something like

typeStart ::= identifier 
type ::= typeStart (-> type)?
identifier ::= [A-Za-z0-9.`]+

Then this will be easier to translate directly into Parsec, I think. (One would think that try would work, and I expect it somehow does, but yes, my experience also was that I had to be at least waist-deep in Parsec before I ever understood "where to put try" to make things work.)
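As a rough illustration, the factored grammar could be translated rule by rule as sketched below. This assumes Parsec 3 module names; typeP stands in for the type rule (type is a reserved word in Haskell), parseFactored is a hypothetical helper, and the parenthesised string output is just one possible rendering. Because typeStart always consumes an identifier before the optional arrow is tried, this version needs no try.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- identifier ::= [A-Za-z0-9.`]+
identifier :: Parser String
identifier = many1 (letter <|> digit <|> oneOf ".`")

-- typeStart ::= identifier
typeStart :: Parser String
typeStart = identifier

-- type ::= typeStart (-> type)?
typeP :: Parser String
typeP = do
  start <- typeStart
  rest  <- optionMaybe (string "->" *> typeP)
  return $ case rest of
    Nothing -> start
    Just t  -> "(" ++ start ++ " -> " ++ t ++ ")"

-- Hypothetical entry point: parse a whole string as a type.
parseFactored :: String -> Either ParseError String
parseFactored = parse (typeP <* eof) "F# type syntax"
```

Here parseFactored "int->int->int" should give Right "(int -> (int -> int))". Parenthesised types are not covered, since the factored grammar above does not include them.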

See also Monadic Parser Combinators in F# (as well as the 7 prior C# blog entries) for some basics. I think the Parsec docs (try just reading them top down; they are decent, if I recall correctly) as well as some of the examples in the various research papers talk about issues like the one in your question.

Brian 2010-03-09 00:42:56

You're right, the research papers are probably the best bet for answering this definitively. I've [implemented toy parser combinators](http://sandersn.com/blog//index.php/2009/07/05/monadic_parsing_in_c_part_5), I just haven't written many grammars with them. I assumed that Parsec would magically be smarter than the hokey examples I wrote in Python and C#.

Nathan Sanders 2010-03-09 15:58:24


Parsec: backtracking not working
