tags:

views:

303

answers:

4

I have a list that I need to parse where the all but the last element needs to be parsed by one parser, and the last element needs to be parsed by another parser.

a = "p1 p1b ... p2"
or
a = "p2"

Originally I tried

parser = do parse1 <- many parser1
            parse2 <- parser2
            return AParse parse1 parse2

The problem is that parse1 can consume a parse2 input. So parse1 always consumes the entire list, and leave parse2 with nothing.

Is there a way to say to apply parse1 to everything besides the last element in a string, and then apply parse2?

+2  A: 

If you can factor parser1 so that is defined like so:

parser1 = (try parser2) <|> parser1extra

Then the problem becomes a list of parser1extra or parser2 that must end in the later. You can code that as:

parserList =
    liftM2 (:) (try parser1extra) parserList
    <|>
    liftM2 (:) (try parser2) (option [] parserList)

You may or may not need the try calls depending on if those parsers have any prefix overlap.

If you don't want the return value to be a list, but instead your AParse datum, then you could re-write it this way:

parserList =
    do
        a <- try parser1extra
        prefix a parserList
    <|>
    do
        a <- try parser2
        option (AParse [] a) (prefix a parserList)

    where prefix a p = do
            (AParse as t) <- p
            return $ (AParse (a:as) t)

Or, a full example:

import Control.Monad
import Text.ParserCombinators.Parsec

parseNum = do { v <- many1 digit; spaces; return v }
parseWord = do { v <- many1 letter; spaces; return v }
parsePart = parseNum <|> parseWord

parsePartListEndingInWord =
    liftM2 (:) (try parseNum) parsePartListEndingInWord
    <|>
    liftM2 (:) (try parseWord) (option [] parsePartListEndingInWord)

Actually, the calls to try aren't needed in this case, as parseNum and parseWord match no common prefix. Notice that parsePartListEndingInWord doesn't actually reference parsePart, but instead, the two options that make up parsePart's definition


(Original answer, solving a somewhat different situation:)

How about something like:

parserTest = between (char '[') (char ']') $ do
    p1s <- try parser1 `endBy` char ',' 
    p2 <- parser2
    return $ AParse p1s p2

Taking the punctuation out of your parsers and up into parseTest allows you to use the combinators between and endBy to do the work for you. Lastly, the try is there so that if parser1 and parser2 match a common prefix, endBy will perform the correct full backup to beginning of the common prefix.

Depending on your parsers, it is possible that you can leave the punctuation matching inside your sub-parsers, and all you need might be the a try around parser1:

parseTest = do parse1 <- many (try parser1)
               parse2 <- parser2
               return AParse parse1 parse2
MtnViewMark
I made a mistake in the question and stated that I had a list of items. I should of said that I had a string of items. Sorry about that. I made the correction in the question.The second example in the code won't work because parser1 will consume the whole string.
Chris
Thanks for the code.Wouldn't parser1extra just consume all of the string?
Chris
The idea was that parser1extra would only parse those things that should be in parser1, but don't match parser2. Hence, parser1extra only matches where parser2 doesn't.
MtnViewMark
+2  A: 

How about:

parseTrain car caboose = choice
    [ fmap (:[]) $ try (caboose `endBy` eof), 
    , liftM2 (:) car (parseTrain car caboose) 
    [

The eof bugs me, since that makes this parser not compositional. I.e. you couldn't say:

char '(' >> parseTrain p1 p2 >> char ')'

Doing this compsitionally is very hard for a parser. How is it supposed to know to move on to char ')', without trying to at every opportunity and seeing if it fails? Doing so could exponential time.

If you need it to be compositional, does your problem have some additional structure you can exploit? Can you, for example, parse a list of all elements and then process the last one after the fact?

luqui
A: 

I kind of combined the two approaches:

parserList = try (do a <- parser2
                     eof
                     return $ AParse [] a)
             <|>
             do a <- parser1
                prefix a parserList
             where
                prefix a p = do
                    (AParse as t) <- p
                    return $ AParse a:as t

I think that this will work for my purposes. Thanks!

Chris
A: 

This will do the trick:

parser1 `manyTill` (try parser2)
ADEpt