views:

76

answers:

1

I'm using Text.ParserCombinators.Parsec and Text.XHtml to parse an input like this:

 
- First type A\n
-- First type B\n
- Second type A\n
-- First type B\n
--Second type B\n

And my output should be:

 
 
<h1>1 First type A\n</h1>
<h2>1.1 First type B\n</h2>
<h1>2 Second type A\n</h2>
<h2>2.1 First type B\n</h2>
<h2>2.2 Second type B\n</h2>
 

I have come to this part, but I cannot get any further:

 
 
title1= do{     
                ;(count 1 (char '-'))
                ;s <- many1 anyChar newline
                ;return (h1 << s)
    }

title2= do{     
                ;(count 2 (char '--'))
                ;s <- many1 anyChar newline
                ;return (h1 << s)
    }

text=do {
        ;many (choice [try(title1),try(title2)])
 }

main :: IO ()
main = do t putStr "Error: " >> print err
            Right x  -> putStrLn $ prettyHtml x

 

This is ok, but it does not include the numbering.

Any ideas?

Thanks!

+3  A: 

You probably want to use GenParser with a state containing the current section numbers as a list in reverse order, so section 1.2.3 will be represented as [3,2,1], and maybe the length of the list to avoid repeatedly counting it. Something like

data SectionState = SectionState {nums :: [Int], depth :: Int}

Then make your parser actions return type be "GenParser Char SectionState a". You can access the current state in your parser actions using "getState" and "setState". When you get a series of "-" at the start of a line count them and compare it with "depth" in the state, manipulate the "nums" list appropriately, and then emit "nums" in reverse order (I suggest keeping nums in reverse order because most of the time you want to access the least significant item, so putting it at the head of the list is both easier and more efficient).

See Text.ParserCombinators.Parsec.Prim for details of GenParser. The more usual Parser type is just "type Parser a = GenParser Char () a" You probably want to say

type MyParser a = GenParser Char SectionState a

somewhere near the start of your code.

Paul Johnson