I am trying to use Parsec to parse something like this:

property :: CharParser SomeObject
property = do
    name
    parameters
    value
    return SomeObjectInstance { fill in records here }

I am implementing the iCalendar spec, and on every line there is a name:parameters:value triplet, much like XML's name:attributes:content triplet. In fact, you could very easily convert an iCalendar file into XML (though I can't really see the advantage).

My point is that the parameters do not have to come in any particular order, and each parameter may have a different type. One parameter may be a string while another is the numeric id of another element. They may share no similarity, yet in the end I want to place each one in the right record field of whatever 'SomeObjectInstance' I want the parser to return. How do I go about doing this sort of thing (or can you point me to an example where somebody had to parse data like this)?

Thank you. I know that my question is probably a little confused, but that reflects my level of understanding of what I need to do.

Edit: I was trying to avoid giving the expected output (because it is large, not because it is hidden), but here is an example of an input file (from Wikipedia):

BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//hacksw/handcal//NONSGML v1.0//EN
BEGIN:VEVENT
UID:[email protected]
DTSTAMP:19970714T170000Z
ORGANIZER;CN=John Doe:MAILTO:[email protected]
DTSTART:19970714T170000Z
DTEND:19970715T035959Z
SUMMARY:Bastille Day Party
END:VEVENT
END:VCALENDAR

As you can see, it contains one VEvent inside a VCalendar. I have made data structures that represent them here.

I am trying to write a parser that parses that type of file into my data structures, and I am stuck on the bit where I need to handle properties coming in any order with any type: date, time, int, string, uid, etc. I hope that makes more sense without repeating the entire iCalendar spec.

+1  A: 

Ok, so between BEGIN:VEVENT and END:VEVENT you have many key-value pairs. So write a rule keyValuePair that returns (key, value). Inside the rule for VEVENT, run many keyValuePair to get a list of pairs. Once you have that list, use a fold to populate a VEvent record with the given values. In the function you give to the fold, use pattern matching to decide which field each value is stored in. As the starting value for the accumulator, use a VEvent record in which the optional fields are set to Nothing. Example:

pairs <- many keyValuePair
let vevent = foldr f (VEvent {sequence = Nothing}) pairs
        where f ("SUMMARY", v) ve = ve {summary = v}
              f ("DSTART", v) ve = ve {dstart = read v}

...and so on. Do the same for the other components.
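
For the keyValuePair rule itself, something along these lines might work as a starting point. This is only a sketch under assumptions: the name veventLines is made up for illustration, parameters such as ";CN=John Doe" are simply skipped, and quoted parameter values, line folding, and nested components are not handled.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Parse one content line such as "SUMMARY:Bastille Day Party"
-- into (key, value). Parameters are skipped in this sketch.
keyValuePair :: Parser (String, String)
keyValuePair = do
  key <- many1 (upper <|> digit <|> char '-')
  _   <- many (char ';' >> many1 (noneOf ";:\r\n"))  -- discard parameters
  _   <- char ':'
  val <- manyTill anyChar endOfLine                   -- value runs to end of line
  return (key, val)

-- Hypothetical helper: collect every pair between BEGIN:VEVENT and END:VEVENT.
veventLines :: Parser [(String, String)]
veventLines = do
  _ <- string "BEGIN:VEVENT" >> endOfLine
  manyTill keyValuePair (try (string "END:VEVENT" >> endOfLine))
```

The list returned by veventLines is what the fold described above would consume.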

Edit: Here's some runnable example code for the fold:

data VEvent = VEvent {
        summary :: String,
        dstart :: String,
        sequenceSt :: Maybe String
        } deriving Show

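-- Fold the (key, value) pairs into a record; optional fields start as Nothing.
vevent pairs = foldr f (VEvent {sequenceSt = Nothing}) pairs
    where f ("SUMMARY", v) ve = ve {summary = v}
          f ("DSTART", v) ve = ve {dstart = v}
          f ("SEQUENCEST", v) ve = ve {sequenceSt = Just v}
          f _ ve = ve  -- ignore keys that have no matching field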

main = do print $ vevent [("SUMMARY", "lala"), ("DSTART", "lulu")]
          print $ vevent [("SUMMARY", "lala"), ("DSTART", "lulu"), ("SEQUENCEST", "lili")]

Output:

VEvent {summary = "lala", dstart = "lulu", sequenceSt = Nothing}
VEvent {summary = "lala", dstart = "lulu", sequenceSt = Just "lili"}

Note that this will produce a warning when compiled. To avoid the warning, initialize all non-optional fields to undefined explicitly.
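
If leaving required fields undefined feels uncomfortable, one possible alternative (a different technique from the fold above, shown only as a sketch) is to build the record in a single step from the list of pairs. The function name mkVEvent is an assumption for illustration; the keys and the record are the same as in the example above. A missing required key then gives Nothing for the whole record instead of a field that blows up when it is read.

```haskell
data VEvent = VEvent
  { summary    :: String
  , dstart     :: String
  , sequenceSt :: Maybe String
  } deriving (Show, Eq)

-- Build the record in one go from the collected (key, value) pairs.
-- Required fields use 'lookup' in the Maybe applicative, so a missing
-- key makes the whole result Nothing. The optional field keeps the
-- Maybe that 'lookup' returns.
mkVEvent :: [(String, String)] -> Maybe VEvent
mkVEvent pairs =
  VEvent <$> lookup "SUMMARY" pairs
         <*> lookup "DSTART" pairs
         <*> pure (lookup "SEQUENCEST" pairs)
```

The trade-off is that the order of fields in the constructor has to match the order of the lookups, and each key is searched for separately rather than in one pass.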

sepp2k
This is similar to what I thought of at first, but how do you decide what 'zero' is? You gave the 'zero' as 'VEvent {sequence = Nothing}', but what if the structure is more complicated, like one of the ones that I made? What if it has a bunch of integer fields that have to be set and you don't want to make every record field a Maybe? Can you build a record piece by piece when it has multiple required fields that you do not supply at the start? (Let me know if I need to describe this new question better and I'll try to provide a code example.)
Robert Massaioli
@Robert: As I said, initialize the optional ones to `Nothing`. The other ones you don't need to initialize at all. If there were no optional ones, `VEvent {}` would be a perfectly fine starting value (or you could explicitly set every non-optional value to undefined to avoid getting a warning, but I suppose that would get tiresome with a lot of fields).
sepp2k
This fails for me though? http://pastie.org/1157529
Robert Massaioli
@Robert: Since you made the type of `dstart` String, you need to remove the `read`. Also if you want to see the result, you need to add `deriving Show` to the declaration of VEvent. Once I performed those changes, your pastie worked as expected in ghci.
sepp2k
Oh, I just read your comment again and setting everything to undefined works. And the spec says that leaving fields blank should set them to undefined but it does not for me. Would you know why that is?
Robert Massaioli
@Robert: If the example I just edited in does not work for you, something strange is going on. If it does work, you must have done something different when you tested it.
sepp2k
@Robert: Note that not initializing all fields will not work if you compile with `-Werror`, so in that case you need to explicitly spell out `undefined` for all non-optional fields.
sepp2k
Something weird is going on. :O I copied and pasted your code exactly into vim and then loaded it with ghci; look what I got: http://pastie.org/1157548
Robert Massaioli
Oh, the ghc thing is only a warning... I'm a clown... It makes sense, because setting the fields to undefined does work. Thank you so much for your time. :D Problem solved and marking as answer.
Robert Massaioli
@Robert: The warning is normal. Look at the bottom line: "Ok, modules loaded: Main.". So you can use the code just fine. As I said: if you want to avoid the warning, you need to spell out undefined for all non-optional fields.
sepp2k
Sorry to change the selected answer, but I have just looked at the way that Stephen Tetley suggested, and that way is even better than what we were doing. Thank you for all the help.
Robert Massaioli
+1  A: 

Here a similar problem is solved: http://zeroindexed.com/parameterizing-haskell-programs

Roman Cheplyaka
+6  A: 

Parsec has the Parsec.Perm module precisely to parse unordered but linear (i.e. at the same level in the syntax tree) elements such as attribute tags in XML files.

Unfortunately the Perm module is mostly undocumented. The best reference is the Parsing Permutation Phrases paper which the Haddock doc page refers to, but even that is largely a description of the technique rather than how to use it.
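
As a rough illustration of how the module can be used, here is a minimal sketch written against Text.Parsec.Perm from Parsec 3. The record, the prop helper, and the choice of three properties are assumptions made up for this example; real iCalendar lines also carry parameters, which this ignores. Required elements are added with <$$> and <||>, optional ones with <|?> together with a default value.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import Text.Parsec.Perm (permute, (<$$>), (<||>), (<|?>))

-- Hypothetical record, cut down to three fields for illustration.
data VEvent = VEvent
  { summary    :: String
  , dtstart    :: String
  , sequenceNo :: Maybe String
  } deriving (Show, Eq)

-- Parse one "NAME:value" line and return the value. The 'try' makes a
-- non-matching name fail without consuming input, which the permutation
-- parser needs in order to move on to the next alternative.
prop :: String -> Parser String
prop name = try (string name >> char ':') >> manyTill anyChar endOfLine

-- SUMMARY and DTSTART are required, SEQUENCE is optional; the three
-- lines may appear in any order.
vevent :: Parser VEvent
vevent = permute $
  VEvent <$$> prop "SUMMARY"
         <||> prop "DTSTART"
         <|?> (Nothing, Just <$> prop "SEQUENCE")
```

The fields are still listed in constructor order in the source, but the input lines can arrive in any order, and no field has to start out as undefined.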

Stephen Tetley
Wow you are right, that is excellent advice. I am going to read up on that one as soon as I can and if it turns out to be exactly what I need then I may even write a blog post on it.
Robert Massaioli
There are two examples in section 5 of Parsing Permutation Phrases. The original Parsec manual also has one trivial example - it might still be helpful as the combinator names differ between Parsec and the PPP paper. The Parsec manual is available here: http://legacy.cs.uu.nl/daan/parsec.html
Stephen Tetley