I am trying to write a class that can parse an iCalendar file and am hitting some brick walls. Each line can be in the format:

PARAMETER[;PARAM_PROPERTY..]:VALUE[,VALUE2..]

It's pretty easy to parse with either a bunch of splits or regexes, until you find out that values can contain backslash-escaped commas and can also be wrapped in double quotes, which makes life hard. For example:

PARAMETER:"my , cool, value",value\,2,value3

In this example you are meant to pull out the three values:

  • my , cool, value
  • value,2
  • value3

Which makes it a little more difficult.

Suggestions?

+2  A: 

Go through the file character by character and split the values manually. Whenever you hit a quotation mark you enter "quotation mode", where you don't split at commas; when the closing quotation mark comes, you leave it.

For the backslash-escaped commas: when you read a backslash, also read the next character and then decide what to do with it.

Of course that's not extremely efficient, but you can't use regular expressions for this. I mean you can, but since I believe there can also be escaped quotation marks, it is going to get really messy.

If you want to give it a try though:

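  • let's start by matching a quotation mark, followed by characters that are not quotation marks, followed by the closing quotation mark: "[^"]*"
  • to overcome the problem of escaped characters you can use negative lookbehinds: (?<!\\)"[^"]*(?<!\\)"
  • now it will break if escaped quotation marks are in the value. My untested attempt was (?<!\\)"[^"|(?<=\\)"]*(?<!\\)", but a lookbehind inside a character class is just read as literal characters, so something like "(?:[^"\\]|\\.)*" is the more usual way to allow escaped characters inside the quotes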
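For anyone who wants to see where the regex route ends up, here is a minimal sketch in Python (the thread doesn't name a language, so that choice is an assumption, as are the names VALUE_RE and raw_values). It treats a value as a run of quoted strings, backslash pairs, and ordinary characters. It only finds the raw values: the quotes and backslashes are still in them and need a second pass, and empty values between two commas are silently skipped.

```python
import re

# One value is a run of any of these pieces:
#   "..."   a quoted string, which may contain backslash-escaped characters
#   \x      a backslash followed by any single character
#   x       any character that is not a comma, a quote, or a backslash
VALUE_RE = re.compile(r'(?:"(?:[^"\\]|\\.)*"|\\.|[^,"\\])+')

def raw_values(text):
    """Return the raw (still quoted, still escaped) values found in text."""
    return VALUE_RE.findall(text)

if __name__ == "__main__":
    # The example line from the question, without the PARAMETER: prefix.
    print(raw_values(r'"my , cool, value",value\,2,value3'))
```

Having to clean up each match afterwards is part of why this doesn't really save work over a plain loop.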

So you see it gets messy very fast, which is why I would suggest reading it in character by character.
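A rough sketch of the character-by-character approach, in Python since the thread doesn't name a language. The function name split_values is made up for illustration, and the escape handling (\n or \N becomes a newline, any other escaped character is taken literally) is an assumption based on the question's description, not a full implementation of the iCalendar grammar.

```python
def split_values(text):
    """Split a value string on commas that are neither quoted nor escaped."""
    values = []
    current = []          # characters of the value being built
    in_quotes = False     # True while inside "quotation mode"
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # Backslash: read the next character too, then decide what it means.
            nxt = text[i + 1]
            # Assumption: \n or \N is a newline, anything else is kept literally.
            current.append("\n" if nxt in "nN" else nxt)
            i += 2
            continue
        if ch == '"':
            # Enter or leave quotation mode; the quote itself is dropped.
            in_quotes = not in_quotes
        elif ch == "," and not in_quotes:
            # An unquoted, unescaped comma ends the current value.
            values.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    values.append("".join(current))
    return values

if __name__ == "__main__":
    # The example line from the question, without the PARAMETER: prefix.
    print(split_values(r'"my , cool, value",value\,2,value3'))
```

The same loop should carry over to most languages more or less unchanged, since it only needs indexing into a string and a growing list.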

André Hoffmann
I agree, that looks like it's getting a little messy with the regexes. Will give this a shot when I get back to work and let you know how I went (now really wishing I had taken compilers and parsers at uni instead of formal proofs of code)
Matt Wheeler
A: 

Have you tried pulling something out of http://phpicalendar.net/ ?

therefromhere
No I took a look at that but didn't get too far. I did however manage to use some of the code from a project qcalendar (which I can't find on google at the moment, will update this when I am back at work) which seems to do a decent job.
Matt Wheeler
+1  A: 

I had the same problems. I found it a bit hard to turn 'any' iCalendar file into a usable PHP object/array structure, so instead I've been trying to convert iCalendar to xCal.

This is my implementation:

http://code.google.com/p/sabredav/source/browse/branches/caldav/lib/Sabre/CalDAV/ICalendarToXML.php

I must say that this script is not fully tested, but it might be enough to get you started.

Evert
interesting approach, it's a shame that places like google calendar don't provide xcal feeds.
Matt Wheeler
A: 

Is this the project you're thinking of? I'm the author :) The first usable version (v0.1.0) should be ready in about a month. It is capable of working with about 85% of the iCalendar spec right now, but recurring events are really tough. I'm working on them right now. Once those are complete, the library will be fully capable of doing anything in the spec.

qCal Google Code Homepage

Enjoy!

Luke Visinoni