I am working with iCal entries:

BEGIN:VEVENT
UID:944f660b-01f8-4e09-95a9-f04a352537d2
ORGANIZER;CN=******
DTSTART;TZID="America/Chicago":20100802T080000
DTEND;TZID="America/Chicago":20100822T170000
STATUS:CONFIRMED
CLASS:PRIVATE
X-MICROSOFT-CDO-INTENDEDSTATUS:BUSY
TRANSP:OPAQUE
X-MICROSOFT-DISALLOW-COUNTER:TRUE
DTSTAMP:20100802T212130Z
SEQUENCE:0
END:VEVENT

BEGIN:VEVENT
UID:aa132e2b-8a8d-4ffc-9e54-b75249e78c72
RRULE:FREQ=DAILY;COUNT=12;INTERVAL=1
SUMMARY:***********
X-ALT-DESC;FMTTYPE=text/html:<html><body><div style='font-family:Times New R
 oman\; font-size: 12pt\; color: #000000\;'></div></body></html>
LOCATION:Map Room
ORGANIZER;CN=*********
DTSTART;TZID="America/Chicago":20100730T080000
DTEND;TZID="America/Chicago":20100730T170000
STATUS:CONFIRMED
CLASS:PUBLIC
X-MICROSOFT-CDO-INTENDEDSTATUS:BUSY
TRANSP:OPAQUE
X-MICROSOFT-DISALLOW-COUNTER:TRUE
DTSTAMP:20100727T025231Z
SEQUENCE:0
EXDATE;TZID="America/Chicago":20100810T080000
EXDATE;TZID="America/Chicago":20100807T080000
BEGIN:VALARM
ACTION:DISPLAY
TRIGGER;RELATED=START:-PT5M
DESCRIPTION:*********
END:VALARM
END:VEVENT

I need to parse out the start and end times. I have a comparison function that determines whether the passed-in event falls between two times. Because of the added complexity of calculating the times, I plan not to support recurrence series. I would like to play it safe and make sure my code matches only the first event and not the second. So I have the following regex, used with the single-line option:

BEGIN:VEVENT.+?
DTSTART;.+?([0-9]{8})T([0-9]{6})
DTEND;.+?([0-9]{8})T([0-9]{6}).+?
END:VEVENT

This gets me the start and end times of both entries. My thought was to only match ones that don't have FREQ= between the BEGIN:VEVENT and DTSTART. I don't understand how to do this, however. I was wondering if someone could help me out here?

I realize that at a certain point a full-blown parser is the better option, but I am unskilled with parsers and under a slight time constraint. I have tried using the !? operator without success.

A: 

It's harder to write a regex that matches the things you don't want than one that matches the things you do want. When I run into this situation, I usually find it easier and faster to do things in two steps. In this case, I'd probably find all events that do contain FREQ=, remove those events, then continue matching on the result for the start and end times I want. Could you post the regex you tried with !?, because maybe it's easy to fix... Also, I assume this is in Objective-C, and I'm guessing the environment you're using does support !? (but not all of them do)...
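The two-step idea above can be sketched as follows. This is only an illustration, written in Python rather than the asker's language; the names `ICS`, `strip_recurring` and `single_event_times` are made up for the example, and the sample data is trimmed down from the question. The patterns avoid lookarounds entirely, so they should carry over to most regex engines.

```python
import re

# Trimmed-down sample: one plain event and one recurring event.
ICS = """BEGIN:VEVENT
UID:single
DTSTART;TZID="America/Chicago":20100802T080000
DTEND;TZID="America/Chicago":20100822T170000
END:VEVENT
BEGIN:VEVENT
UID:recurring
RRULE:FREQ=DAILY;COUNT=12;INTERVAL=1
DTSTART;TZID="America/Chicago":20100730T080000
DTEND;TZID="America/Chicago":20100730T170000
END:VEVENT
"""

# One whole VEVENT block (DOTALL plays the role of the single-line option).
EVENT = re.compile(r"BEGIN:VEVENT.*?END:VEVENT\r?\n?", re.DOTALL)

# The pattern from the question, joined into one expression.
TIMES = re.compile(
    r"BEGIN:VEVENT.+?"
    r"DTSTART;.+?([0-9]{8})T([0-9]{6}).+?"
    r"DTEND;.+?([0-9]{8})T([0-9]{6}).+?"
    r"END:VEVENT",
    re.DOTALL,
)

def strip_recurring(ics):
    """Step 1: delete every VEVENT block that mentions FREQ=."""
    return EVENT.sub(lambda m: "" if "FREQ=" in m.group(0) else m.group(0), ics)

def single_event_times(ics):
    """Step 2: run the original pattern on what is left."""
    return TIMES.findall(strip_recurring(ics))

print(single_event_times(ICS))
```

Because the recurring events are gone before the second pattern runs, its lazy `.+?` cannot accidentally pair a `BEGIN:VEVENT` from one event with a `DTSTART` from another.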

UPDATE

Ok, try this one:

BEGIN:VEVENT.+?
(?<!FREQ=.+)DTSTART;.+?([0-9]{8})T([0-9]{6})
DTEND;.+?([0-9]{8})T([0-9]{6}).+?
END:VEVENT
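A caveat on the pattern above: a variable-length lookbehind such as (?<!FREQ=.+) is accepted by .NET, but engines that require bounded lookbehind, PCRE and Python's re among them, reject it. A negative lookahead, written (?!...), has no such restriction. The following is a minimal sketch of that alternative, in Python for illustration; the names `NON_RECURRING` and `SAMPLE` are invented for the example, and it assumes DTSTART appears before DTEND inside each event.

```python
import re

NON_RECURRING = re.compile(
    r"BEGIN:VEVENT"
    # Reject the event if FREQ= occurs anywhere before its END:VEVENT.
    r"(?!(?:(?!END:VEVENT).)*?FREQ=)"
    # Skip ahead to DTSTART without ever crossing into the next event.
    r"(?:(?!END:VEVENT).)*?"
    r"DTSTART;[^\r\n]*?([0-9]{8})T([0-9]{6})"
    r"(?:(?!END:VEVENT).)*?"
    r"DTEND;[^\r\n]*?([0-9]{8})T([0-9]{6})",
    re.DOTALL,  # equivalent of the single-line (s) option
)

SAMPLE = """BEGIN:VEVENT
UID:single
DTSTART;TZID="America/Chicago":20100802T080000
DTEND;TZID="America/Chicago":20100822T170000
END:VEVENT
BEGIN:VEVENT
UID:recurring
RRULE:FREQ=DAILY;COUNT=12;INTERVAL=1
DTSTART;TZID="America/Chicago":20100730T080000
DTEND;TZID="America/Chicago":20100730T170000
END:VEVENT
"""

for start_date, start_time, end_date, end_time in NON_RECURRING.findall(SAMPLE):
    print(start_date, start_time, end_date, end_time)
```

The lookahead checks the whole event rather than only the part before DTSTART, since an RRULE line is not guaranteed to come first.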
FrustratedWithFormsDesigner
Line 1 changes to `BEGIN:VEVENT.+?(!?FREQ=).+?` when I tried with the !? operator. I am using PHP, so it should be PCRE-compatible: a / before and after the pattern, with an s option after the second slash, I think.
Josh
@Josh: Added a modified regex. It works for me in Expresso, but Expresso is .NET-based and I don't have a PHP test environment.
FrustratedWithFormsDesigner
Nice work. That works perfectly. Thank you!
Josh
I jumped too soon: I get a variable-length lookbehind assertion error with PHP. Apparently lookbehinds have to be fixed-length in PHP. The regex tool I was using for design apparently also relies on .NET.
Josh
@Josh: so... which expression did you end up using?
FrustratedWithFormsDesigner
I ended up doing a capture on all data between the start and end tags, and when I got to using the data I did a quick search to detect `FREQ` in the capture. It's a two step process, and less elegant, but I guess it works. I had some success with doing conditionals, but the PHP PCRE engine is apparently picky about those as well.
Josh
A: 

Why not use a PHP iCalendar parser?

http://www.phpclasses.org/browse/file/16660.html

Doug