I'm using an NSDateFormatter to parse an RFC 822 date on the iPhone. However, there is no way to specify optional elements in the date format. The RFC 822 specification has a couple of optional parts, which break the date parser. If nothing else works out, I'll probably have to write a custom parser that follows the spec.

For example, the day name is optional in the spec. So both these dates are valid:

Tue, 01 Dec 2009 08:48:25 +0000 is parsed with the format EEE, dd MMM yyyy HH:mm:ss z

01 Dec 2009 08:48:25 +0000 is parsed with the format dd MMM yyyy HH:mm:ss z

This is what I am currently using:

+ (NSDateFormatter *)rfc822Formatter {
    static NSDateFormatter *formatter = nil;
    if (formatter == nil) {
        formatter = [[NSDateFormatter alloc] init];
        NSLocale *enUS = [[NSLocale alloc] initWithLocaleIdentifier:@"en_US"];
        [formatter setLocale:enUS];
        [enUS release];
        [formatter setDateFormat:@"EEE, dd MMM yyyy HH:mm:ss z"];
    }
    return formatter;
}

+ (NSDate *)dateFromRFC822:(NSString *)date {
    NSDateFormatter *formatter = [NSDate rfc822Formatter];
    return [formatter dateFromString:date];
}

And parsing the date as follows:

self.entry.published = [NSDate dateFromRFC822:self.currentString];

One way is to try both formats and take whichever returns a non-nil value. However, there are two optional parts in the spec (day name and seconds), so there would be four possible combinations. Still not too bad, but it's a bit hacky.
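For reference, a rough sketch of that brute-force approach, assuming the same en_US locale and z pattern as above. DateFromRFC822ByTrying is a hypothetical name, and the shared static formatter is not thread-safe:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: tries the four day-name/seconds combinations in turn
// and returns the first successful parse, or nil if none of them match.
static NSDate *DateFromRFC822ByTrying(NSString *string) {
    static NSDateFormatter *formatter = nil;
    if (formatter == nil) {
        formatter = [[NSDateFormatter alloc] init];
        // The locale is created once and deliberately never released here;
        // the long-lived formatter keeps using it for the life of the process.
        NSLocale *enUS = [[NSLocale alloc] initWithLocaleIdentifier:@"en_US"];
        [formatter setLocale:enUS];
    }

    // Formats with seconds come before their shorter counterparts.
    NSArray *formats = [NSArray arrayWithObjects:
                        @"EEE, dd MMM yyyy HH:mm:ss z",
                        @"EEE, dd MMM yyyy HH:mm z",
                        @"dd MMM yyyy HH:mm:ss z",
                        @"dd MMM yyyy HH:mm z",
                        nil];

    for (NSString *format in formats) {
        [formatter setDateFormat:format];
        NSDate *date = [formatter dateFromString:string];
        if (date != nil) {
            return date;
        }
    }
    return nil;
}
```

In the worst case this runs four parse attempts per date, which is the part that feels hacky.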

+2  A: 

Count the number of salient characters before deciding which formatter to use. For example, the two you give have different numbers of commas and spaces. If no known format matches the counts, then you know not even to try parsing it as a date.
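A minimal sketch of this idea, assuming a comma marks the day name and a second colon marks the seconds. RFC822FormatForString is a hypothetical helper, not an existing API; it only picks a format string and leaves the actual parsing to NSDateFormatter:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: chooses a date format from the number of commas and
// colons in the input, or returns nil when the counts match no known shape.
static NSString *RFC822FormatForString(NSString *string) {
    // N occurrences of a separator split a string into N + 1 components.
    NSUInteger commas = [[string componentsSeparatedByString:@","] count] - 1;
    NSUInteger colons = [[string componentsSeparatedByString:@":"] count] - 1;

    // Assumption: at most one comma (after the day name) and one or two
    // colons (HH:mm or HH:mm:ss). Anything else is not worth parsing.
    if (commas > 1 || colons < 1 || colons > 2) {
        return nil;
    }

    NSString *day = (commas == 1) ? @"EEE, " : @"";
    NSString *time = (colons == 2) ? @"HH:mm:ss" : @"HH:mm";
    return [NSString stringWithFormat:@"%@dd MMM yyyy %@ z", day, time];
}
```

You would pass the result to setDateFormat: before calling dateFromString:, and skip parsing altogether when it returns nil.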

Steve Weller
This seems like the most practical solution to me, given the month and day names are of fixed length and all other values are fixed-length numeric. Far cheaper than trying formats out until one works!
Kendall Helmstetter Gelner
Implemented a basic solution. Not very happy with it, but it's the best one so far :) A comma identifies the weekday's presence, and two colons help identify seconds. It would've been great if the date included a reference to the spec it was following, as date parsing is really cumbersome in many languages with the multitude of formats.
Anurag
+1  A: 

I believe RFC 822 specifies two optional components in the date time: day of week and the seconds past the hour.

As a hack, it is possible to override the symbols for the short days of the week:

NSArray *shortWeekSymbols = [NSArray arrayWithObjects:@"Sun,", @"Mon,", @"Tue,", @"Wed,", @"Thu,", @"Fri,", @"Sat,", nil];
[formatter setShortWeekdaySymbols:shortWeekSymbols];

If you then change the date format to EEEdd MMM yyyy HH:mm:ss z, you'll be able to parse times with or without the day of the week. This seems to allow a space after the comma too.

To be safe, you should not just blindly set the symbols like this. You should get them using shortWeekdaySymbols and iterate over them, adding the comma at the end. The reason is that they are potentially different for each locale, and the first day might not be Sunday.
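Something along these lines should do it. This is only a sketch, and AppendCommaToShortWeekdaySymbols is a made-up helper name; it reads whatever symbols the formatter's locale provides instead of hard-coding them:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: appends a comma to each of the formatter's current
// short weekday symbols, keeping their existing order.
static void AppendCommaToShortWeekdaySymbols(NSDateFormatter *formatter) {
    NSArray *symbols = [formatter shortWeekdaySymbols];
    NSMutableArray *withCommas = [NSMutableArray arrayWithCapacity:[symbols count]];
    for (NSString *symbol in symbols) {
        [withCommas addObject:[symbol stringByAppendingString:@","]];
    }
    [formatter setShortWeekdaySymbols:withCommas];
}
```

Call it once, after setting the locale on the formatter and before setting the EEEdd MMM yyyy HH:mm:ss z format.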

Interestingly, the format EEE, dd MMM yyyy HH:mm:ss z will parse times without the day of week, but the comma must be there, for example: , 01 Dec 2009 08:48:25 +0000. Therefore, you could do something like Steve said, but then strip off the day and pass the rest through to the formatter. Not having the comma in the format does not seem to allow the day of the week to be optional. Strange.

Unfortunately, this still doesn't help with the optional :ss in the format. But it might allow you to have two formats rather than four.

lyonanderson
Thanks for the tip. I think RFC 822 does not mention localization, and only uses the English format. It's still a good idea to append a comma instead of hard-coding the values though. But since I would still need to check two combinations, it's probably a good idea to check for the characters beforehand instead of trying twice.
Anurag