I have a Python script set up that uses feedparser to read and parse a feed. However, I recently ran into a problem with the date parsing. The feed I am reading contains <modified>2010-05-05T24:17:54Z</modified>, which comes out in Python as the datetime object 2010-05-06 00:17:54. Notice the discrepancy: the feed entry was modified on the 5th of May, while Python reads it as the 6th.

So the question is why this is happening. Is the Atom feed (that is, whoever created the feed) wrong to give the time as 24:17:54, or is my Python script wrong in the way it treats it?

And can I solve this?

A: 

There are some interesting special cases in the RFC (http://tools.ietf.org/html/rfc3339); however, they are typically about 23:59:60 vs 23:59:59, to allow for leap seconds. It may be, though, that 24:17:54 is legal. My guess is that Python is doing the "right thing". In all honesty, date/time handling gets really messy because of things like DST and local time zones. If it's 24:17:54, that might be the right thing after all.

xyld
So assuming it is doing the right thing, how can I correct my python script to handle this?
Joseph
@Joseph, I wouldn't do anything, as Python seems to be doing the right thing. If it's really important, you need to write down the time zone that the feed is using, the time zone that Python is generating, and the time zone that you really want, then use the datetime library to convert between them correctly. But unless you know what units you are using (i.e. which time zones), you won't be able to solve the problem.
wisty
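As a rough illustration of the conversion described above: the trailing Z in the feed timestamp means UTC, so the parsed value can be made timezone-aware and then converted. The UTC-5 target offset below is purely a hypothetical example, not something taken from the feed.

```python
from datetime import datetime, timedelta, timezone

# The value Python produced from the feed, marked explicitly as UTC
# (the trailing "Z" in the feed timestamp denotes UTC).
parsed_utc = datetime(2010, 5, 6, 0, 17, 54, tzinfo=timezone.utc)

# Hypothetical target zone: a fixed offset of UTC-5, chosen only for illustration.
target_zone = timezone(timedelta(hours=-5))

# astimezone() converts an aware datetime to another zone without changing the instant.
local_time = parsed_utc.astimezone(target_zone)
print(local_time.isoformat())
```

Whether the date shown is the 5th or the 6th then depends on which zone the value is displayed in, which is why the three zones need to be written down first.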
@Joseph Agreed with wisty: Python is doing the right thing, and there really is nothing to fix.
xyld
@wisty True: but it's a very specific case; it's only at the 00/24 hour. The rest of the time, all dates are parsed perfectly. So I guess I would have to add that logic into my script, if I want to correct it. Thanks all
Joseph
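A minimal sketch of that extra logic, assuming the producer writes hour 24 when it means hour 00 of the same date, and assuming the script can get at the raw timestamp string before it is parsed. The function name parse_feed_timestamp is made up for this example; it is not part of feedparser.

```python
import re
from datetime import datetime, timezone

# Matches only timestamps of the form YYYY-MM-DDT24:MM:SSZ.
_HOUR_24 = re.compile(r"^(\d{4}-\d{2}-\d{2})T24:(\d{2}:\d{2})Z$")


def parse_feed_timestamp(raw):
    """Parse 'YYYY-MM-DDTHH:MM:SSZ' as UTC.

    Assumption: an hour of 24 is the producer's way of writing
    hour 00 on the SAME date, so it is rewritten before parsing.
    """
    fixed = _HOUR_24.sub(r"\1T00:\2Z", raw.strip())
    parsed = datetime.strptime(fixed, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


print(parse_feed_timestamp("2010-05-05T24:17:54Z"))  # hour 24 rewritten to 00
print(parse_feed_timestamp("2010-05-05T13:17:54Z"))  # ordinary hours untouched
```

Only the hour-24 case is rewritten; every other timestamp goes through unchanged, so the rest of the feed keeps parsing as before.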
A: 

I think today at 24:17 is being intelligently parsed as tomorrow at 00:17. I'd say your script is handling the producer's bug well.

joeslice
Not quite: I know for a fact the producer really means 00:17 today. Tomorrow would be in the future...
Joseph