I want to parse dates like these into a datetime object:

  • December 12th, 2008
  • January 1st, 2009

The following will work for the first date:

datetime.strptime("December 12th, 2008", "%B %dth, %Y")

but will fail for the second because of the suffix to the day number ('st'). So, is there an undocumented wildcard character in strptime? Or a better approach altogether?

+6  A: 

Try using the dateutil.parser module.

import dateutil.parser
date1 = dateutil.parser.parse("December 12th, 2008")
date2 = dateutil.parser.parse("January 1st, 2009")

Additional documentation can be found here: http://labix.org/python-dateutil

Greg
+2  A: 

strptime is tricky: its set of format directives is fixed, and some details (such as the month names matched by %B) depend on the locale. There doesn't seem to be a directive that matches the suffix characters you need. But you could clean the data first:

import re
from datetime import datetime

# Remove ordinal suffixes from numbers (the trailing comma keeps "August" intact).
date_in = re.sub(r"(st|nd|rd|th),", ",", date_in)
# Parse the pure date.
date = datetime.strptime(date_in, "%B %d, %Y")
Ned Batchelder
I'd be worried about what this is going to do to August.
Blair Conrad
That's why I included the trailing comma.
Ned Batchelder
I'd say you're better off adding [\d]{1,2} before your regular expression. After all, you want to match suffixes after numbers, right? :-)
Vince
yes, good addition.
Ned Batchelder
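For anyone landing here later, the suggestion from the comments above could be sketched roughly like this. It is only a sketch: the helper name parse_ordinal_date and its fmt default are made up for illustration, and it assumes an English locale so that %B matches names like "December". The digits are captured and put back, so a suffix is only removed when it directly follows a day number.

```python
import re
from datetime import datetime

# One or two digits followed by an ordinal suffix; the digits are captured
# so they can be kept while the suffix is dropped.
_ORDINAL = re.compile(r"(\d{1,2})(st|nd|rd|th)\b")


def parse_ordinal_date(text, fmt="%B %d, %Y"):
    """Hypothetical helper: strip ordinal suffixes after digits, then parse."""
    cleaned = _ORDINAL.sub(r"\1", text)
    return datetime.strptime(cleaned, fmt)


print(parse_ordinal_date("December 12th, 2008"))
print(parse_ordinal_date("January 1st, 2009"))
# "August" contains "st", but it does not follow a digit, so it is left alone.
print(parse_ordinal_date("August 22nd, 2010"))
```

Because the pattern requires a leading digit, it no longer depends on the trailing comma, so it should also cope with formats such as "1st January 2009" if you pass a matching fmt.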
+1  A: 

You need Gustavo Niemeyer's python-dateutil -- once it's installed,

>>> from dateutil import parser
>>> parser.parse('December 12th, 2008')
datetime.datetime(2008, 12, 12, 0, 0)
>>> parser.parse('January 1st, 2009')
datetime.datetime(2009, 1, 1, 0, 0)
>>>
Alex Martelli