I'm trying to parse dates with a regex, using groups, but Python is returning empty lists. I'm not doing anything fancy, just dates like 12/25/10. I do want it to reject mixed separators such as 12/25-10, though.

date = re.compile("\d{1,2}([/.-])\d{1,2}\1\d{2}")

I've tried online regex libraries, but their solutions don't seem to run either. Any ideas?

Sample input: "Hello today is 10/18/10, and the time is 10:50am"

Desired output: "10/18/10"

I'm running Python 2.5.

+5  A: 

Use a raw string:

date = re.compile(r"\d{1,2}([/.-])\d{1,2}\1\d{2}")

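Otherwise, the \1 in the string literal is interpreted as an octal escape for the character with code 1 (Start of Heading), not as a backreference to the first group, so the regex engine never sees the backreference.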
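A minimal sketch of the difference, using the sample input from the question. The variable names (`text`, `matched`, `rejected`) are only for illustration, and `search` is assumed here because the question does not say which method was called.

```python
import re

# In a normal string literal, "\1" is an octal escape for chr(1).
assert "\1" == "\x01"
# In a raw string the backslash survives, so the regex engine sees a backreference.
assert r"\1" == "\\1"

date = re.compile(r"\d{1,2}([/.-])\d{1,2}\1\d{2}")
text = "Hello today is 10/18/10, and the time is 10:50am"

# group(0) is the whole match, regardless of how many groups the pattern has.
m = date.search(text)
matched = m.group(0) if m else None

# Mixed separators fail because \1 must repeat the first separator exactly.
rejected = date.search("12/25-10")
```

With the raw string, `matched` is "10/18/10" and `rejected` is None.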

EDIT: To add groups for the date components, use the following (the backreference becomes \2 because the separator is now the second group):

re.compile(r"(\d{1,2})([/.-])(\d{1,2})\2(\d{2})")
Matthew Flaschen
When I try this, all it returns is ['/'].
ehfeng
I tried this on Python 2.5, 2.6, and 3.1 :(
ehfeng
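@ehfeng, that's because the only group in your pattern is the separator. When a pattern contains groups, findall returns the groups rather than the whole match, and you don't have groups for the digits.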
Matthew Flaschen
I don't understand - should I add parentheses around all of the digits as well? I would like the regex to return "12/25/10", if possible.
ehfeng
*furiously reading through the regex tutorial* :)
ehfeng
@ehfeng: what are you doing with your compiled regex: search? match? findall? finditer? Also, show us some sample input.
John Machin
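To make the ['/'] result in the comments above concrete, here is a small sketch, assuming `findall` was the method in use (the thread never confirms this). It compares `findall` on both patterns from the answer with `finditer` plus `group(0)`, which returns the whole matched date. The names `one_group`, `all_groups`, `separators`, `parts` and `whole` are illustrative only.

```python
import re

text = "Hello today is 10/18/10, and the time is 10:50am"

one_group = re.compile(r"\d{1,2}([/.-])\d{1,2}\1\d{2}")
all_groups = re.compile(r"(\d{1,2})([/.-])(\d{1,2})\2(\d{2})")

# With exactly one group, findall returns a list of that group's text.
separators = one_group.findall(text)

# With several groups, findall returns one tuple of groups per match.
parts = all_groups.findall(text)

# finditer yields match objects; group(0) is always the full matched text.
whole = [m.group(0) for m in one_group.finditer(text)]
```

Here `separators` is ['/'], `parts` is [('10', '/', '18', '10')], and `whole` is ['10/18/10'], so `finditer` (or `search`) with `group(0)` gives the desired output without adding extra parentheses.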
+2  A: 

No doubt overkill, but the "parsedatetime" library has been working for me: http://code.google.com/p/parsedatetime/

It does use regexes internally, but does a lot more than parse MM/DD/YY formats.

Adam Vandenberg
+5  A: 

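You should use Python's built-in strptime (time.strptime, or datetime.datetime.strptime, which is available from Python 2.5 on). It validates the date as well as the format.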

paprika
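A rough sketch of the strptime approach, assuming the dates are in MM/DD/YY order and that splitting on whitespace and stripping trailing punctuation is enough to isolate candidates (true for the sample input, not in general). The helper `parse_mdy` is a hypothetical name, not part of the standard library.

```python
from datetime import datetime

def parse_mdy(candidate):
    """Return a datetime for an MM/DD/YY string, or None if it does not fit."""
    try:
        return datetime.strptime(candidate, "%m/%d/%y")
    except ValueError:
        return None

text = "Hello today is 10/18/10, and the time is 10:50am"

# strptime needs the exact substring, so try each whitespace-separated token
# with trailing punctuation stripped (a simplification for this sample).
candidates = [tok.strip(",.;") for tok in text.split()]
found = [d for d in (parse_mdy(c) for c in candidates) if d is not None]
```

For the sample input, `found` holds a single datetime for 18 October 2010, and `parse_mdy("12/25-10")` returns None because the separators do not match the format string. Unlike the regex, this also rejects impossible dates such as 13/45/10.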