tags:

views:

122

answers:

2

Often, I would like to build up complex regexps from simpler ones. The only way I'm currently aware of of doing this is through string operations, e.g.:

Year = r'[12]\d{3}'
Month = r'Jan|Feb|Mar'
Day = r'\d{2}'
HourMins = r'\d{2}:\d{2}'

Date = r'%s %s, %s, %s' % (Month, Day, Year, HourMins)
DateR = re.compile(Date)

Is anybody aware of a different method or a more systematic approach (maybe a module) in Python to have composable regexps? I'd rather compile each regexp individually (e.g. for using individual compile options), but then there doesn't seem to be a way of composing them anymore!?

+2  A: 

You can use Python's formatting syntax for this:

types = {
    "year":           r'[12]\d{3}',
    "month":        r'(Jan|Feb|Mar)',
    "day":            r'\d{2}',
    "hourmins":    r'\d{2}:\d{2}',
}
import re
Date = r'%(month)s %(day)s, %(year)s, %(hourmins)s' % types
DateR = re.compile(Date)

(Note the added grouping around Jan|Feb|Mar.)

Glenn Maynard
That still relies on string operations, right?!
ThomasH
Yep!?(/* padding to work around dumb comment system */)
Glenn Maynard
+1  A: 

You could use Ping's rxb:

year = member("1", "2") + digit*3
month = either("Jan", "Feb", "Mar")
day = digit*2
hour_mins = digit*2 + ":" + digit*2

date = month + " " + day + ", " + year + ", " + hour_mins

You can then match on the resulting date directly, or use

DateR = date.compile()
Martin v. Löwis
That looks like the answer I was looking for, thanks. I will have to check how the module goes about compile options and match groups, but from first sight it looks perferct :-).
ThomasH