Hey,

I have the following regex:

regex = compile("((?P<lastyear>[\dBFUPR]+)/)*((?P<lastseason>[\dBFUPR]+))*(^|-(?P<thisseason>[\dBFUPR]*))")

I am using it to process horse racing form strings. Sometimes a horse's form will look like "1234-", meaning that it has not raced yet this season (there are no numbers to the right of the "-").

Currently, my regex matches "" (the empty string) at the end of such a form string in the thisseason group. I do not want this behaviour; I want the group to be None in that case, i.e.:

match = regex.match("1234-")
print match.group("thisseason")  # want None, currently ""

Examples

string = "1234/123-12"
match = regex.match(string)
match.group("lastyear")    # "1234"
match.group("lastseason")  # "123"
match.group("thisseason")  # "12"

string = "00999F"
match = regex.match(string)
match.group("lastyear")    # None
match.group("lastseason")  # None
match.group("thisseason")  # "00999F"

string = "12-3456"
match = regex.match(string)
match.group("lastyear")    # None
match.group("lastseason")  # "12"
match.group("thisseason")  # "3456"
+1  A: 

This works:

>>> regex = re.compile(r'(?:(?P<lastyear>[\dBFUPR]+)/)?(?:(?P<lastseason>[\dBFUPR]+)-)?(?P<thisseason>[\dBFUPR]+)?')
>>> regex.match("1234/123-12").groupdict()
{'thisseason': '12', 'lastyear': '1234', 'lastseason': '123'}
>>> regex.match("00999F").groupdict()
{'thisseason': '00999F', 'lastyear': None, 'lastseason': None}
>>> regex.match("12-").groupdict()
{'thisseason': None, 'lastyear': None, 'lastseason': '12'}
>>> regex.match("12-3456").groupdict()
{'thisseason': '3456', 'lastyear': None, 'lastseason': '12'}
SilentGhost
The above does not match anything for "7463-", which isn't correct.
Peter
@Peter: see my edit now.
SilentGhost
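
For what it's worth, the edited pattern can be checked against the "7463-" case raised above together with the examples from the question. This is only a sketch: the names `FORM_RE` and `parse_form` are made up for illustration and do not appear in the original posts.

```python
import re

# Same pattern as in the answer, split across lines for readability.
FORM_RE = re.compile(
    r'(?:(?P<lastyear>[\dBFUPR]+)/)?'
    r'(?:(?P<lastseason>[\dBFUPR]+)-)?'
    r'(?P<thisseason>[\dBFUPR]+)?'
)

def parse_form(form):
    """Return the three named groups as a dict (hypothetical helper)."""
    return FORM_RE.match(form).groupdict()

for form in ("1234/123-12", "00999F", "12-3456", "7463-", "1234-"):
    print(form, parse_form(form))
```

Because every part of the pattern is optional, `match` also succeeds on an empty string (all groups None), and it does not require the whole string to be consumed, so trailing characters outside `[\dBFUPR]` are silently ignored. If that matters, anchoring the pattern with `$` at the end is one option.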