Is there a way to determine how many capture groups there are in a given regular expression?
I would like to be able to do the follwing:
def groups(regexp, s):
""" Returns the first result of re.findall, or an empty default
>>> groups(r'(\d)(\d)(\d)', '123')
('1', '2', '3')
>>> groups(r'(\d)(\d)(\d)', 'abc')
('', '', '')
"""
import re
m = re.search(regexp, s)
if m:
return m.groups()
return ('',) * num_of_groups(regexp)
This allows me to do stuff like:
first, last, phone = groups(r'(\w+) (\w+) ([\d\-]+)', 'John Doe 555-3456')
However, I don't know how to implement num_of_groups
. (Currently I just work around it.)
EDIT: Following the advice from rslite, I replaced re.findall
with re.search
.
sre_parse
seems like the most robust and comprehensive solution, but requires tree traversal and appears to be a bit heavy.
MizardX's regular expression seems to cover all bases, so I'm going to go with that.