So I have this regex:

(^(\s+)?(?P<NAME>(\w)(\d{7}))((01f\.foo)|(\.bar|\.goo\.moo\.roo))$|(^(\s+)?(?P<NAME2>R1_\d{6}_\d{6}_)((01f\.foo)|(\.bar|\.goo\.moo\.roo))$))

Now if I try and do a match against this:

B048661501f.foo

I get this error:

  File "C:\Python25\lib\re.py", line 188, in compile
    return _compile(pattern, flags)
  File "C:\Python25\lib\re.py", line 241, in _compile
    raise error, v # invalid expression
sre_constants.error: redefinition of group name 'NAME' as group 9; was group 3

If I can't define the same group twice in the same regex expression for two different cases, what do I do?

+4  A: 

No, you can't have two groups with the same name in one pattern; that would defeat the purpose of naming them, wouldn't it?
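
To illustrate, here is a minimal sketch (written for Python 3; the tiny patterns and the helper name_of are made up for the example and are much simpler than your regex). Compiling a pattern that reuses a name fails, while giving each alternative its own name and taking whichever one matched works:

```python
import re

# Reusing a group name makes re.compile() fail, even across alternatives.
try:
    re.compile(r"(?P<NAME>a\d)|(?P<NAME>b\d)")
    duplicate_ok = True
except re.error:
    duplicate_ok = False

# Workaround: distinct names, then take whichever group took part in the match.
pattern = re.compile(r"^(?:(?P<NAME>a\d)|(?P<NAME2>b\d))$")

def name_of(text):
    """Return the name from either alternative, or None (hypothetical helper)."""
    m = pattern.match(text)
    if m is None:
        return None
    # The group that did not participate is None, so "or" picks the other one.
    return m.group("NAME") or m.group("NAME2")
```

This keeps your two-alternative structure, at the cost of checking two group names after every match.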

What you probably really want is this:

^\s*(?P<NAME>\w\d{7}|R1_(?:\d{6}_){2})(01f\.foo|\.(?:bar|goo|moo|roo))$

I refactored your regex as far as possible. I made the following assumptions:

You want to (correct me if I'm wrong):

  • ignore white space at the start of the string
  • match either of the following into a group named "NAME":
    • a word character (\w: a letter, digit or underscore) followed by 7 digits, or
    • "R1_", and two times (6 digits + "_")
  • followed by either:
    • "01f.foo" or
    • "." and ("bar" or "goo" or "moo" or "roo")
  • followed by the end of the string
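
A quick way to check this reading is to run it against your sample. This is only a sketch for Python 3: the helper split_first and every test string except B048661501f.foo are made up for illustration.

```python
import re

# First reading: NAME is the 8-character id (or the R1_ form); the suffix is separate.
FIRST = re.compile(
    r"^\s*(?P<NAME>\w\d{7}|R1_(?:\d{6}_){2})(01f\.foo|\.(?:bar|goo|moo|roo))$"
)

def split_first(text):
    """Return (NAME, suffix), or None if the text does not match (hypothetical helper)."""
    m = FIRST.match(text)
    return (m.group("NAME"), m.group(2)) if m else None

print(split_first("B048661501f.foo"))          # ('B0486615', '01f.foo')
print(split_first("  R1_123456_654321_.bar"))  # ('R1_123456_654321_', '.bar')
print(split_first("B0486615.goo.moo.roo"))     # None
```

The last call shows the assumption I made: your original pattern has \.goo\.moo\.roo as one literal suffix, while this version treats goo, moo and roo as separate alternatives, so ".goo.moo.roo" no longer matches.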


You could also have meant:

^\s*(?P<NAME>\w\d{7}01f|R1_(?:\d{6}_){2})\.(?:foo|bar|goo|moo|roo)$

Which is:

  • ignore white space at the start of the string
  • match either of the following into a group named "NAME":
    • a word character (\w: a letter, digit or underscore) followed by 7 digits and "01f"
    • "R1_", and two times (6 digits + "_")
  • a dot
  • "foo", "bar", "goo", "moo" or "roo"
  • the end of the string
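
The difference between the two readings shows up in what ends up in NAME. Again a sketch for Python 3 only: name_second is a hypothetical helper, and the test strings other than B048661501f.foo are invented.

```python
import re

# Second reading: "01f" belongs to the name, and any of the five extensions may follow.
SECOND = re.compile(
    r"^\s*(?P<NAME>\w\d{7}01f|R1_(?:\d{6}_){2})\.(?:foo|bar|goo|moo|roo)$"
)

def name_second(text):
    """Return the NAME group, or None if the text does not match (hypothetical helper)."""
    m = SECOND.match(text)
    return m.group("NAME") if m else None

print(name_second("B048661501f.foo"))        # B048661501f
print(name_second("R1_123456_654321_.roo"))  # R1_123456_654321_
print(name_second("B0486615.bar"))           # None
```

With this version a name without "01f" (the last call) is rejected, whereas the first pattern would accept it.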
Tomalak
Perfect, thank you.
UberJumper
Be aware: There was one closing paren missing. I just added it.
Tomalak
Allowing multiple named capturing groups with the same name is a very useful feature. Python doesn't support it, but .NET does.
Jan Goyvaerts
Nice to know, I would have been too strict to even try/consider it.
Tomalak