I'm not sure my subject line is accurate, but here is what I want to accomplish. I have a regular expression with two capturing groups joined by | (OR), and I'm wondering whether it's possible to have a group returned as a back reference only if it actually matched. In all cases, I want to match spam.eggs.com.

Example:

import re

monitorName = re.compile(r"HQ01 : HTTP Service - [Ss][Rr][Vv]\d+\.\w+\.com:(\w+\.\w+\.(?:net|com|org))|(\w+\.\w+\.(?:net|com|org))")

test = ["HQ01 : HTTP Service - spam.eggs.com",
    "HQ01 : HTTP Service - spam.eggs.com - DISABLED",
    "HQ01 : HTTP Service - srv04.example.com:spam.eggs.com",
    "HQ01 : HTTP Service - srv04.example.com:spam.eggs.com - DISABLED"]


for t in test:
    m = monitorName.search(t)
    print(m.groups())

Produces:

(None, 'spam.eggs.com')
(None, 'spam.eggs.com')
('spam.eggs.com', None)
('spam.eggs.com', None)

It'd be nice if groups() returned only the one group that matched, rather than both. I hope this makes sense.

+1  A: 
m = monitorName.search(t)
g = m.groups()
print(g[0] or g[1])
KennyTM
+1  A: 

Use m.group(1) or m.group(2).
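
A minimal sketch of this suggestion, assuming the pattern and test strings from the question. The helper name matched_host is hypothetical and only for illustration. Note that `or` picks the first truthy group, which is safe here because neither group can match an empty string.

```python
import re

# Pattern copied from the question: two alternatives, each with its own group.
monitor_name = re.compile(
    r"HQ01 : HTTP Service - [Ss][Rr][Vv]\d+\.\w+\.com:(\w+\.\w+\.(?:net|com|org))"
    r"|(\w+\.\w+\.(?:net|com|org))"
)


def matched_host(text):
    """Return the host captured by whichever alternative matched, or None."""
    m = monitor_name.search(text)
    if m is None:
        return None
    # After a successful match exactly one of the two groups is not None.
    return m.group(1) or m.group(2)


tests = [
    "HQ01 : HTTP Service - spam.eggs.com",
    "HQ01 : HTTP Service - srv04.example.com:spam.eggs.com - DISABLED",
]
for line in tests:
    print(matched_host(line))
```

Both test strings should yield spam.eggs.com, regardless of which alternative did the matching.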

Ignacio Vazquez-Abrams
+2  A: 

The | operator has the lowest precedence, so it separates everything before it (from the beginning of your regex, in this case) from everything after it. In your regex, when there is no "srv04.example.com:" prefix, the second alternative matches on its own and never checks that the string contains "HTTP Service"!
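
To make the precedence point concrete, here is a small sketch. The input string `unrelated` is a made-up example, and the `anchored` pattern is just one possible way to scope the alternation with a non-capturing group; neither comes from the question.

```python
import re

# The question's pattern: "|" splits the WHOLE expression in two.
original = re.compile(
    r"HQ01 : HTTP Service - [Ss][Rr][Vv]\d+\.\w+\.com:(\w+\.\w+\.(?:net|com|org))"
    r"|(\w+\.\w+\.(?:net|com|org))"
)

# Same alternatives, but wrapped in (?:...) so the literal prefix applies to both.
anchored = re.compile(
    r"HQ01 : HTTP Service - "
    r"(?:[Ss][Rr][Vv]\d+\.\w+\.com:(\w+\.\w+\.(?:net|com|org))"
    r"|(\w+\.\w+\.(?:net|com|org)))"
)

# Hypothetical line that does not contain "HQ01 : HTTP Service - " at all.
unrelated = "LAB02 : FTP Service - spam.eggs.com"

m = original.search(unrelated)
print(m.groups())                  # the bare second alternative still matches
print(anchored.search(unrelated))  # no match once the prefix is required
```

With the original pattern, the unrelated line is still accepted because the second alternative is nothing more than a host-name matcher.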

Your two capturing groups are identical, so there's no point in having both. All you want is to have the srv*: part optional, right?

Try this one:

r"HQ01 : HTTP Service - (?:[Ss][Rr][Vv]\d+\.\w+\.com:)?(\w+\.\w+\.(?:net|com|org))"
Nicolás
Duh! Makes perfect sense. Thank you!
TheDude
A: 

I would rewrite the regular expression as:

monitorName = re.compile(r"HQ01 : HTTP Service - (?:(?i)SRV\d+\.\w+\.com:)?(\w+\.\w+\.(?:net|com|org))")

This produces:

('spam.eggs.com',)
('spam.eggs.com',)
('spam.eggs.com',)
('spam.eggs.com',)

You can make a group optional by appending ? to it.
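
As a hedged variant for newer interpreters: recent Python 3 releases reject a global inline flag such as (?i) when it is not at the very start of the pattern. The sketch below assumes Python 3.6 or later and uses the scoped form (?i:...) instead, so case-insensitivity covers only the optional server prefix. The test strings are the ones from the question.

```python
import re

# Assumes Python 3.6+: (?i:...) is a non-capturing group with a local flag.
# The trailing "?" makes the whole server prefix optional.
monitor_name = re.compile(
    r"HQ01 : HTTP Service - (?i:SRV\d+\.\w+\.com:)?(\w+\.\w+\.(?:net|com|org))"
)

tests = [
    "HQ01 : HTTP Service - spam.eggs.com",
    "HQ01 : HTTP Service - spam.eggs.com - DISABLED",
    "HQ01 : HTTP Service - srv04.example.com:spam.eggs.com",
    "HQ01 : HTTP Service - srv04.example.com:spam.eggs.com - DISABLED",
]

# Only one capturing group exists, so group(1) is always the host.
hosts = [monitor_name.search(t).group(1) for t in tests]
print(hosts)
```

Because the prefix group is non-capturing, groups() contains a single entry in every case.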

livibetter
A: 

Have you considered this?

HQ01 : HTTP Service - (?:[Ss][Rr][Vv]\d+\.\w+\.com:)?(\w+\.\w+\.(?:net|com|org))
Antony Hatchkins