tags:

views:

600

answers:

6

How do I make a python regex like "(.*)" such that, given "a (b) c (d) e" python matches "b" instead of "b) c (d"?

I know that I can use "[^)]" instead of ".", but I'm looking for a more general solution that keeps my regex a little cleaner. Is there any way to tell python "hey, match this as soon as possible"?

+5  A: 

You seek the all-powerful '?'

http://www.amk.ca/python/howto/regex/regex.html#SECTION000730000000000000000

Trey Stout
+5  A: 

Would not "\(.*?\)" work ? That is the non-greedy syntax.

Zitrax
+10  A: 
>>> x = "a (b) c (d) e"
>>> re.search(r"\(.*\)", x).group()
'(b) c (d)'
>>> re.search(r"\(.*?\)", x).group()
'(b)'

According to the docs:

The '*', '+', and '?' qualifiers are all greedy; they match as much text as possible. Sometimes this behavior isn’t desired; if the RE <.*> is matched against '<H1>title</H1>', it will match the entire string, and not just '<H1>'. Adding '?' after the qualifier makes it perform the match in non-greedy or minimal fashion; as few characters as possible will be matched. Using .*? in the previous expression will match only '<H1>'.

Paolo Bergantino
+2  A: 

Do you want it to match "(b)"? Do as Zitrax and Paolo have suggested. Do you want it to match "b"? Do

>>> x = "a (b) c (d) e"
>>> re.search(r"\((.*?)\)", x).group(1)
'b'
David Berger
A: 

Using an ungreedy match is a good start, but I'd also suggest that you reconsider any use of .* -- what about this?

groups = re.search(r"\([^)]*\)", x)
ojrac
+1  A: 

As the others have said using the ? modifier on the * quantifier will solve your immediate problem, but be careful, you are starting to stray into areas where regexes stop working and you need a parser instead. For instance, the string "(foo (bar)) baz" will cause you problems.

Chas. Owens