I would like to know why the following regular expression returns true:

import re

reg = re.compile(r'[0-9]%')
reg.search("50%")

[0-9] matches any single digit, in this case 5. But the next character, 0, doesn't match %, so I expected it to return false, yet it returns true.

My code might have syntax errors, but you get the gist of it.

+5  A: 

This regex would match on the 0% portion of 50%.
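
To see this for yourself, look at the match object instead of just its truthiness. A minimal sketch (the variable names m, matched_text and matched_span are only for illustration):

```python
import re

reg = re.compile(r'[0-9]%')
m = reg.search("50%")

# search() returns a match object (truthy) or None, not True/False.
matched_text = m.group(0)   # the substring that actually matched
matched_span = m.span()     # (start, end) indices into "50%"
print(matched_text, matched_span)
```

The span starts at index 1, so the leading 5 is simply skipped.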

Thomas Owens
+8  A: 

reg.search() matches the pattern anywhere in the string (so it matches the 0%). If you want the entire string to match, try this:

re.compile(r'^[0-9]%$')

^ - matches the start of the string

$ - matches the end of the string (or just before a newline at the very end of the string)
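
For comparison, here is a small sketch of the unanchored pattern, the anchored one, and re.fullmatch(), which should be available on Python 3.4 and later. The helper name is_single_digit_percent is made up for this example:

```python
import re

unanchored = re.compile(r'[0-9]%')
anchored = re.compile(r'^[0-9]%$')

def is_single_digit_percent(text):
    """Return True only if the whole string is one digit followed by %."""
    # fullmatch() requires the pattern to span the entire string,
    # so no explicit ^ and $ are needed.
    return unanchored.fullmatch(text) is not None

print(unanchored.search("50%") is not None)   # unanchored: finds the 0% inside
print(anchored.search("50%") is not None)     # anchored: no match
print(is_single_digit_percent("5%"))
print(is_single_digit_percent("50%"))
```

One difference worth knowing: the anchored pattern still accepts "5%\n", because $ can match just before a trailing newline, while fullmatch() rejects it.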

Wow, I didn't realize that. Thanks, guys, that was quick.
+1  A: 

If you are searching for single-digit percentages inside a longer string, you could use a negative lookbehind:

In [171]: print(re.search(r'(?<!\d)\d%',"Foo is 5% complete"))
<_sre.SRE_Match object at 0xab302f8>

In [172]: print(re.search(r'(?<!\d)\d%',"Foo is 50% complete"))
None

In [173]: print(re.search(r'(?<!\d)\d%',"5% complete"))
<_sre.SRE_Match object at 0xab301a8>

In [174]: print(re.search(r'(?<!\d)\d%',"50% complete"))
None
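
The same lookbehind also works with findall() when a string holds several percentages. A rough sketch, with a made-up sample string and variable names:

```python
import re

# (?<!\d) rejects any position that has a digit immediately before it.
single_digit_pct = re.compile(r'(?<!\d)\d%')

text = "Disk 5% full, CPU at 50%, memory at 7%"
found = single_digit_pct.findall(text)
print(found)
```

Note that the lookbehind only excludes a preceding digit, so "1.5%" would still produce a match on the 5% part.
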
unutbu
+1  A: 

As gfdunn2 mentioned, search() does a 'rolling match' across the entire string, trying the pattern at each position until it fits. There are a couple of things you can do to control it a bit better, though.

The braces {} below set how many times the preceding element must repeat, which gives you much tighter matching.

>>> import re

# exactly 1 digit and % ({1} is the default, so this still finds the 0% inside 50%)
>>> test = re.compile(r'[0-9]{1}%')
>>> print(test.search("50%").group(0))
0%


# exactly 2 digits and %
>>> test = re.compile(r'[0-9]{2}%')
>>> print(test.search("50%").group(0))
50%


# one or more digits
>>> test = re.compile(r'[0-9]+%')
>>> print(test.search("50%").group(0))
50%

# in the event you want to include floating point percentages
# (note: [0-9.] allows any mix of digits and dots, so "1.2.3%" matches too)
>>> test = re.compile(r'[0-9.]+%')
>>> print(test.search("50.4%").group(0))
50.4%

>>> print(test.search("50.34%").group(0))
50.34%
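
If the loose [0-9.] class is a concern, one possible tightening is to allow at most one decimal point. This is only a sketch; the pattern and the helper name extract_percentages are assumptions for illustration, not part of the examples above:

```python
import re

# digits, optionally followed by one decimal point and more digits, then %
percent = re.compile(r'\d+(?:\.\d+)?%')

def extract_percentages(text):
    """Return every percentage found in text as a float (hypothetical helper)."""
    # m.group(0)[:-1] drops the trailing % before converting.
    return [float(m.group(0)[:-1]) for m in percent.finditer(text)]

print(extract_percentages("up 50.34% then down 4%"))
```

With this pattern a string like "1.2.3%" no longer matches as a whole, which the [0-9.]+% version would accept.
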
pyInTheSky