Hi everyone

While learning a little more about regular expressions, I read in a tutorial that you can use \b to match a word boundary. However, the following snippet in the Python interpreter does not work as expected:

>>> import re
>>> x = 'one two three'
>>> y = re.search("\btwo\b", x)

y should have been a match object if anything was matched, but it is None. Is the \b expression not supported in Python or am I using it wrong?

Thanks for any help.

+5  A: 

Why don't you try this:

import re
word = 'two'
re.compile(r'\b%s\b' % word, re.I)

Output:

>>> word = 'two'
>>> k = re.compile(r'\b%s\b' % word, re.I)
>>> x = 'one two three'
>>> y = k.search( x)
>>> y
<_sre.SRE_Match object at 0x100418850>
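
If the word comes from a variable, as above, it may be worth escaping it before substituting it into the pattern, so that any regex metacharacters in it are matched literally. A minimal sketch, assuming a hypothetical helper named `find_word` (not part of the original answer):

```python
import re

def find_word(word, text):
    # Hypothetical helper for illustration only.
    # re.escape() makes characters such as '.' match literally;
    # the raw strings keep \b as a word boundary, not a backspace.
    pattern = re.compile(r'\b' + re.escape(word) + r'\b', re.IGNORECASE)
    return pattern.search(text)

match = find_word('two', 'one TWO three')
print(match.group())

# Without escaping, the '.' in 'a.b' would also match the 'x' in 'axb'.
escaped = find_word('a.b', 'axb a.b')
unescaped = re.search(r'\ba.b\b', 'axb a.b')
print(escaped.group(), unescaped.group())
```

Note that \b only helps when the word starts and ends with a word character (a letter, digit or underscore).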

I also forgot to mention: you should be using raw strings for regex patterns in your code:

>>> x = 'one two three'
>>> y = re.search(r"\btwo\b", x)
>>> y
<_sre.SRE_Match object at 0x100418a58>
pyfunc
Interesting, thanks for the working example. Do you have any insight as to why the method I chose doesn't work? The two approaches should be the same, except that in yours the pattern is compiled only once.
darren
@darren: See my last example which just improves on what you did. I provided raw strings to search.
pyfunc
Ahh, after your and Bolo's suggestions I see it: it failed because I wasn't using a raw string. Thanks!
darren
@darren: I provided this answer 13 minutes ago :)
pyfunc
@pyfunc +1 for a nice answer, but I've decided to write a comment (and ultimately an answer) to distill the key point here: `"\b"` is not what @darren thought it was.
Bolo
-1: Backwards. The raw strings should be first. The other business of building an re expression with string `%` substitution is a bad tangent, irrelevant to this particular question.
S.Lott
+6  A: 

This will work: re.search(r"\btwo\b", x)

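When you write "\b" in a normal Python string literal, it is a single character: "\x08" (backspace). The regex then looks for a literal backspace instead of a word boundary. Either escape the backslash like this: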

"\\b"

or write a raw string like this:

r"\b"
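
To see the difference for yourself, here is a small sketch that compares the two spellings using the string from the question (the variable names `plain` and `raw` are just for illustration):

```python
import re

text = 'one two three'

# Normal string literal: "\b" is the single backspace character.
plain = "\btwo\b"
# Raw string literal: r"\b" is a backslash followed by 'b',
# which the re module interprets as a word boundary.
raw = r"\btwo\b"

print(len("\b"), len(r"\b"))          # lengths of the two spellings
print("\b" == "\x08")                 # backspace comparison
print("\\b" == r"\b")                 # escaped backslash vs. raw string
print(re.search(plain, text))         # pattern containing backspaces
print(re.search(raw, text).group())   # pattern with word boundaries
```

The escaped form and the raw form produce the same two-character string, so either one reaches the regex engine as a word-boundary escape.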
Bolo