I'm using a simple regular expression to match on the start of words, using the word boundary matcher, like

/(\b)rice/

will match on "years of rice and salt" but not "maurice ravel" and so on.

However, I'm finding that a ! at the start of the search string seems to negate the word boundary matcher: the string "!!" matches in "some text!!".

Does anyone know why this would be happening? I haven't seen anything saying that ! is a special character.

+2  A: 

There is a word boundary between t and ! because t is a word character and ! is not a word character. There is nothing special about !; you assumed it was a word character, but it is not.
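To see this in action, here is a small sketch using Python's re module (an assumption on my part, since the question doesn't say which engine is in use; the helper name matches is made up for illustration). \b matches only where a word character sits next to a non-word character or the edge of the string.

```python
import re

def matches(pattern, text):
    """Hypothetical helper: True if pattern is found anywhere in text."""
    return re.search(pattern, text) is not None

# "rice" starts a word here, so there is a boundary just before the "r".
print(matches(r"\brice", "years of rice and salt"))

# "u" and "r" are both word characters, so there is no boundary inside "maurice".
print(matches(r"\brice", "maurice ravel"))

# "t" is a word character and "!" is not, so there IS a boundary between them.
print(matches(r"\b!!", "some text!!"))

# A space and "!" are both non-word characters, so there is no boundary before "!!".
print(matches(r"\b!!", "some text !!"))
```

The last two cases are the counter-intuitive ones: with a pattern that begins with a non-word character, \b requires a word character immediately before it.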

Since you are not dealing with "words" the word boundary is not what you want. Instead you could use a lookbehind assertion and check if the previous character is whitespace, start of line, or any other character you wish to allow as your separator. Note that not all regex engines support lookbehind assertions.
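A minimal sketch of the lookbehind idea, again assuming Python's re module; the function name starts_token is hypothetical. Python requires fixed-width lookbehinds, so rather than "preceded by whitespace or start of line" this uses the equivalent negative form (?<!\S), meaning "not preceded by a non-whitespace character".

```python
import re

def starts_token(term, text):
    """Hypothetical helper: True if term occurs at the start of the text
    or directly after whitespace, regardless of what characters term contains."""
    # (?<!\S) succeeds at the start of the string or right after whitespace.
    # re.escape keeps characters in term from being read as regex syntax.
    pattern = r"(?<!\S)" + re.escape(term)
    return re.search(pattern, text) is not None

print(starts_token("!!", "some text!!"))              # preceded by "t": no match
print(starts_token("!!", "some text !!"))             # preceded by a space: match
print(starts_token("rice", "years of rice and salt")) # match
print(starts_token("rice", "maurice ravel"))          # no match
```

If you want to allow other separators (punctuation, say), swap \S for a negated character class listing the characters that should not precede the term.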

Mark Byers
A: 

Is this the second regex?

/\b!!/

If so, then it should match the '!!' in 'some text!!' because there is a word boundary after the second 't' and before the first '!'.

(If no, how are we supposed to guess?)

Jonathan Leffler
Yes, that would be the second regex. Any regex starting with a ! is what I was asking about.
johnnymire