Hi, I have a regex like the following:

.{0,1000}(?!(xa7|para(graf))$)

I am using Java. I expected the following text to fail to match:

blaparagraf

because paragraf appears at the end.

+3  A: 

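That's because .{0,1000} matches the entire subject first. The lookahead is then tested at the very end of the input, where nothing remains: that position is not followed by xa7 or paragraf, only by the end of the string ($), so the negative lookahead succeeds.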

You want negative lookbehind:

.{0,1000}(?<!xa7|paragraf)$
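To see the difference concretely, here is a minimal Java sketch, assuming the whole input is tested with `Matcher.matches()` and that xa7 is the literal three-character text as written in the question. The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not from the question or the answer.
class EndsWithCheck {
    // Pattern from the question: the lookahead is only tested after
    // .{0,1000} has already consumed the input.
    static final Pattern ORIGINAL =
            Pattern.compile(".{0,1000}(?!(xa7|para(graf))$)");

    // Lookbehind version: at the end of the input, look back at what was consumed.
    static final Pattern LOOKBEHIND =
            Pattern.compile(".{0,1000}(?<!xa7|paragraf)$");

    static boolean matchesOriginal(String s) {
        return ORIGINAL.matcher(s).matches();
    }

    static boolean matchesLookbehind(String s) {
        return LOOKBEHIND.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesOriginal("blaparagraf"));   // true (not rejected)
        System.out.println(matchesLookbehind("blaparagraf")); // false (rejected)
        System.out.println(matchesLookbehind("blapara"));     // true
    }
}
```

With the lookbehind, backtracking does not help the match: any shorter run of .{0,1000} stops before the end of the input, so $ fails there.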
Artefacto
OK, thanks. As a curiosity, would it be possible to change it to use negative lookahead?
@bryan-rasmussen I initially tried to come up with something using negative lookahead, without success.
Artefacto
+1  A: 

It is a common mistake to misplace assertions. If you want to use lookahead, the pattern is something like this:

^(?!.*paragraph$).*$

This matches (as seen on rubular.com):

something something para
paragraph something something

But doesn't match:

something paragraph

So the key difference here is that we start looking ahead at the beginning of the string, before we match .* (or .{0,1000} in your case). Of course, what we're looking for isn't simply paragraph$, but rather .*paragraph$.
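In Java, the same check might look like the sketch below. It assumes single-line input tested with `Matcher.matches()`; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class LookaheadCheck {
    // The negative lookahead sits at the start, so it scans the whole
    // string for a trailing "paragraph" before .* consumes anything.
    static final Pattern NOT_ENDING_IN_PARAGRAPH =
            Pattern.compile("^(?!.*paragraph$).*$");

    static boolean accepts(String s) {
        return NOT_ENDING_IN_PARAGRAPH.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("something something para"));      // true
        System.out.println(accepts("paragraph something something")); // true
        System.out.println(accepts("something paragraph"));           // false
    }
}
```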

That said, to check that a string doesn't end with something of finite length, lookbehind, when supported, is the most natural solution.

^.*$(?<!paragraph)
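A quick way to try this form in Java is sketched below, with a plain `String.endsWith` call as a cross-check on the same samples. The class and method names are hypothetical, and the comparison is only meant for single-line inputs like these.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class TrailingLookbehindCheck {
    // After .* and $ have reached the end, look back for "paragraph".
    static final Pattern NOT_ENDING_IN_PARAGRAPH =
            Pattern.compile("^.*$(?<!paragraph)");

    static boolean accepts(String s) {
        return NOT_ENDING_IN_PARAGRAPH.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "something something para",
            "paragraph something something",
            "something paragraph"
        };
        for (String s : samples) {
            // Regex result next to the non-regex equivalent for comparison.
            System.out.println(accepts(s) + " " + !s.endsWith("paragraph"));
        }
    }
}
```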
polygenelubricants
+1 nice, that didn't occur to me.
Artefacto