views:

73

answers:

2

I need to match strings that don't contain a keyword (beta2) at an arbitrary position.

Consider:

var aStr    = [
                '/beta1/foo',
                'beta1/foo',
                '/beta1_xyz/foo',
                'blahBlah/beta1/foo',
                'beta1',

                '/beta2/foo',
                'beta2/foo',
                '/beta2_xyz/foo',
                'blahBlah/beta2/foo',
                'beta2',

                '/beat2/foo',
                'beat2/foo',
                '/beat2_xyz/foo',
                'blahBlah/beat2/foo',
                'beat2'
            ];

function bTestForMatch (Str)
{
    return /.*\b(?!beta2).*/i.test (Str);
}

for (var J=0, iLen=aStr.length;  J < iLen;  J++)
{
    console.log (J, bTestForMatch (aStr[J]), aStr[J]);
}

We need a regex that matches all strings that exclude beta2. beta2 will always start at a word boundary, but not necessarily end at one. It can be at a variety of positions in the string.

The desired results would be:

 0    true    /beta1/foo
 1    true    beta1/foo
 2    true    /beta1_xyz/foo
 3    true    blahBlah/beta1/foo
 4    true    beta1
 5    false   /beta2/foo
 6    false   beta2/foo
 7    false   /beta2_xyz/foo
 8    false   blahBlah/beta2/foo
 9    false   beta2
10    true    /beat2/foo
11    true    beat2/foo
12    true    /beat2_xyz/foo
13    true    blahBlah/beat2/foo
14    true    beat2

The regex is for a third-party analysis tool that takes JavaScript regular expressions to filter sub-results. The tool accepts only a single line of regex. There is no API, and we don't have access to its source code.

Is there a JavaScript regex that will filter the second beta results (beta2) from this analysis run?
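
For context, here is a small sketch of why the attempt in `bTestForMatch` cannot filter anything. `\b(?!beta2)` only needs one word boundary somewhere in the string that is not followed by `beta2`, and the boundary at the end of the last word always qualifies. The names `attemptSamples` and `attempt` are illustrative, and the array is just a subset of the strings above.

```javascript
// Sketch: the original attempt matches every sample, including the beta2 ones.
// `attemptSamples` and `attempt` are illustrative names, not part of the tool.
const attemptSamples = ['/beta1/foo', '/beta2/foo', 'beta2', 'beat2'];
const attempt = /.*\b(?!beta2).*/i;

// The lookahead is checked at a single position; after the greedy .* has
// consumed the whole string, the final word boundary is never followed by beta2.
const attemptResults = attemptSamples.map((s) => attempt.test(s));
console.log(attemptResults);
```

The negative lookahead has to be anchored at the start of the string and scan forward for the check to cover every position.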

+2  A: 

Try

/^(?!.*beta2).*$/
KennyTM
That looks good, so far! Give me a day to run it by the team and make sure we didn't overlook something.
Brock Adams
Perhaps you'd want to surround the keyword with `\b` anchors to avoid false positives on words like `beta20` or `alphabeta2`.
Tim Pietzcker
@Tim: Maybe a `\b` in the front, but because of Case 7 it cannot appear in the back (something more complicated, like `(?![^a-z])` is needed).
KennyTM
@KennyTM: Right, I hadn't noticed Case 7. I'd suggest `(?![^\W_])` then. But the regex is getting really ugly: `/^(?!.*(?<![^\W_])beta2(?![^\W_])).*$/`
Tim Pietzcker
JavaScript doesn't support lookbehinds (they were only added in ES2018, so older engines reject them). Anyway, I think you're making this more complicated than it needs to be. I would anchor the front end with `\b`, though I'm not sure it's really needed. As for the back end, I wouldn't bother trying to anchor it. The only information we have about it is a negative: that it won't necessarily fall on a word boundary.
Alan Moore
As stated in the question, `beta2` always *starts* on a word boundary. It also can be mixed case. So we tweaked this solution slightly to: `/^(?!.*\bbeta2).*$/i`. Thanks, everybody.
Brock Adams
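
As a rough check of the tweaked pattern from the comments, the sketch below runs `/^(?!.*\bbeta2).*$/i` over the question's sample strings. The names `sampleStrings`, `excludeBeta2`, and `filterResults` are illustrative only; the tool itself just receives the regex.

```javascript
// Sample strings from the question: indexes 5-9 contain beta2.
const sampleStrings = [
    '/beta1/foo', 'beta1/foo', '/beta1_xyz/foo', 'blahBlah/beta1/foo', 'beta1',
    '/beta2/foo', 'beta2/foo', '/beta2_xyz/foo', 'blahBlah/beta2/foo', 'beta2',
    '/beat2/foo', 'beat2/foo', '/beat2_xyz/foo', 'blahBlah/beat2/foo', 'beat2'
];

// ^(?!.*\bbeta2) : from the start, refuse any string in which beta2 begins
//                  at a word boundary anywhere later on the line.
// .*$            : otherwise accept the whole line.
const excludeBeta2 = /^(?!.*\bbeta2).*$/i;

const filterResults = sampleStrings.map((s) => excludeBeta2.test(s));
filterResults.forEach((r, j) => console.log(j, r, sampleStrings[j]));

// Edge cases raised in the comments: only the leading \b is enforced.
console.log(excludeBeta2.test('alphabeta2')); // true: beta2 does not start a word
console.log(excludeBeta2.test('beta20'));     // false: no boundary required after beta2
```

Leaving the trailing end unanchored is what lets case 7 (`/beta2_xyz/foo`) be rejected, at the cost of also rejecting strings such as `beta20`.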
A: 

Would this be considered cheating?

return !/beta2/i.test (Str);
paque
That doesn't work, as the `!` operator is outside the regex. We'd tried that; the tool errors out on anything but a pure regex.
Brock Adams
=) It works because the `!` is outside the regex (`bTestForMatch` returns what you want it to). However, KennyTM's regexp seems to be what you asked for.
paque
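
To illustrate the point of this exchange, the sketch below compares negating outside the regex with the pure-regex lookahead form. It assumes single-line input (without the `s` flag, `.` does not match line terminators), and `insideForm`, `outsideForm`, and `checks` are hypothetical names.

```javascript
// Pure-regex form: the negation lives inside the pattern as a lookahead.
const insideForm = /^(?!.*\bbeta2).*$/i;

// Code form: the negation is applied by JavaScript, outside the pattern.
const outsideForm = (s) => !/\bbeta2/i.test(s);

// A few single-line samples, some taken from the question.
const checks = ['/beta1/foo', 'blahBlah/beta2/foo', 'BETA2', 'beat2_xyz'];

// Both forms should give the same verdict for each sample.
const formsAgree = checks.every((s) => insideForm.test(s) === outsideForm(s));
console.log(formsAgree);
```

When the negation can be written in code, the second form is simpler. The lookahead form is only needed when the consumer accepts nothing but a regex, as with the tool described in the question.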