ansaurus

Question

Answer 1

+3 A:

You could use either a negative look-ahead or a negative look-behind:

^(?!.*?bar).*
^(.(?<!bar))*?$

Or use only basic regex syntax:

^(?:[^b]+|b(?:$|[^a]|a(?:$|[^r])))*$

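The first two match any string that does not contain bar. The basic-syntax version comes close but is not exact: it also accepts strings such as bbar, where the character right after one b is itself the b that starts bar.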
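As a quick check, here is a minimal Python sketch that runs the three patterns above against a few sample strings. The sample strings and the names `PATTERNS`, `SAMPLES` and `accepted` are made up for illustration; it assumes Python's `re` module, which supports lookaheads and fixed-width lookbehinds.

```python
import re

# The three patterns from this answer, keyed by a short label.
PATTERNS = {
    "lookahead": r"^(?!.*?bar).*",
    "lookbehind": r"^(.(?<!bar))*?$",
    "basic": r"^(?:[^b]+|b(?:$|[^a]|a(?:$|[^r])))*$",
}

# Illustrative inputs; only "foobar" and "bbar" contain "bar".
SAMPLES = ["foo", "ba", "bra", "foobar", "bbar"]

def accepted(pattern, samples=SAMPLES):
    """Return the samples that the pattern matches in full."""
    regex = re.compile(pattern)
    return [s for s in samples if regex.fullmatch(s)]

for label, pattern in PATTERNS.items():
    print(label, accepted(pattern))
```

On these inputs the two lookaround patterns accept only foo, ba and bra. The basic pattern additionally accepts bbar: after the first b, `[^a]` consumes the second b, so the ar that follows is never checked.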

Gumbo 2009-08-06 17:24:42

What languages don't support (negative) look-behinds and/or (negative) look-aheads in regex?

JAB 2009-08-06 17:29:07

I think the point being made is, looking at your pattern it's not at all clear that all you're doing is rejecting the word "bar".

Bryan Oakley 2009-08-06 17:34:25

@Bryan: And, in fact, it doesn't reject the word "bar". It just rejects "b" when followed by "ar".

JAB 2009-08-06 18:05:28

Answer 2

+1 A:

Unless performance is of utmost concern, it's often easier just to run your results through a second pass, skipping those that match the words you want to negate.

Regular expressions usually mean you're doing scripting or some sort of low-performance task anyway, so find a solution that is easy to read, easy to understand and easy to maintain.
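A minimal Python sketch of the two-pass idea, assuming the first pass is a simple word search and the word to negate is "bar". The function name `find_words_without` and the sample text are hypothetical.

```python
import re

def find_words_without(text, unwanted="bar"):
    """First pass: collect every word. Second pass: drop words containing `unwanted`."""
    candidates = re.findall(r"\w+", text)  # deliberately simple first-pass pattern
    return [word for word in candidates if unwanted not in word]

print(find_words_without("foo barfoo baz rebar qux"))
```

The first-pass pattern stays trivial, and the exclusion is an ordinary substring test that is easy to read and change.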

Bryan Oakley 2009-08-06 17:33:13

Answer 3

+5 A:

If your language supports negative lookbehinds and/or negative lookaheads, you could do something like:

(?<!bar)foo        # negative lookbehind
foo(?!bar)          # negative lookahead
(?<!bar)foo(?!bar)  # both together
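To see what these three patterns actually select, here is a small Python sketch that lists the start index of each match. The sample text and the helper name `starts` are made up for illustration.

```python
import re

TEXT = "barfoo foobar foo"  # "foo" begins at indexes 3, 7 and 14

def starts(pattern, text=TEXT):
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

print(starts(r"(?<!bar)foo"))         # foo not preceded by bar
print(starts(r"foo(?!bar)"))          # foo not followed by bar
print(starts(r"(?<!bar)foo(?!bar)"))  # foo with bar on neither side
```

Note that these patterns match foo depending on its neighbours; they do not by themselves match "everything except bar".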

EDIT: As I just realized, negative lookbehinds and lookaheads won't do it. You need positive ones:

UPDATE: improved regex:

^([^b]|b(?!ar)).*?(?=bar)|(?<=bar).+?(?=bar)|(?<=bar).+

I removed the last part, as it caused more trouble than it solved. As it stands (and ignoring the repetition problem, which is not fixed yet), when nothing matches you will have to check whether the string is just a run of "bar"s or contains no "bar" at all. I am still working on the case where "bar" is repeated with nothing in between. (The beginning-of-the-string problem has now been fixed.)

UPDATE 2: The following regex will do what you want and matches things properly. The only problem is that each match is a single character rather than the whole run of characters between two consecutive "bar"s, which could mean high overhead on very long strings.

b(?!ar)|(?<!b)a|a(?!r)|(?<!ba)r|[^bar]
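One possible way around the one-character-per-match issue, sketched in Python: walk the matches in order and glue together those that are adjacent in the string. The function name `runs_outside_bar` is hypothetical, and this is only one way to do the merging.

```python
import re

# The per-character pattern from UPDATE 2.
PER_CHAR = re.compile(r"b(?!ar)|(?<!b)a|a(?!r)|(?<!ba)r|[^bar]")

def runs_outside_bar(text):
    """Merge adjacent single-character matches into the runs between "bar"s."""
    runs = []
    prev_end = None
    for m in PER_CHAR.finditer(text):
        if m.start() == prev_end:
            runs[-1] += m.group()   # continues the current run
        else:
            runs.append(m.group())  # a "bar" (or the start) came before this one
        prev_end = m.end()
    return runs

print(runs_outside_bar("barbarasdbarbar 1234egb ar bar32 sdfbaraadf"))
```

For very long inputs, collecting the characters of each run in a list and joining once would avoid repeated string concatenation.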

JAB 2009-08-06 17:33:52

Answer 4

+1 A:

A great way to do this is to use negative lookahead:

^(?!.*bar).*$
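If the word to exclude is not fixed, the same pattern can be built at run time. This is a minimal Python sketch; the helper name `lines_without` is hypothetical, and each item is assumed to be a single line with no embedded newlines.

```python
import re

def lines_without(word, lines):
    """Keep only the lines that do not contain `word` anywhere."""
    # re.escape guards against regex metacharacters in the word.
    pattern = re.compile(r"^(?!.*" + re.escape(word) + r").*$")
    return [line for line in lines if pattern.match(line)]

print(lines_without("bar", ["foo", "foobar", "barfoo", "b a r"]))
```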

Chris Pebble 2009-08-06 17:38:49

This says it all (I probably would have started with (?!bar) and built up). I don't see why other people are making it so complicated.

Beta 2009-08-07 14:49:44

Unfortunately, this doesn't work with all languages.

JAB 2009-08-07 18:01:24

Answer 5

A:

Just thought of something else that could be done. It's very different from my first answer, as it doesn't use regular expressions, so I decided to make a second answer post.

Use your language's equivalent of split() on the string, passing the word to negate as the separator. An example using Python:

>>> text = 'barbarasdbarbar 1234egb ar bar32 sdfbaraadf'
>>> text.split('bar')
['', '', 'asd', '', ' 1234egb ar ', '32 sdf', 'aadf']

The nice thing about doing it this way, in Python at least (I don't remember whether Visual Basic or Java behave the same), is that it tells you indirectly when "bar" was repeated in the string, because the empty strings between consecutive "bar"s are included in the list of results. (The empty string at the very beginning is there because the string starts with "bar".) If you don't want them, you can simply remove the empty strings from the list.
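Removing the empty strings can be done with a list comprehension, for example (the variable names are arbitrary):

```python
text = 'barbarasdbarbar 1234egb ar bar32 sdfbaraadf'
pieces = text.split('bar')

# Empty strings are falsy, so this keeps only the non-empty pieces.
non_empty = [piece for piece in pieces if piece]
print(non_empty)
```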

JAB 2009-08-07 19:58:34

Answer 6

A:

I came across this forum thread while trying to identify a regex for the following English statement: "Given an input string, match everything unless this input string is exactly 'bar'; for example I want to match 'barrier' and 'disbar' as well as 'foo'."

Here's the regex I came up with:

^(bar.+|(?!bar).*)$

My English translation of the regex is "match the string if it starts with 'bar' and it has at least one other character, or if the string does not start with 'bar'."
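A short Python check of this regex on the examples from the statement. The `SHORTER` pattern is not from this answer; it is an assumed alternative that puts the whole "exactly bar" test into one lookahead, included only for comparison.

```python
import re

ORIGINAL = re.compile(r"^(bar.+|(?!bar).*)$")
SHORTER = re.compile(r"^(?!bar$).*$")  # assumed alternative, not from the answer

SAMPLES = ["bar", "barrier", "disbar", "foo", ""]

def matched(regex, samples=SAMPLES):
    """Return the samples that the regex matches."""
    return [s for s in samples if regex.match(s)]

print(matched(ORIGINAL))
print(matched(SHORTER))
```

On these samples both patterns reject only the exact string bar; note that the empty string is accepted too.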

ReQuest Programmer 2010-09-10 20:44:58

@ReReqest - you will have a much better chance of getting this answered if you post it as a separate question. You can link back to this question from there if you want. As for the substance of the question: it looks OK, but I'm no regex guru.

DroidIn.net 2010-09-11 17:47:16


How to negate specific word in regex?
