I know that I can negate a group of chars as in [^bar], but I need a regular expression where the negation applies to a specific word. So in my example, how do I negate the actual word "bar" and not "any chars in bar"?
You could either use a negative look-ahead or look-behind:
^(?!.*?bar).*
^(.(?<!bar))*?$
Or use just basics:
^(?:[^b]|b(?:b|ab)*(?:[^ab]|a[^br]))*(?:b(?:b|ab)*a?)?$
These all match anything that does not contain "bar".
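A quick way to sanity-check the two lookaround patterns is to run them over a handful of strings. This is only a sketch using Python's re module; the sample strings and the helper name accepted are made up for illustration.

```python
import re

# The two lookaround patterns from the answer above.
LOOKAHEAD = re.compile(r'^(?!.*?bar).*')
LOOKBEHIND = re.compile(r'^(.(?<!bar))*?$')

# Hypothetical sample inputs: some contain "bar", some do not.
samples = ['foo', 'b ar', 'ba', 'bar', 'foobar', 'barfoo', 'bbar']

def accepted(pattern, strings):
    """Return the strings that the anchored pattern matches."""
    return [s for s in strings if pattern.search(s)]

lookahead_hits = accepted(LOOKAHEAD, samples)
lookbehind_hits = accepted(LOOKBEHIND, samples)
print(lookahead_hits)
print(lookbehind_hits)
```

Both lists should contain only the strings without "bar" in them.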
Unless performance is of utmost concern, it's often easier just to run your results through a second pass, skipping those that match the words you want to negate.
Regular expressions usually mean you're doing scripting or some sort of low-performance task anyway, so find a solution that is easy to read, easy to understand and easy to maintain.
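As a rough illustration of the two-pass idea in Python (the input text, the first-pass pattern and the variable names are all assumptions for the example):

```python
import re

# Hypothetical input and first-pass pattern: grab every word.
text = 'foo bar foobar baz barrier qux'
candidates = re.findall(r'\w+', text)

# Second pass: skip any candidate that contains a word to negate.
negated = ('bar',)
kept = [w for w in candidates if not any(n in w for n in negated)]
print(kept)
```

The first pass stays simple, and the list of negated words can grow without touching the regex.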
If your language supports negative lookbehinds and/or negative lookaheads, you could do something like:
(?<!bar)foo # negative lookbehind
foo(?!bar) # negative lookahead
(?<!bar)foo(?!bar) # both together
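To see what each of those three patterns actually picks out, here is a small Python sketch that reports where "foo" is matched. The sample string and the helper name starts are assumptions for the example.

```python
import re

text = 'barfoo foo foobar'  # hypothetical sample

def starts(pattern):
    """Start offsets of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

not_after_bar = starts(r'(?<!bar)foo')          # skips the "foo" in "barfoo"
not_before_bar = starts(r'foo(?!bar)')          # skips the "foo" in "foobar"
neither = starts(r'(?<!bar)foo(?!bar)')         # only the standalone "foo"
print(not_after_bar, not_before_bar, neither)
```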
EDIT: As I just realized, negative lookbehinds and lookaheads won't do it. You need positive ones:
UPDATE: improved regex:
^([^b]|b(?!ar)).*?(?=bar)|(?<=bar).+?(?=bar)|(?<=bar).+
Removed the last bit, as it caused more trouble than it solved. As the regex stands now (and ignoring the repetition problem, which I haven't fixed yet), when nothing matches you'll have to check whether the string is just a bunch of "bar"s or contains no "bar" at all. Still working on the case where "bar" is repeated without anything in between. (The beginning-of-the-string problem has now been fixed.)
UPDATE 2: The following regex will do what you want, matching things properly. The only problem is that it matches individual characters (i.e. each match is a single character rather than all the characters between two consecutive "bar"s), which can mean high overhead if you're working with very long strings.
b(?!ar)|(?<!b)a|a(?!r)|(?<!ba)r|[^bar]
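Because every match is a single character, the matches have to be stitched back together afterwards. A minimal Python sketch, assuming a made-up sample string:

```python
import re

# Per-character pattern from above: matches any character that is
# not part of an occurrence of "bar".
NOT_IN_BAR = re.compile(r'b(?!ar)|(?<!b)a|a(?!r)|(?<!ba)r|[^bar]')

text = 'barbarasdbarbar 1234egb ar bar32 sdfbaraadf'  # sample input

chars = NOT_IN_BAR.findall(text)   # list of one-character strings
joined = ''.join(chars)            # everything except the "bar"s
print(joined)
```

Joining the matches gives the same result as deleting every "bar" from the string, which is one way to check the pattern.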
A great way to do this is to use negative lookahead:
^(?!.*bar).*$
Just thought of something else that could be done. It's very different from my first answer, as it doesn't use regular expressions, so I decided to make a second answer post.
Use your language of choice's split() method equivalent on the string, with the word to negate as the argument for what to split on. An example using Python:
>>> text = 'barbarasdbarbar 1234egb ar bar32 sdfbaraadf'
>>> text.split('bar')
['', '', 'asd', '', ' 1234egb ar ', '32 sdf', 'aadf']
The nice thing about doing it this way, in Python at least (I don't remember if the functionality would be the same in, say, Visual Basic or Java), is that it lets you know indirectly when "bar" was repeated in the string due to the fact that the empty strings between "bar"s are included in the list of results (though the empty string at the beginning is due to there being a "bar" at the beginning of the string). If you don't want that, you can simply remove the empty strings from the list.
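Dropping the empty strings is a one-liner in Python. A small sketch using the same sample text (the variable names are just for the example):

```python
text = 'barbarasdbarbar 1234egb ar bar32 sdfbaraadf'

parts = text.split('bar')

# Keep only the non-empty pieces; empty strings are falsy in Python.
non_empty = [p for p in parts if p]
print(non_empty)
```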
I came across this forum thread while trying to identify a regex for the following English statement: "Given an input string, match everything unless this input string is exactly 'bar'; for example, I want to match 'barrier' and 'disbar' as well as 'foo'."
Here's the regex I came up with:
^(bar.+|(?!bar).*)$
My English translation of the regex is "match the string if it starts with 'bar' and has at least one other character, or if the string does not start with 'bar'".
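A quick Python check of that regex against the examples from the statement (the results dictionary is only there for the illustration):

```python
import re

# Matches every string except the exact string "bar".
NOT_EXACTLY_BAR = re.compile(r'^(bar.+|(?!bar).*)$')

inputs = ['bar', 'barrier', 'disbar', 'foo']

# Map each input to whether the regex matches it.
results = {s: bool(NOT_EXACTLY_BAR.match(s)) for s in inputs}
print(results)
```

Only the exact string 'bar' should come back as not matching.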