I am stumped trying to create an Emacs regular expression that excludes groups. [^] excludes individual characters in a set, but I want to exclude specific sequences of characters: something like [^(not|this)], so that strings containing "not" or "this" are not matched.

In principle, I could write ([^n][^o][^t]|[^...]), but is there another way that's cleaner?

+3  A: 

This is not easily possible. Regular expressions are designed to match things, and this is all they can do.

First off: [^] does not designate an "excludes group", it designates a negated character class. Character classes do not support grouping in any shape or form. They support single characters (and, for convenience, character ranges). Your attempt [^(not|this)] is 100% equivalent to [^)(|hinots], as far as the regex engine is concerned.
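A quick way to check this equivalence, sketched with Python's re module rather than Emacs (the sample characters and strings are made up for illustration; for this purpose character classes behave the same way):

```python
import re

# Inside [...], the characters ( ) | are plain literals, not grouping or alternation.
attempt = re.compile(r"[^(not|this)]")
flattened = re.compile(r"[^)(|hinots]")

samples = ["n", "o", "t", "h", "i", "s", "(", "|", ")", "a", "z", " "]

for ch in samples:
    # Both classes accept or reject every single character identically.
    assert bool(attempt.fullmatch(ch)) == bool(flattened.fullmatch(ch))

# The class only ever looks at one character at a time.
print(bool(attempt.search("snot")))  # False: every letter is in the excluded set
print(bool(attempt.search("nota")))  # True: "a" is outside the set, even though "not" is present
```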

There are three ways out of this situation:

  1. match (not|this) and exclude any matches with the help of the environment you are in (negate match results)
  2. use negative look-ahead, if supported by your regex engine and feasible in the situation
  3. rewrite the expression so it can match: see a similar question I asked earlier
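
As a rough sketch of option 2, here is a negative look-ahead in Python's re module. Emacs's own regexp syntax has no look-ahead, so this only applies to engines that support it; the helper name is_clean is hypothetical, and single-line input is assumed:

```python
import re

# At every position, the look-ahead refuses to step onto "not" or "this".
no_words = re.compile(r"(?:(?!not|this).)*")

def is_clean(text: str) -> bool:
    """Return True when text contains neither "not" nor "this"."""
    return no_words.fullmatch(text) is not None

for text in ["hello world", "thin nod", "do not", "this", "nnot"]:
    print(f"{text!r}: {is_clean(text)}")
```

Only the first two sample strings are accepted; the other three each contain one of the words.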
Tomalak
+7  A: 

First of all: [^n][^o][^t] is not a solution. This would also exclude words like nil ([^n] does not match), bob ([^o] does not match) or cat ([^t] does not match).
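
The three examples can be checked directly. A small sketch in Python's re module (the extra words not and dig are added here for contrast):

```python
import re

# Each position is tested on its own, so a single "forbidden" letter
# in the right slot is enough to reject an unrelated word.
naive = re.compile(r"[^n][^o][^t]")

for word in ["nil", "bob", "cat", "not", "dig"]:
    print(word, bool(naive.fullmatch(word)))
# nil False  (starts with n)
# bob False  (o in second position)
# cat False  (t in third position)
# not False
# dig True
```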

But it is possible to build a regular expression with basic syntax that matches strings containing neither not nor this:

^([^nt]|n($|[^o]|o($|[^t]))|t($|[^h]|h($|[^i]|i($|[^s]))))*$

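The idea behind this regular expression is to allow any character that cannot start one of the words, and to allow a proper prefix of a word only when it is not continued into the whole word. One caveat: the negated class that breaks off a prefix also consumes the next character, so a word beginning at that character is never examined; nnot and tthis, for example, still match.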
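
To try the pattern on a few inputs, here is a sketch that runs it unchanged through Python's re module (in Emacs syntax the groups and alternations would be written \( \) and \|). The helper name accepted and the sample strings are illustrative only:

```python
import re

# The pattern from the answer, copied verbatim.
pattern = re.compile(
    r"^([^nt]|n($|[^o]|o($|[^t]))|t($|[^h]|h($|[^i]|i($|[^s]))))*$"
)

def accepted(text: str) -> bool:
    """Return True when the whole string is matched by the pattern."""
    return pattern.match(text) is not None

# Prefixes of the words pass, the whole words do not.
for text in ["hello", "no", "thin", "nil", "not", "this", "a not b"]:
    print(f"{text!r}: {accepted(text)}")

# Overlap case: the [^o] after the first "n" consumes the second "n",
# so the "not" that starts there is accepted.
print(accepted("nnot"))  # True
```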

Gumbo
+1, and if I was ever tempted to switch to Emacs, this would be reason enough not to. How can anyone *live* without lookaheads? :P
Alan Moore
+1  A: 

Try M-x flush-lines.
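
For readers outside Emacs: flush-lines deletes the lines that contain a match for a regexp (here that would be not\|this in Emacs syntax), which sidesteps the need for a negated pattern. A rough Python analogue, where the function name flush_lines and the sample text are illustrative and not an Emacs API:

```python
import re

def flush_lines(text: str, regexp: str) -> str:
    """Drop every line containing a match for regexp; keep the rest in order."""
    matcher = re.compile(regexp)
    kept = [line for line in text.splitlines() if not matcher.search(line)]
    return "\n".join(kept)

buffer_text = "keep me\ndo not keep\nthis goes too\nthin nod stays"
print(flush_lines(buffer_text, r"not|this"))
# keep me
# thin nod stays
```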

offby1