views:

42

answers:

3

How would I write a regular expression that matches the character '<' when it is not followed by 'a', 'em', or 'strong'?

So <hello and <string would match, but <strong wouldn't.

UPDATE: By the way, the language I'm using is JavaScript.

+3  A: 

Try this:

<(?!a|em|strong)
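
A quick way to check this pattern against the examples from the question is the minimal JavaScript sketch below. The variable names are made up for illustration; only the pattern itself comes from this answer.

```javascript
// Pattern from the answer: '<' not followed by a, em, or strong.
const notAllowedTag = /<(?!a|em|strong)/;

// Sample inputs taken from the question, plus the excluded tag names.
const samples = ["<hello", "<string", "<strong", "<em", "<a"];

// Pair each sample with whether the pattern matches it.
const results = samples.map((s) => [s, notAllowedTag.test(s)]);
console.log(results);
// <hello and <string match; <strong, <em and <a do not.
```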
Andrew Hare
+1 I think that does it for Perl-compatible regexp syntax. (It might be different for other syntaxes.)
David Zaslavsky
Tried it and it works well, thanks.
Kyle
A: 

Use a negative lookahead. The simplest form for this problem is:

<(?!a|em|strong)

The one issue with that is that it also refuses to match <applet>, because the lookahead only sees the leading a. The way to deal with that is \b, a zero-width assertion (meaning it consumes none of the input) that matches at a transition from a word character to a non-word character or the reverse. Word characters are [0-9a-zA-Z_]. So:

<(?!(a|em|strong)\b)
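
To see what \b changes, here is a small JavaScript sketch comparing the lookahead with and without the word boundary, both written in the `(?!...)` form. The names `loose` and `strict` and the sample tags are assumptions chosen for illustration.

```javascript
// Without \b: any tag name that merely starts with a, em, or strong is skipped.
const loose = /<(?!a|em|strong)/;
// With \b: only the exact names a, em, and strong are skipped.
const strict = /<(?!(a|em|strong)\b)/;

const inputs = ["<applet>", "<embed>", "<a href>", "<em>", "<strong>"];

// Record whether each pattern matches each input.
const table = inputs.map((s) => ({
  input: s,
  loose: loose.test(s),
  strict: strict.test(s),
}));
console.log(table);
// <applet> and <embed>: loose is false, strict is true.
// <a href>, <em>, <strong>: both are false.
```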
cletus
I think you meant `<(?!(a|em|strong)\b)`.
Andrew Hare
+1  A: 

If your regex engine supports it, use a negative lookahead assertion: it looks ahead in the string and succeeds only if the enclosed pattern would not match at that position, without consuming any input. Thus, you want /<(?!(?:a|em|strong)\b)/: match a <, then succeed only if what follows is not a, em, or strong ending at a word boundary, \b.
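
Because the lookahead consumes no input, each match is just the single < character. The JavaScript sketch below illustrates that. The sample text and the idea of replacing the matched < with `&lt;` are assumptions for demonstration, not something the question asked for.

```javascript
// Global flag so that every occurrence is found.
const re = /<(?!(?:a|em|strong)\b)/g;
const text = "<em> <hello> <strong> <string>";

// Each match is only the '<' itself; the tag name is inspected but not consumed.
const matches = [...text.matchAll(re)].map((m) => [m.index, m[0]]);
console.log(matches); // [[5, "<"], [22, "<"]]

// Replacing a match therefore touches only the '<' character.
const escaped = text.replace(re, "&lt;");
console.log(escaped); // "<em> &lt;hello> <strong> &lt;string>"
```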

Antal S-Z
Interesting point about the word break, although the OP didn't say whether that was desired
David Zaslavsky