We have been using the following JavaScript regex to find and remove all non-alphanumeric characters apart from - and +:

outputString = outputString.replace(/[^\w|^\+|^-]*/g, "");

However, it doesn't entirely work: the ^ and | characters are not replaced. I can't help but wonder if this has something to do with ^ and | being used as meta-characters in the regex itself.

I've tried switching to use [\W|^+|^-], but that replaces the - and +. I thought that possibly a lookahead assertion may be the answer, but I'm not very sure how to implement them.

Has anyone got an idea how to accomplish this?

+5  A: 

Character classes do not do alternation, so the | is treated literally, and the ^ only negates the class when it is the first character inside the brackets (anywhere else it is treated literally).
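To see this in action, here is a minimal sketch that runs the pattern from the question against a made-up input string (the sample string and variable names are assumptions for illustration only):

```javascript
// The pattern from the question. The leading ^ negates the class once; the
// remaining members are read literally: \w, |, ^, + and -.
const originalPattern = /[^\w|^\+|^-]*/g;

// Hypothetical sample input containing the characters under discussion.
const sampleInput = "a^b|c!d+e-f";

// "!" is stripped, but "^" and "|" survive because they are members of the
// negated class, exactly like "+" and "-".
const originalResult = sampleInput.replace(originalPattern, "");
console.log(originalResult); // a^b|cd+e-f
```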

Use this:

[^\w+-]+

(Also, unless - is the first or last character in the class, it needs to be escaped as \- or it will be read as a range, so be careful if more characters might be added to the exception list.)
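A short sketch of the corrected class, plus the hyphen pitfall that appears if another character is appended after the -. The input strings and the choice of . as the extra character are assumptions for illustration:

```javascript
// Corrected class: keep word characters, "+" and "-"; strip everything else.
const fixedResult = "a^b|c!d+e-f".replace(/[^\w+-]+/g, "");
console.log(fixedResult); // abcd+e-f

// Pitfall: adding "." after an unescaped "-" turns "+-." into a range that
// covers "+", ",", "-" and ".", so the comma is unintentionally kept.
const rangeResult = "1,000.5+2-3".replace(/[^\w+-.]+/g, "");
console.log(rangeResult); // 1,000.5+2-3

// Escaping the hyphen restores the intended meaning: the comma is stripped.
const escapedResult = "1,000.5+2-3".replace(/[^\w+\-.]+/g, "");
console.log(escapedResult); // 1000.5+2-3
```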

You could also do it with a negative lookahead like this:

(?![+-])\W

Note: You do not want a * or + after that \W, since the lookahead only applies to the immediately following character (and the g flag makes the replace repeat until done).
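As a rough illustration of that note, the sketch below compares the lookahead pattern with and without a trailing quantifier (the input string is made up for the example):

```javascript
// One non-word character per match, each one checked by the lookahead.
const lookaheadResult = "a!+b".replace(/(?![+-])\W/g, "");
console.log(lookaheadResult); // a+b

// With a quantifier, the lookahead only guards the first character of the run,
// so "!+" is consumed as a single match and the "+" is lost.
const greedyResult = "a!+b".replace(/(?![+-])\W+/g, "");
console.log(greedyResult); // ab
```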

Also note that \w and \W treat _ as a word character. If you want underscores replaced as well, you can use (?![+-])[\W_] (or use explicit ranges in the first expression).
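The underscore behaviour can be checked with a small sketch. The sample string is an assumption, and the explicit-range class [^A-Za-z0-9+-] is one possible spelling of the "explicit ranges" alternative, limited to ASCII letters and digits:

```javascript
// Hypothetical input with an underscore and one other non-word character.
const underscoreInput = "snake_case+1!";

// \w includes "_", so the underscore is kept here.
const underscoreKept = underscoreInput.replace(/[^\w+-]+/g, "");
console.log(underscoreKept); // snake_case+1

// Adding "_" to the class alongside \W strips the underscore too.
const underscoreStripped = underscoreInput.replace(/(?![+-])[\W_]/g, "");
console.log(underscoreStripped); // snakecase+1

// Explicit ranges instead of \w give the same result for ASCII input.
const explicitRanges = underscoreInput.replace(/[^A-Za-z0-9+-]+/g, "");
console.log(explicitRanges); // snakecase+1
```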

Peter Boughton
Hi Peter, thanks - that's great. [^\w+-]+ worked just great! And thanks for the additional information - very helpful.
Findel_Netring