



I have the following regex:


So allow A-Z, a-z, 0-9, and these special chars: '.,&@:?!()$#/\

I want to NOT match if the following sequence of chars is encountered anywhere in the string, in this order:

&#

When I run this regex with just "&#" as input, it does not match my pattern and I get an error, which is great. When I run the regex with '.,&@:?!()$#/\ABC123, it does match my pattern, with no errors.

However when I run it with:


It does not error either. I'm doing something wrong with the check for the &# sequence.

Can someone tell me what I've done wrong? I'm not great with these things.


+1  A: 

I would actually do it in two parts:

  1. Check your allowed character set. To do this I would look for characters that are not allowed, and return false if there's a match. That means I have a nice simple expression:
  2. Check your banned substring. And since it is just a substring, I probably wouldn't even use a regex for that part.

You didn't mention your language, but if in C#:

// requires: using System.Text.RegularExpressions;
bool IsValid(string input)
{
    return !(input.Contains("&#")
             || Regex.IsMatch(input, @"[^A-Za-z0-9'.,&@:?!()$#/\\]"));
}
Joel Coehoorn
Yeah, I agree, and that's how I'd normally do it, but see below.
John Batdorf

I'm doing it in C#, but the problem is that I'm working against an SDK that takes the regex as a value I pass in from a config file, so I don't have the code-behind love. :) So I have to try to do it in one shot. I thought this was possible?

John Batdorf

Assuming Perl-compatible regular expressions:

To not match on the string '&#':


Although you don't need the parentheses, because you are matching the entire string.



Note that the last \ is escaped (doubled); SO automatically turns \\ into \ if it is not in backticks.

+3  A: 

Borrowing a technique for matching quoted strings, remove & from your character class, add an alternative for & not followed by #, and allow the string to optionally end with &:
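
As a rough illustration of this idea, here is a minimal Python sketch (the pattern uses only syntax that .NET also accepts). It is not the exact expression from this answer. The allowed set is assumed from the question's prose, and two details are my own assumptions: the character after an ampersand is restricted to the allowed set rather than "anything but #", and runs of ampersands are accepted. The names `SAFE`, `PATTERN` and `is_valid_alternation` are hypothetical.

```python
import re

# Allowed characters from the question, minus "&" and "#",
# which need special handling. (Assumed from the question's prose.)
SAFE = r"A-Za-z0-9'.,@:?!()$/\\"

# Each step is either one allowed character other than "&" (this includes "#"),
# or a run of "&" followed by an allowed character that is not "#".
# The final "&*" lets the string end with one or more ampersands.
PATTERN = re.compile(rf"(?:[{SAFE}#]|&+[{SAFE}])*&*")

def is_valid_alternation(text: str) -> bool:
    """True if text uses only allowed characters and never contains '&#'."""
    return PATTERN.fullmatch(text) is not None
```

Because the group is repeated with `*`, this sketch also accepts the empty string; requiring at least one character would need an extra check.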


Ben Blank
BAM! You're right on the money. Thank you so much.
John Batdorf

I'd recommend using two regular expressions in a conditional:

if (string has sequence "&#")
     return false
else
     return (string matches sequence "A-Za-z0-9-'.,&@:?!()$#/\")
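
The conditional above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the allowed set is taken from the question's prose (without the extra hyphen), and `ALLOWED` and `is_valid_two_step` are hypothetical names.

```python
import re

# Assumption: the allowed characters are the ones listed in the question.
# The doubled backslash inside the class stands for one literal backslash.
ALLOWED = re.compile(r"[A-Za-z0-9'.,&@:?!()$#/\\]+")

def is_valid_two_step(text: str) -> bool:
    # Step 1: a plain substring test rejects the banned sequence.
    if "&#" in text:
        return False
    # Step 2: every character of the string must be in the allowed set.
    return ALLOWED.fullmatch(text) is not None
```

Keeping the substring test out of the regex makes each step easy to read, at the cost of needing code around the pattern.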

I believe your second "main" regex of


has several errors:

  • It will test only one character against your set
  • The '\' character in regular expressions is an escape token: it indicates that the next character has a special meaning or is to be taken literally (ex. '\n' is the line feed character). The character sequence '\]' is actually causing your bracketed list not to be terminated.

You may be better off using


Note that the backslash character is represented by a doubled backslash.

The "+" character indicates that at least one character being tested has to match the regex; if it is fine to pass a zero-length string, replace the '+' with a '*'.
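
A small Python sketch of the difference between the two quantifiers, assuming the allowed set listed in the question (`CLASS`, `one_or_more` and `zero_or_more` are hypothetical names):

```python
import re

# Assumption: the character class mirrors the set listed in the question.
CLASS = r"[A-Za-z0-9'.,&@:?!()$#/\\]"

one_or_more = re.compile(CLASS + "+")   # requires at least one character
zero_or_more = re.compile(CLASS + "*")  # also accepts the empty string

def accepts(pattern: re.Pattern, text: str) -> bool:
    """True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None
```

With these definitions, `accepts(one_or_more, "")` is false while `accepts(zero_or_more, "")` is true; both accept a non-empty string made only of allowed characters.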

Perry Pederson
The errors you pointed out weren't entirely the OP's fault. The forum software ate a couple of asterisks and a backslash. That's what happens when you try to talk about regexes without code-ifying them.
Alan Moore
By the way, if John really had accidentally escaped the closing square bracket, the regex wouldn't even have compiled.
Alan Moore

Just FYI, although Ben Blank's regex works, it's more complicated than it needs to be. I would do it like this:


Because I used a negative lookahead instead of a negated character class, the regex doesn't need any extra help to match an ampersand at the end of the string.
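
A minimal Python sketch of the negative-lookahead idea, under assumptions: it is not necessarily the exact expression from this answer, the allowed set is taken from the question's prose, and `ALLOWED_EXCEPT_AMP` and `is_valid_lookahead` are hypothetical names. The lookahead syntax `(?!...)` is the same in .NET.

```python
import re

# Assumption: allowed characters are those listed in the question,
# with "&" left out of the class so it can be handled separately.
ALLOWED_EXCEPT_AMP = r"A-Za-z0-9'.,@:?!()$#/\\"

# "&(?!#)" matches an ampersand only when the next character is not "#".
# The lookahead consumes nothing, so an ampersand at the very end of the
# string matches without any special case.
PATTERN = re.compile(rf"(?:[{ALLOWED_EXCEPT_AMP}]|&(?!#))+")

def is_valid_lookahead(text: str) -> bool:
    """True if text is non-empty, uses only allowed characters, and has no '&#'."""
    return PATTERN.fullmatch(text) is not None
```

Since the lookahead only inspects the next character, whatever follows the ampersand still has to be matched by the group itself, so disallowed characters after "&" are rejected as well.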

Alan Moore