I really know very little about regexes.
I'm trying to test a password validation.

Here's the regex that describes it (I didn't write it, and don't know what it means):

private static string passwordField = "[^A-Za-z0-9_.\\-!@#$%^&*()=+;:'\"|~`<>?\\/{}]";  

I've tried a password like "dfgbrk*", and my code, using the above regex, allowed it.
Is this consistent with what the regex defines as acceptable, or is it a problem with my code?

Can you give me an example of a string that validation using the above regex isn't supposed to allow?

Added: Here's how the original code uses this regex (and it works there):

public static bool ValidateTextExp(string regexp, string sText)
{
    if (sText == null)
    {
        Log.WriteWarning("ValidateTextExp got null text to validate against regExp {0} . returning false", regexp);
        return false;
    }

    // Valid only when the pattern finds no match anywhere in the text.
    return !Regex.IsMatch(sText, regexp);
}

It seems I'm doing something wrong..

Thanks.

+5  A: 
Joel Coehoorn
There is also a character class negation.
codaddict
@codaddict - the expression would also negate all alphanumeric characters, so I'm pretty sure it's just a double negative here - `Does the expression match any invalid characters?`
Joel Coehoorn
But the regex starts with negation of the character class `[^A-Za-z`... so how does it accept his "dfgbrk*" at all? I know regex pretty well and I'm confused.
Stephen P
@Stephen - the password is accepted only if the regex _fails_ to match. It's trying to match _invalid_ characters, or anything **not** in that approved list, where a successful match means the password is bad.
Joel Coehoorn
@Stephen P : It appears to be matching because he has spaces in his value.
Andrew Barber
@Joel Coehoorn; btw, I would have +1'd your answer if I had any votes left! Hopefully I'll remember to do so.
Andrew Barber
@Joel : Thanks, now that the code is posted I see that.
Stephen P
+4  A: 

Your regular expression is just an inverted character class and describes just one single character (but that can’t be *). So it depends on how you use that character class.

Gumbo
+8  A: 

Your regex matches a value that contains any single character which is not in that list.

Your test value matches because it has spaces in it, which do not appear to be in your expression.

It matches characters that are not in the list because your character class starts with ^, which negates the class. It matches any value that contains such a character anywhere because you did not specify the beginning or end of the string, or any quantifiers.
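To make that concrete, here is a minimal sketch of the double negative, assuming the pattern from the question is used the way ValidateTextExp uses it. The class name NegatedClassDemo and the helper IsAcceptable are made up for illustration and are not from the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NegatedClassDemo
{
    // Pattern copied from the question: matches any ONE character outside the approved list.
    public const string InvalidChar = "[^A-Za-z0-9_.\\-!@#$%^&*()=+;:'\"|~`<>?\\/{}]";

    // Hypothetical helper: acceptable only when no disallowed character is found.
    public static bool IsAcceptable(string text)
    {
        return text != null && !Regex.IsMatch(text, InvalidChar);
    }

    public static void Main()
    {
        Console.WriteLine(IsAcceptable("dfgbrk*"));  // every character is in the list
        Console.WriteLine(IsAcceptable("dfg brk*")); // contains a space
        Console.WriteLine(IsAcceptable("dfgbrk,"));  // contains a comma
    }
}
```

Under those assumptions, "dfgbrk*" is accepted, while a value with a space or a comma is rejected, since neither of those characters appears in the list.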

The above assumes I'm not missing the importance of any of the characters in the middle of the character soup :)

This answer is also dependent on how you actually use the Regex in code.


If your intention was for that Regex string to represent the only characters that are actually allowed in a password, you would change the regex like so:

string pattern = "^[A-Z0-9...etc...]+$";

The important parts there are:

  • The ^ has been moved from inside the brackets to outside, where it signifies the start of the whole string.
  • The $ has been added to the end, where it signifies the end of the whole string.
  • Those are needed because otherwise, your pattern will match anything that contains the valid values anywhere inside - even if invalid values are also present.
  • Finally, I've added the + quantifier, which means you want to find any one of those valid characters, one or more times. (This regex would not permit a 0-length password.)
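As a rough sketch of the points above, here is the anchored version with the bracket contents filled in from the character list in the question. That list is an assumption about what should be allowed, and the names WhitelistDemo and IsAllowed are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitelistDemo
{
    // Assumption: the allowed characters are exactly those listed in the question.
    // The ^ before the bracket anchors the start, + requires one or more, $ anchors the end.
    public const string AllowedOnly = "^[A-Za-z0-9_.\\-!@#$%^&*()=+;:'\"|~`<>?\\/{}]+$";

    // Hypothetical helper: here a successful match means the password is good.
    public static bool IsAllowed(string text)
    {
        return text != null && Regex.IsMatch(text, AllowedOnly);
    }

    public static void Main()
    {
        Console.WriteLine(IsAllowed("dfgbrk*"));  // only listed characters
        Console.WriteLine(IsAllowed("dfg brk"));  // space is not in the list
        Console.WriteLine(IsAllowed(""));         // + requires at least one character
    }
}
```

Note there is no negation of the result this time. One caveat: in .NET, $ also matches just before a final newline, so \z is the stricter choice if a trailing newline must be rejected.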

If you wanted to permit the ^ character also as part of the password, you would add it back in between the brackets, but just not as the first thing right after the opening bracket [. So for example:

string pattern = "^[A-Z0-9^...etc...]+$";

The ^ has special meaning in different places at different times in Regexes.
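A small sketch of the three roles, using toy patterns that are made up for illustration (they are not the password pattern):

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaretDemo
{
    public const string Anchor = "^abc";    // ^ outside brackets: start of the string
    public const string Negated = "[^abc]"; // ^ first inside brackets: any character except a, b, c
    public const string Literal = "[a^c]";  // ^ later inside brackets: a literal ^ character

    public static void Main()
    {
        Console.WriteLine(Regex.IsMatch("abc", Anchor));   // starts with abc
        Console.WriteLine(Regex.IsMatch("xabc", Anchor));  // abc is not at the start
        Console.WriteLine(Regex.IsMatch("abc", Negated));  // nothing outside a, b, c
        Console.WriteLine(Regex.IsMatch("abd", Negated));  // d is outside a, b, c
        Console.WriteLine(Regex.IsMatch("^", Literal));    // ^ is one of the listed characters
        Console.WriteLine(Regex.IsMatch("b", Literal));    // b is not listed
    }
}
```

The calls alternate between a match and a non-match, in that order, for each of the three patterns.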

Andrew Barber
Yeah, they really should add a space to that expression. Disallowing spaces is bad. I sometimes like to use a trailing space or two in my passwords in case it's ever accidentally printed out somewhere.
Joel Coehoorn
Spaces are great in passwords :)
Andrew Barber
@Andrew : Thanks a lot, I'm quite sure now I'm not using it right in code, so I just want to know what it means, so that I'll know what to expect my code to return (validate or not..)
Oren A
So, what are you trying to actually do with your Regex? Is that list supposed to be a list of all the characters which *are* allowed in a password? So, you only want to allow those characters and nothing else?
Andrew Barber
@Andrew - Thanks a lot. Bottom line - I was trying to use it differently than how it's meant to be used (and your explanation made me realize that). I think it's just that that field has a (very) bad name (should have been something like - passwordUnallowedCharacters)
Oren A
You're very welcome; Regular Expressions are themselves mis-named; they don't often seem 'regular' or 'expressive' :P and then it's a little more complicated by needing also to understand the language API to use an expression, too.
Andrew Barber
+2  A: 

Depends on how you apply it. It describes exactly one character; however, the ^ at the beginning bugs me a little, as it prohibits every other character, so there is probably something terribly fishy there.

Edit: as pointed out in other answers, the reason for your string to match is the space, not the explanation that was replaced by this line.

bitmask