Using a regex in Python, how can I verify that a user's password is:

  • At least 8 characters
  • Restricted to the following characters, though none of them is specifically required:
    • uppercase letters: A-Z
    • lowercase letters: a-z
    • numbers: 0-9
    • any of the special characters: @#$%^&+=

Note, all the letter/number/special chars are optional. I only want to verify that the password is at least 8 chars in length and is restricted to a letter/number/special char. It's up to the user to pick a stronger / weaker password if they so choose. So far what I have is:

import re
pattern = "^.*(?=.{8,})(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=]).*$"
password = raw_input("Enter string to test: ")
result = re.findall(pattern, password)
if (result):
    print "Valid password"
else:
    print "Password not valid"
+5  A: 
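import re
password = raw_input("Enter string to test: ")
# \Z anchors the end of the string; re.match on its own only anchors the start
if re.match(r'[A-Za-z0-9@#$%^&+=]{8,}\Z', password):
    print "Valid password"
else:
    print "Password not valid"

The {8,} means "at least 8". The .match function anchors the pattern at the start of the string only, so the trailing \Z is needed to require that the entire string matches the regex, not just a prefix of it.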
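
To see how re.match treats trailing characters, here is a small sketch written for Python 3 (an assumption; the thread itself uses Python 2), where re.fullmatch is available from 3.4 onward. The names ALLOWED and is_valid are only illustrative.

```python
import re

ALLOWED = r'[A-Za-z0-9@#$%^&+=]{8,}'

def is_valid(password):
    """Return True if the whole password is 8+ allowed characters."""
    # re.fullmatch requires the pattern to cover the entire string
    return re.fullmatch(ALLOWED, password) is not None

# re.match only anchors at the start, so a valid 8-character prefix is enough
print(bool(re.match(ALLOWED, "abcdefgh!!")))   # True
print(is_valid("abcdefgh!!"))                  # False
print(is_valid("abcd1234"))                    # True
print(is_valid("short"))                       # False
```

On Python 2, appending \Z to the pattern and calling re.match gives the same full-string behavior.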

Amber
Ok, I have to admit that this is readable, and if it is not meant to change, then it is a good solution. A comment like "regex will not always be appropriate if business logic changes" might help, but then again - maintainers aren't dumb - you have to make that assumption, right?
Hamish Grubijan
@Hamish- I don't. Even if the maintainer is myself. Someone in the future will be assigned a change to that code and he will consider updating the regex as the "only" way to proceed for some period of time until he either comes up with some Rube Goldberg regex that works or he notices he's taking 3 days to make a quick change and finally ditches the regex structure.
jmucchiello
Well, I recently joined a company and fixed bugs for a few months. I've had a number of `WTF?` moments, which made me ask others questions and learn stuff. The usual explanation is: we wrote this 4/6/8/10 years ago, and this is how it was being done back then. I feel like since I passed the test of learning the messy system through fixing bugs, others should too. If a junior coder is having an easy time, then [s]he is either too smart for the group, or there is no learning. If you always work with hygienic code, then your "immune system" becomes whacked and/or starts to attack friendly code.
Hamish Grubijan
+2  A: 

Well, here is my non-regex solution (still needs some work):

import string
# Allowed characters: digits, ASCII letters, and the special characters listed here
hardCodedSetOfAllowedCharacters = set(string.digits + string.ascii_letters + '~!@#$%^&*()_+')
def getPassword():
    password = raw_input("Enter string to test: ").strip()
    if (len(password) < 8):
        raise AppropriateError("password is too short")
    if any(passChar not in hardCodedSetOfAllowedCharacters for passChar in password):
        raise AppropriateError("password contains illegal characters")
    return password
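
A possible way to make this testable is to separate the validation from the prompt. The following is a minimal Python 3 sketch under a few assumptions: the allowed set follows the question's list of special characters rather than the one above, ValueError stands in for AppropriateError, and the names ALLOWED_CHARS, MIN_LENGTH, and validate_password are made up for illustration.

```python
import string

# Assumption: allowed characters mirror the question's list (@#$%^&+=)
ALLOWED_CHARS = set(string.ascii_letters + string.digits + "@#$%^&+=")
MIN_LENGTH = 8

def validate_password(password):
    """Return the password, or raise ValueError describing the first problem."""
    if len(password) < MIN_LENGTH:
        raise ValueError("password is too short")
    # Set difference leaves only the characters that are not allowed
    illegal = set(password) - ALLOWED_CHARS
    if illegal:
        raise ValueError(
            "password contains illegal characters: " + "".join(sorted(illegal))
        )
    return password
```

The caller can then wrap validate_password(input("Enter string to test: ")) in a try/except and print the message.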
Hamish Grubijan
tgray
+4  A: 

I agree with Hamish. Do not use a regex for this. Use discrete functions for each and every test and then call them in sequence. Next year when you want to require at least 2 uppercase and 2 lowercase letters in the password you will not be happy with trying to modify that regex.

Another reason for this is to allow user configuration. Suppose you sell your program to someone who wants 12-character passwords. It's easier to modify a single function to handle system parameters than it is to modify a regex.

// pseudo-code
Bool PwdCheckLength(String pwd)
{
    Int minLen = getSystemParameter("MinPwdLen");
    return pwd.len() >= minLen;  // true when the password is long enough
}
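
In Python, the same idea might look roughly like the sketch below (Python 3). Everything here is an assumption for illustration: the settings dict stands in for getSystemParameter, and the function names are invented. Each check returns an error message or None, and all failures are collected so the caller can decide whether to show one or all of them.

```python
import string

# Assumption: allowed characters follow the question's list
ALLOWED_CHARS = set(string.ascii_letters + string.digits + "@#$%^&+=")

def check_length(password, settings):
    # "MinPwdLen" is read from a plain dict here; defaults to 8
    min_len = settings.get("MinPwdLen", 8)
    if len(password) < min_len:
        return "must be at least %d characters" % min_len
    return None

def check_allowed_chars(password, settings):
    if any(ch not in ALLOWED_CHARS for ch in password):
        return "contains characters that are not allowed"
    return None

# New rules are added by appending another function to this list
CHECKS = [check_length, check_allowed_chars]

def password_errors(password, settings):
    """Run every check in sequence; an empty list means the password is valid."""
    errors = []
    for check in CHECKS:
        message = check(password, settings)
        if message is not None:
            errors.append(message)
    return errors
```

For example, password_errors("abcd1234", {"MinPwdLen": 12}) reports only the length problem, while the same password passes with the default settings.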
jmucchiello
One thing I am not sure about is: should the user be warned about just one type of error (invalid password), or all different kinds? Depends on the business logic, I suppose.
Hamish Grubijan
If he needs to change the password validity requirements later, it doesn't matter whether he used a regular expression or not *today*. It's just a function which returns `true` or `false` (at least it should be) and it doesn't matter whether it uses regular expressions or not.
Deniz Dogan
Depends on the audience. If the users are generally pre-validated (such as on an intranet app) you want to hold their hands and reduce support costs. In the wild, all this does is tell your potential attacker how to reduce the size of his dictionary. I think this is why you see those algorithms showing password strength on some web sites. They don't tell you what is wanted exactly but still enforce the rules in some way. The note to the user could just say "must be 60% or better". (Of course when those things are just JavaScript that doesn't stop the attacker from reading the code.)
jmucchiello
@Deniz - Maintenance of a bunch of little functions is infinitely easier than maintenance of a single complex regex. If you want to debate that, you live in a different programming world than I do.
jmucchiello
There are 10 kinds of people in this world: those who get Perl and those who do not ;)
Hamish Grubijan
@jmucchiello: I'm not sure what programming world you live in, but in the one I live in no one is afraid of regular expressions, especially not expressions as simple as this one. Should the validation logic get more complicated, he should rewrite it without depending on one single regular expression. I'm just saying that if the validation logic gets more complicated later, he will most likely still have to start over from scratch, independent of whether or not he used a regular expression to begin with.
Deniz Dogan
@Deniz - Why use something that would have to be rewritten completely in an area where you just know the requirements will eventually change? Regexes are great in only two places: ad hoc throwaway stuff (like command lines) and deep, low-level libraries where the requirements are as close to fixed in stone as they can get. Anywhere in between is just asking for trouble down the road. Password validations are also wrong for regex because they almost never have order. Generally you want at least 1 upper and 1 lower but the order doesn't matter. As soon as you add a third at least 1 class....
jmucchiello
... the regex becomes overly complex. Since the field of password validations is generally wrong for regexes, you shouldn't start with a regex just because the current requirements are simple enough for regex. Password requirements will change. It is nearly a fact. Why start with a tool you know you will need to replace once the complexity goes up?
jmucchiello
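
For readers weighing the two positions, here is a rough Python 3 sketch of what "at least one of each class, in any order" looks like both ways. The digit rule is only an illustrative third class, and the names LOOKAHEAD, meets_rules_regex, and meets_rules_checks are hypothetical.

```python
import re
import string

# One lookahead per "at least one" rule; every new rule adds another group
LOOKAHEAD = re.compile(r'(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9]).{8,}\Z', re.DOTALL)

def meets_rules_regex(password):
    return LOOKAHEAD.match(password) is not None

def meets_rules_checks(password):
    """The same rules written as independent, order-free checks."""
    return (len(password) >= 8
            and any(c in string.ascii_lowercase for c in password)
            and any(c in string.ascii_uppercase for c in password)
            and any(c in string.digits for c in password))
```

Both functions accept and reject the same strings; which one is easier to change later is the point being argued in this thread.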
@jmucchiello: *IF* the requirements are this simple, then *yes*, he should go for a regular expression. *IF* the requirements are more complex, then he should go for something else. A programmer's job is *not* to predict future modifications to the original specification.
Deniz Dogan