I need a regular expression that allows only a to z and 0 to 9. I came across the function below on this site, but it lets a few symbols through (#.-). How should it be changed so that it allows only a to z (both upper and lower case) and 0 to 9? I'm scared to edit it since I know nothing about regular expressions.

Also, is this regular expression a good way to check for a to z and 0 to 9, or can it be improved?

function isValid($str) {
    return !preg_match('/[^A-Za-z0-9.#\\-$]/', $str);
}

Thanks

+2  A: 

The following seems to be what you need in this case:

function isValid($str) {
    return !preg_match('/[^A-Za-z0-9]/', $str);
}

The […] regex construct is called a character class. Something like [aeiou] matches any one of the vowels.

The [^…] is a negated character class, so [^aeiou] matches one of anything but the vowels (which includes consonants, digits, symbols, etc).

The -, depending on where/how it appears in a character class definition, is a range definition, so 0-9 is the same as 0123456789.
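
As a quick illustration of the three constructs above, here is a minimal sketch using preg_match. The sample strings and variable names are made up for the demonstration.

```php
<?php
// [aeiou] matches one character that is a vowel.
$a = preg_match('/[aeiou]/', 'sky');    // 0: "sky" has no vowel
$b = preg_match('/[aeiou]/', 'tree');   // 1: "e" is a vowel

// [^aeiou] matches one character that is anything except a vowel.
$c = preg_match('/[^aeiou]/', 'aeiou'); // 0: every character is a vowel
$d = preg_match('/[^aeiou]/', 'tea');   // 1: "t" is not a vowel

// A - between two characters defines a range;
// placed last in the class it is a literal hyphen.
$e = preg_match('/[0-9]/', 'abc7');     // 1: "7" is in the range 0-9
$f = preg_match('/[a9-]/', '-');        // 1: here "-" is a literal character

var_dump($a, $b, $c, $d, $e, $f);
```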

Thus, the regex [^A-Za-z0-9] actually matches a character that's neither a letter nor a digit. This is why the result of preg_match is negated with !.

That is, the logic of the above method uses double negation:

isValid = it's not the case that
              there's something other than a letter or a digit
                  anywhere in the string

You can alternatively get rid of the double negation and use something like this:

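function isValid($str) {
    // preg_match returns 1, 0 or false, so compare with 1 to get a boolean.
    // The D modifier stops $ from also matching before a trailing newline.
    return preg_match('/^[A-Za-z0-9]*$/D', $str) === 1;
}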

Now there's no negation. The ^ and $ are the beginning-of-string and end-of-string anchors, and * is a zero-or-more repetition metacharacter. Now the logic is simply:

isValid = the entire string from beginning to end
              is a sequence of letters and digits
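
To check that the two formulations agree, here is a small sketch that runs both on a few sample inputs. The function names isValidNegated and isValidAnchored are hypothetical, chosen only so both versions can live side by side. The D modifier is an addition here: by default, PCRE's $ also matches just before a trailing newline, which would let "abc\n" pass the anchored version.

```php
<?php
// Negated version: fails as soon as any non-alphanumeric character is found.
function isValidNegated(string $str): bool {
    return !preg_match('/[^A-Za-z0-9]/', $str);
}

// Positive version: the whole string must consist of letters and digits.
// The D modifier makes $ match only at the very end of the string.
function isValidAnchored(string $str): bool {
    return preg_match('/^[A-Za-z0-9]*$/D', $str) === 1;
}

$samples = ['abc123', 'ABCxyz', '', 'a.b', 'a#b', 'a-b', "abc\n"];
foreach ($samples as $s) {
    // Both versions should give the same answer for every sample.
    var_dump(isValidNegated($s) === isValidAnchored($s));
}
```

Note that both versions accept the empty string, because * allows zero repetitions.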

Non-regex alternative

Some languages have standard functions/idiomatic ways to validate that a string consists of only alphanumeric characters (among other possible string "types").

In PHP, for example, you can use ctype_alnum.

bool ctype_alnum ( string $text )

Checks if all of the characters in the provided string, text, are alphanumeric.
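
A minimal sketch of how ctype_alnum behaves on a few made-up inputs, assuming the ctype extension is enabled and the default C locale is in effect (in other locales the set of letters may differ):

```php
<?php
// ctype_alnum() returns true only if every character is a letter or a digit.
$plain = ctype_alnum('abc123');  // true
$space = ctype_alnum('abc 123'); // false: contains a space
$hash  = ctype_alnum('a#b');     // false: contains #
$empty = ctype_alnum('');        // false: the empty string is rejected

var_dump($plain, $space, $hash, $empty);
```

One difference from the regex with *: ctype_alnum returns false for an empty string, whereas '/^[A-Za-z0-9]*$/' accepts it.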

API links

  • PHP Ctype Functions - list of entire family of ctype functions
    • ctype_alpha, digit, lower, upper, space, etc
polygenelubricants
Thanks for the correction and also for pointing me to ctype_alnum. I never knew PHP had a function like that. I think I'll use that from now on.
Norman
@polygenelubricants: +1 You are an expert at regex, would you please check the regex here in my answer: http://stackoverflow.com/questions/3381331/jquery-convert-br-and-br-and-p-and-such-to-new-line/3381470#3381470
Sarfraz
A: 

You can match z and 0-9 with [Zz0-9] and you can match a-z and 0-9 with [a-z0-9]. If you want both upper and lower case then you would use [A-Za-z0-9].

See regular expression character classes for more on this.

Further, the !preg_match() isn't really necessary. Instead you could use a positive match on what you want, such as return preg_match('/^[A-Za-z0-9]+$/', $str); (note that + requires at least one character, so an empty string is rejected). The pattern you have uses a negated character class, so it matches any character that is not listed within the brackets, and the ! then turns such a match into a failure. I may be misunderstanding your purpose, though.
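
The choice between + and * only matters for the empty string. A small sketch of the difference, with variable names invented for the example:

```php
<?php
// * allows zero or more characters, so the empty string passes.
$star = preg_match('/^[A-Za-z0-9]*$/', '');       // 1
// + requires at least one character, so the empty string fails.
$plus = preg_match('/^[A-Za-z0-9]+$/', '');       // 0
// Both quantifiers accept a non-empty alphanumeric string.
$both = preg_match('/^[A-Za-z0-9]+$/', 'abc123'); // 1

var_dump($star, $plus, $both);
```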

eldarerathis
Your modification looks like it may only work for single character strings.
JGB146
Whoops, that was a typo. Thanks, fixed it.
eldarerathis
This works well too.
Norman
+1  A: 

Whilst I have nothing against regular expressions, with such a simple pattern you should probably consider using

if(ctype_alnum($input)) {

http://uk3.php.net/manual/en/function.ctype-alnum.php

Cags
Just discovered this function here :-) It makes things real easy. Will be using this from now on.
Norman