On my registration page I need to validate the usernames as alphanumeric only, but also with optional underscores. I've come up with this:

function validate_alphanumeric_underscore($str) 
{
    return preg_match('/^\w+$/',$str);
}

Which seems to work okay, but I'm not a regex expert! Does anyone spot any problem?

+11  A: 

The actual matched characters of \w depend on the locale that is being used:

A "word" character is any letter or digit or the underscore character, that is, any character which can be part of a Perl "word". The definition of letters and digits is controlled by PCRE's character tables, and may vary if locale-specific matching is taking place. For example, in the "fr" (French) locale, some character codes greater than 128 are used for accented letters, and these are matched by \w.

So it is better to specify explicitly which characters you want to allow:

/^[A-Za-z0-9_]+$/

This allows just alphanumeric characters and the underscore.

And if you want to allow the underscore only as a separator between alphanumeric parts, and also require that the username start with a letter:

/^[A-Za-z][A-Za-z0-9]*(?:_[A-Za-z0-9]+)*$/
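
To see how the two patterns above differ in practice, here is a minimal sketch that wraps each one in a function. The function names are hypothetical (they are not from the question), and the comparison with `=== 1` is an added choice so that the functions return a real boolean rather than the int/false that preg_match returns.

```php
<?php
// Hypothetical helper names; the two patterns are the ones given in this answer.

function is_valid_username_loose(string $str): bool
{
    // ASCII letters, digits and underscores, in any position.
    return preg_match('/^[A-Za-z0-9_]+$/', $str) === 1;
}

function is_valid_username_strict(string $str): bool
{
    // Must start with a letter; an underscore may only join two alphanumeric runs.
    return preg_match('/^[A-Za-z][A-Za-z0-9]*(?:_[A-Za-z0-9]+)*$/', $str) === 1;
}

var_dump(is_valid_username_loose('___username'));  // bool(true)
var_dump(is_valid_username_strict('___username')); // bool(false)
var_dump(is_valid_username_strict('john_doe_42')); // bool(true)
var_dump(is_valid_username_strict('username_'));   // bool(false)
var_dump(is_valid_username_strict('john__doe'));   // bool(false)
```

The stricter pattern rejects leading, trailing and doubled underscores, which may or may not be what a given registration form wants.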
Gumbo
Why? \w is exactly alphanumeric characters and underscore.
Lucas Oman
Actually, \w is alphanumeric characters and underscore according to the active locale, so depending on the circumstances, it might match characters like 'ü' or 'ö'.
af
@af, ah, you're right. Thanks for the correction!
Lucas Oman
Thanks everyone! Gumbo's extra options are useful so I'll go with that. Cheers!
+1  A: 

Try:

function validate_alphanumeric_underscore($str) 
{
    return preg_match('/^[a-zA-Z0-9_]+$/',$str);
}
Phill Pafford
+1  A: 

Looks fine to me. Note that you make no requirement for the placement of the underscore, so "username_" and "___username" would both pass.
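
A quick way to check this is to run the function from the question against a few underscore-heavy inputs. This is only a sketch; the sample strings (including the underscore-only one) are made up for illustration.

```php
<?php
// The function exactly as posted in the question.
function validate_alphanumeric_underscore($str)
{
    return preg_match('/^\w+$/', $str);
}

// Hypothetical sample inputs: trailing, leading, and underscore-only names.
foreach (['username_', '___username', '___'] as $candidate) {
    // Each of these matches, because \w treats "_" like any other word character.
    var_dump(validate_alphanumeric_underscore($candidate)); // int(1)
}
```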

Lucas Oman
A: 

Your own solution is perfectly fine.

preg_match uses Perl-like regular expressions, in which the character class \w is defined to match exactly what you need:

\w - Match a "word" character (alphanumeric plus "_")

(source)

Sebastian P.