I want a regular expression which ALLOWS only this:

 letters a-z
 case insensitive
 allows underscores
 allows any digits (0-9)

How should this be written?

Thanks

+7  A: 

That would be

\w

if I'm not mistaken. (As it turns out, it depends: in PHP the meaning of \w changes with the locale that's currently in effect.) You can use a more explicit form to pin it down:

[A-Za-z0-9_]

To use it in context, add start-of-string and end-of-string anchors and a quantifier that defines how many characters you will allow:

^[A-Za-z0-9_]+$
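
As a rough sketch of how that anchored pattern might be called from PHP (the helper name is_valid_name is made up for illustration and is not part of the original answer):

```php
<?php
// Hypothetical helper: returns true when $s matches ^[A-Za-z0-9_]+$.
// The explicit character class avoids the locale dependence of \w.
function is_valid_name(string $s): bool
{
    // preg_match returns 1 on a match, 0 on no match, false on error
    return preg_match('/^[A-Za-z0-9_]+$/', $s) === 1;
}

var_dump(is_valid_name('abc_123')); // only allowed characters
var_dump(is_valid_name('a b'));     // contains a space
var_dump(is_valid_name(''));        // empty: + requires at least one character
```

Swapping the + for a bounded quantifier such as {3,20} would be one way to limit the length, if that is wanted.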
Tomalak
A: 
if (preg_match('/^[0-9a-z_]+$/i', $string)) {
  // matches: only letters, digits and underscores
} else {
  // no match: the string is empty or contains some other character
}

[0-9a-z_] is a character class that matches the digits 0 through 9, the letters a through z, and the underscore. The i after the closing delimiter makes the match case-insensitive. ^ and $ are anchors that match the beginning and end of the string respectively. The + means one or more characters.
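
One caveat not mentioned above: in PCRE, a plain $ also matches just before a single trailing newline, so "abc\n" passes this pattern. A minimal sketch of a stricter check, assuming that behaviour is not wanted, adds the D (dollar end only) modifier; the variable names here are illustrative:

```php
<?php
$loose  = '/^[0-9a-z_]+$/i';   // pattern from the answer
$strict = '/^[0-9a-z_]+$/iD';  // D: $ matches only at the very end of the string

$input = "abc\n";              // valid characters followed by a newline

var_dump(preg_match($loose, $input));   // int(1)
var_dump(preg_match($strict, $input));  // int(0)
var_dump(preg_match($strict, 'abc'));   // int(1)
```

Replacing $ with \z should have the same effect as the D modifier.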

Vivin Paliath
Wrong. This will return a false positive if the input happens to contain a valid character - even if it also contains invalid characters.
Matt
Good point - I'm editing my response.
Vivin Paliath
This wouldn't work. "a b c d" would pass, as you're only looking for one of the characters in the string to match. Modify the pattern to `/^[0-9a-z_]$/i` and it'll work.
ceejayoz
@ceejayoz That would only match one character, so I added a `+` for one or more.
Vivin Paliath
+5  A: 

PHP:

if (preg_match('/[^a-z0-9_]/i', $input)) {
  // invalid input
} else {
  // valid input
}

So [a-z0-9_] is a character set for your valid characters. Adding a ^ to the front ([^a-z0-9_]) negates it. The logic is: if any character in the input is NOT in the valid character set, the input is considered invalid.

The /i at the end makes the match case insensitive.
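
A small sketch of the same negated test wrapped in a function (the name has_only_valid_chars and the $allowEmpty parameter are hypothetical). Note that an empty string contains no invalid character, so the negated pattern accepts it; the optional empty check is an assumption about what the asker might want:

```php
<?php
// Hypothetical helper built on the negated character class.
function has_only_valid_chars(string $input, bool $allowEmpty = true): bool
{
    if ($input === '' && !$allowEmpty) {
        return false; // reject empty input explicitly when required
    }
    // preg_match returns 1 if any character outside the set is found
    return preg_match('/[^a-z0-9_]/i', $input) === 0;
}

var_dump(has_only_valid_chars('User_42'));   // no invalid characters
var_dump(has_only_valid_chars('a b c d'));   // spaces are invalid
var_dump(has_only_valid_chars('', false));   // empty input rejected on request
```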

Matt
+1 - Sometimes the easier regexp to test/read is the negation. Rather than allow only certain characters, just test for the existence of an invalid character.
gnarf
But when the pattern is the same (bar negation), it's only a matter of inverting the if/else blocks, or putting a `!` before `preg_match`. Negating makes it easier when you change the problem, not when you write the same exact pattern in a slightly different way.
kemp
+1  A: 

How should it be written? (breaking it into multiple lines)

/           # Start RegExp Pattern
 ^          # Match beginning of string only
 [a-z0-9_]* # Match characters in the set [ a-z, 0-9 and _ ] * = Zero or more times
 $          # Match end of string
/i          # End Pattern - Case Insensitive Matching

Giving you

if (preg_match('/^[a-z0-9_]*$/i', $input)) {
  // input is valid
}

You could also use a + instead of * if you want to force at least one character as well.
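
The commented multi-line layout can be used almost verbatim if the x (extended) modifier is added, which makes PCRE ignore whitespace in the pattern and treat # up to the end of the line as a comment. A sketch, with the comments reworded so they do not contain the / delimiter (the variable names are illustrative):

```php
<?php
// Same pattern as above, written across several lines using the x modifier.
$pattern = '/
    ^            # match the beginning of the string
    [a-z0-9_]*   # zero or more letters, digits or underscores
    $            # match the end of the string
/ix';            // i = case-insensitive, x = extended (free-spacing) mode

var_dump(preg_match($pattern, 'Hello_42'));   // int(1)
var_dump(preg_match($pattern, 'no spaces'));  // int(0)
var_dump(preg_match($pattern, ''));           // int(1), because * allows zero characters
```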

gnarf
Isn't this case-sensitive? Your code, that is (not the explanation!).
Vivin Paliath
@Vivin Paliath - Yup! Thanks for the gentle nudge.
gnarf
It's ok, no problem! :)
Vivin Paliath