Hi,

I need to clean a string so that it contains alphanumeric characters only. I've come up with the following code, but for some reason it fails:

    $string = 'aaa`bbb!!';
    $string = preg_replace("#[^a-zA-z0-9]*#", "", $string);
    echo $string;die;   

The output I receive is aaa`bbb, while I expect aaabbb. Could you please help me with this?

+10  A: 

It should be a capital Z:

    preg_replace("#[^a-zA-Z0-9]*#", "", $string);

When you write A-z it means all the characters between A (ASCII value 65) and z (ASCII value 122). This includes the backtick (ASCII value 96) plus a few others that you didn't want: square brackets, backslash, caret and underscore (ASCII values 91 to 95).
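To see exactly which characters slip through, here is a small sketch that lists everything between Z and a in the ASCII table and checks it against both ranges. The variable name `$extras` is just for illustration.

```php
<?php
// Collect the characters whose ASCII codes fall strictly between 'Z' (90) and 'a' (97).
$extras = '';
for ($code = ord('Z') + 1; $code < ord('a'); $code++) {
    $extras .= chr($code);
}
echo $extras, "\n"; // [\]^_`

// All of them are matched by [A-z], none of them by [A-Za-z].
$matchedByWideRange = preg_match('/^[A-z]+$/', $extras);     // 1
$matchedByTwoRanges = preg_match('/[A-Za-z]/', $extras);     // 0
var_dump($matchedByWideRange, $matchedByTwoRanges);
```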

You can also use a + instead of a * to save repeatedly replacing the empty string with the empty string.
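A quick way to check the difference, as a sketch only: `preg_replace` accepts an optional fifth argument that receives the number of replacements made. The variable names below are made up for the example.

```php
<?php
$string = 'aaa`bbb!!';

// With the corrected character class, * and + give the same cleaned string...
$withStar = preg_replace('#[^a-zA-Z0-9]*#', '', $string, -1, $starCount);
$withPlus = preg_replace('#[^a-zA-Z0-9]+#', '', $string, -1, $plusCount);

var_dump($withStar === $withPlus); // bool(true)
echo $withPlus, "\n";              // aaabbb

// ...but * also matches the empty string between valid characters,
// so it performs more (pointless) replacements than +.
echo $plusCount, "\n";             // 2
var_dump($starCount > $plusCount); // bool(true)
```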

Mark Byers
+1 for Eagle Eyes on the `z`
Jason McCreary
Oops, I hate such typos! Thanks for pointing it out!
Eugene
Or use the i flag for case-insensitive matching: preg_replace("#[^a-z0-9]*#i", "", $string);
Mark Baker
+3  A: 

I think the * is unnecessary and you could simplify with \W. Just try the following:

    $string = preg_replace("/[\W_]/", "", $string);
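The underscore needs to be listed explicitly because \W treats it as a word character. A minimal sketch with a made-up input string that shows the difference:

```php
<?php
$string = 'aaa`bbb_ccc!!';

// \W alone does not match the underscore, so it survives.
$keepsUnderscore = preg_replace('/\W/', '', $string);
echo $keepsUnderscore, "\n";  // aaabbb_ccc

// Adding _ to the character class strips it as well.
$stripsUnderscore = preg_replace('/[\W_]/', '', $string);
echo $stripsUnderscore, "\n"; // aaabbbccc
```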

Also, if you merely want to validate, check out ctype_alnum. It avoids the overhead of the regex library.
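A short sketch of how that check behaves, assuming the ctype extension is available (it is enabled by default in standard PHP builds). Note that it only tests the string and does not clean it, and that an empty string fails the check.

```php
<?php
// ctype_alnum() is true only when every character is a letter or a digit.
$clean = ctype_alnum('aaabbb123'); // true
$dirty = ctype_alnum('aaa`bbb!!'); // false: backtick and ! are not alphanumeric
$empty = ctype_alnum('');          // false: an empty string does not pass

var_dump($clean, $dirty, $empty);
```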

Jason McCreary
This currently allows underscore - can remove that also with either `\W|_` or `[\W_]` as the expression.
Peter Boughton
@Peter. Good point about the underscore. I have updated my expression.
Jason McCreary