I'm trying to create a regex for checking the values of submitted tags in a free-form folksonomy system. Here is what I have now.

if (!preg_match('/([^-\\a-zA-Z0-9._@\'])+/',$proposedtag)) {
    //true, good
    return true;
} else {
    //false, bad characters
    return false;
}

I want to allow: hyphen, backslash, forward slash, a-z, A-Z, 0-9, period, underscore, at sign, and single quote mark, and disallow all others.

I'm pretty sure a negated character class is the way to go on this...

However, my code above seems to allow other characters (such as +), and I'm not sure why. Also, as a side note, I'm not sure whether this protects me from inadvertently allowing SQL injection. Any tips?

+2  A: 

I believe it's an escaping issue with the backslash characters inside your character class. Try this instead; it seems to work better on the tests I fed it. Note the double-escaping of the backslashes (which I moved to the end):

if (!preg_match('/([^\-a-zA-Z0-9._@\'\\\\])+/',$proposedtag)) {
zombat
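A likely reason the original pattern lets + through: inside a single-quoted PHP string, \\ collapses to a single backslash, so PCRE receives \a-z. \a is the BEL character (0x07), which turns \a-z into a range from 0x07 to z, and that range covers + and most ASCII punctuation. The sketch below runs both patterns side by side; the sample tags are made up for illustration.

```php
<?php
// Pattern from the question. After single-quote unescaping, PCRE sees
// [^-\a-zA-Z0-9._@'], where \a-z is a range from BEL (0x07) to 'z'.
$original = '/([^-\\a-zA-Z0-9._@\'])+/';

// Pattern from the answer: the hyphen is escaped, and the literal backslash
// is written as four backslashes in the PHP source (two reach PCRE).
$fixed = '/([^\-a-zA-Z0-9._@\'\\\\])+/';

// Hypothetical sample tags, not taken from the thread.
$samples = ['good-tag', 'c++', 'back\\slash'];

foreach ($samples as $tag) {
    // preg_match() returns 1 when it finds a disallowed character.
    $originalRejects = preg_match($original, $tag) === 1;
    $fixedRejects    = preg_match($fixed, $tag) === 1;

    printf(
        "%-12s original rejects: %-5s  fixed rejects: %s\n",
        $tag,
        var_export($originalRejects, true),
        var_export($fixedRejects, true)
    );
}
```

With these samples, only the fixed pattern rejects c++; both patterns accept good-tag and back\slash.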
This is working for me so far. However, I modified it to '/([^\-\/a-zA-Z0-9\._@\'\\\\])+/' to include the forward slash.
jjclarkson
I'll run a few more tests before I hit the check mark ;)
jjclarkson
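For reference, here is a minimal sketch of the modified pattern wrapped in a reusable check. The function name isValidTag and the PHP 7+ type declarations are assumptions, not something from the thread. The capturing group and the + quantifier are dropped because finding a single disallowed character is enough to decide the result.

```php
<?php
// Hypothetical helper; the name isValidTag is an assumption for illustration.
function isValidTag(string $proposedtag): bool
{
    // Negated class listing every allowed character: hyphen, forward slash,
    // a-z, A-Z, 0-9, period, underscore, at sign, single quote, backslash.
    $disallowed = '/[^\-\/a-zA-Z0-9._@\'\\\\]/';

    // preg_match() returns 0 when no disallowed character is present.
    return preg_match($disallowed, $proposedtag) === 0;
}

// Made-up sample tags.
var_dump(isValidTag('php/regex'));  // bool(true)
var_dump(isValidTag('c++'));        // bool(false)
var_dump(isValidTag('two words'));  // bool(false)
```

Note that this check only restricts which characters a tag may contain. The single quote is on the allowed list, so the check by itself does not address the SQL injection concern raised in the question.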