views:

54

answers:

5

I'm adding an if statement to my database abstraction layer to pick out any attempted queries to a handful of database tables.

Basically, if my application attempts to create, read or destroy data from a database table called either members or members_profiles I want to invoke my if statement.

if (
    preg_match('/INSERT INTO [members|members_profiles]/', $sql) ||
    preg_match('/UPDATE [members|members_profiles]/', $sql) ||
    preg_match('/DELETE FROM [members|members_profiles]/', $sql))
{
    // do if statement stuff here...
}

I'm no regular expression/preg-match master, but will the above if statement return true if a SQL query matches:

  • INSERT INTO members ... or INSERT INTO members_profiles ...
  • UPDATE members ... or UPDATE members_profiles ...
  • DELETE FROM members ... or DELETE FROM members_profiles ...

Or is my preg-match syntax way off?

+6  A: 

The syntax is valid but won't do what you want.

For alternation of patterns, use

preg_match('/INSERT INTO (?:members|members_profiles)/', $sql) ||
//                       ^^^                        ^

In PCRE, […] defines a character class. This will match 1 character, given it appears inside the brackets. In your case [members|members_profiles] will match 1 character if it is one of b, e, f, i, l, m, o, p, r, s, | or _.

Grouping is done using another kind of parenthesis, (?:…).


(BTW, please don't use Regex to detect database changing attempts. Restrict the permissions of the database user instead.)

KennyTM
This probably wants a `\b` at the end of the group, and would benefit from the spaces being `\s*` - although your point about not using regex for permissions is of course valid.
Peter Boughton
I'm not using the above to restrict database access, but more so to catch any references to soon-to-be deprecated tables as we're currently rolling out a new environment. This will be used on a testing environment for a week or two to check all queries have been updated in our application's models.
Martin Bean
A: 

This is correct

if (
    preg_match('/INSERT\sINTO\s(members|members_profiles)/', $sql) ||
    preg_match('/UPDATE\s(members|members_profiles)/', $sql) ||
    preg_match('/DELETE\sFROM\s(members|members_profiles)/', $sql))
{
    // do if statement stuff here...
}

LE: Changed white space escaping with white space(s) special character

Bogdan Constantinescu
No need to escape the white space.
Bart Kiers
Would you be able to explain what the backslash character does in `preg-match` calls?
Martin Bean
Martin, the backslash is used to escape characters - either ones that have special meaning in regex, to get a literal one, or ones that are literal to give special meaning. (e.g. use `\\(` or `\|` to get literal `(` or `|` characters, whilst `\w` gives special meaning of "word characters", etc) In this situation, the backslash isn't needed.
Peter Boughton
Better be safe than sorry. I always escape special characters, including white space.
Bogdan Constantinescu
Better to be correct when teaching others! The space character is not a special character.
Peter Boughton
Fair enough. I knew the backslash was to escape characters (such as slashes themselves, which can begin to look messy!) but couldn't see why it were being applied above.
Martin Bean
+2  A: 

The regex

[ab|cd]

does not match either ab OR cd but matches one of the following single (!) characters: a, b, |, c or d.

What you need is a group instead of a character class:

(ab|cd)

which does match either ab OR cd.

More info on character classes: http://www.regular-expressions.info/charclass.html

Bart Kiers
+2  A: 

Just one regex will be enough:

if (preg_match('/(?i)(INSERT\s+INTO|UPDATE|DELETE\s+FROM)(?-i)\s+(members|members_profiles)/', $sql))
{
    // do if statement stuff here...
}

note the () instead of [] and \s+ instead of spaces. because SQL is valid with any number of nay whitespace, and \s+ matches them all. The + behind \s means that it must be at least one whitespace, but it can be more. The (?i) means that all following characters will be checked case insensitive, and the (?-i) turn case sensitivity on again. Since SQL commands are case insensitive.

(abc|def) matches abc or def
[abc|def] matches a, b, c, |, d, e or f

To try out a regular expression you can use http://rubular.com/, it says it's written in ruby for testing regex in ruby, but valid regex should be independent of language, so it can be used to test common regular expressions, that work in other languages, too.

jigfox
A: 

Apart the errors already reported by other users, your regular expression is too restrictive, if you need to detect when a query is trying to alter the tables members, and members_profiles.

I would not use such approach, but if you need to really do that then consider that

  • The database engine allows comments. If the database engine is MySQL, I can write INSERT INTO /* little trick */ members …, or INSERT /* into */ INTO /* table */ members ….
  • The database engine allows any number of spaces, or new line characters.
kiamlaluno
It's only for internal queries; queries coming from the application's models, and not user-generated/user-submitted queries.
Martin Bean