ansaurus

Question

Best Approach to Parse for SQL in PHP Files?

Answer 1

+1 A:

Your solution seems fine to me. The alternative would be to parse the PHP file with a Lex/Yacc-style parser using the PHP grammar. One good grammar-parsing tool for C# is Coco/R: http://www.ssw.uni-linz.ac.at/coco/

However, I believe that if you do parse the language, you will end up spending too much time (in development and in computation) for no additional results.

I would stick with your opportunistic approach, but test it against a variety of PHP code and tweak it to cover the cases it misses.

Vincent 2008-08-21 16:49:51

Answer 2

+1 A:

Maybe there's some mileage in parsing text lines against the BNF for, say, SQL92, and scoring each line on how closely its fragments match the grammar.

Sounds like some heavy lifting though. Your simple approach will already catch a large percentage of real-world cases.
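As a rough idea of what line scoring could look like, here is a much cruder stand-in for grammar matching: it scores a line by keyword density rather than against a real BNF. The function name sqlScore and the keyword list are my own assumptions for illustration, not part of SQL92 or any library.

```php
<?php
// Hypothetical helper: scores a line by the fraction of its words that are
// SQL keywords. The keyword list is an illustrative subset, not SQL92.
function sqlScore(string $line): float
{
    $keywords = ['SELECT', 'INSERT', 'UPDATE', 'DELETE', 'FROM', 'WHERE',
                 'INTO', 'VALUES', 'SET', 'JOIN', 'ORDER', 'GROUP', 'BY'];

    // Split into word-like pieces; punctuation, digits and quotes are dropped.
    preg_match_all('/[A-Za-z_]+/', $line, $m);
    $words = $m[0];
    if (count($words) === 0) {
        return 0.0;
    }

    $hits = 0;
    foreach ($words as $word) {
        if (in_array(strtoupper($word), $keywords, true)) {
            $hits++;
        }
    }
    return $hits / count($words);
}

// Made-up sample lines to compare scores.
$samples = [
    '$sql = "SELECT * FROM table WHERE id = 1";',
    'echo "Hello world";',
];
foreach ($samples as $line) {
    printf("%.2f  %s\n", sqlScore($line), $line);
}
```

A line would then be flagged when its score passes some threshold, which would need tuning against real code.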

Cheekysoft 2008-08-21 16:53:00

Answer 3

+1 A:

I do not know the specifics of variables in C#, so you will have to forgive (or down-vote) me for using PHP, but 70% of the time my SQL query goes into a variable like so:

$sql = "SELECT * FROM table;";

Beyond that I am unable to think of anything you can do to improve on what you already have.

Do you take into account statements that are created over several lines and use variables within the string? (Example below)

$sql = "SELECT * FROM table WHERE fname = $fname OR sname = $sname";
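To illustrate the several-lines case, a statement might be built up like this. This is a made-up sketch: the values and the use of the .= operator are assumptions, not taken from any particular app.

```php
<?php
// Hypothetical values, for illustration only.
$fname = 'John';
$sname = 'Smith';

// The statement is assembled across several lines with the .= operator,
// and the variables are interpolated into the double-quoted pieces.
$sql  = "SELECT * FROM table ";
$sql .= "WHERE fname = '$fname' ";
$sql .= "OR sname = '$sname'";

echo $sql, "\n";
```

Only the first piece starts with SELECT, so a matcher that works one line at a time would see just a fragment of the query.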

Teifion 2008-08-21 16:54:27

Answer 4

A:

I do not know the specifics of variables in C# so you will have to forgive or down-vote me for using PHP but 70% of the time my SQL query goes into a variable like so ..

Yeah, my original approach was to just look for the $sql vars since that is what most people use, but after testing against a few PHP apps I quickly threw that solution out because some developers use some funky variable names ...

Do you take into account statements that are created over several lines and use variables within the string? (Example below)

Yep. I also attempted to handle statements that were generated conditionally, but that didn't always work so well. ;)

Magic Hat 2008-08-21 17:06:48

Answer 5

A:

A simple regex to detect CRUD SQL statements passed to functions (assuming $script contains the whole PHP script):

preg_match_all('/\(\s*?"(?:SELECT|INSERT|UPDATE|DELETE) .*?"\s*?\)\s*?;/is', 
               $script, $matches);

It should match SELECT, INSERT, UPDATE and DELETE statements, provided they are placed within parentheses and double quotes. It is case-insensitive and should also match statements that span multiple lines.

edit #1: Regex for matching CRUD statements in string assignments:

preg_match_all('/\$\w+\s*?=\s*?"(?:SELECT|INSERT|UPDATE|DELETE) .*?"\s*?;/is', 
               $script, $matches);

edit #2:

// Variant of the first (function-call) regex that only matches
// strings containing at least one $variable
preg_match_all('/\(\s*?"(?:SELECT|INSERT|UPDATE|DELETE) .*?(?:\$\w+).*?"\s*?\)\s*?;/is', 
               $script, $matches);
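Here is a small usage sketch for the first two regexes, run against a made-up sample script (the sample contents and the variable names $calls and $assignments are assumptions for illustration):

```php
<?php
// Made-up sample input; in practice $script would hold a real PHP file.
$script = <<<'SAMPLE'
<?php
$result = mysql_query("SELECT * FROM users");
$sql = "UPDATE users
        SET active = 1";
$name = "Bob";
$q = 'DELETE FROM users';
SAMPLE;

// Function-call form: ("SELECT ...");
preg_match_all('/\(\s*?"(?:SELECT|INSERT|UPDATE|DELETE) .*?"\s*?\)\s*?;/is',
               $script, $calls);

// Assignment form: $var = "SELECT ...";
preg_match_all('/\$\w+\s*?=\s*?"(?:SELECT|INSERT|UPDATE|DELETE) .*?"\s*?;/is',
               $script, $assignments);

print_r($calls[0]);        // full text of each function-call match
print_r($assignments[0]);  // full text of each assignment match
```

With this sample, the mysql_query call and the two-line UPDATE assignment are picked up, while the single-quoted DELETE is not, since both patterns require double quotes.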

Imran 2008-08-21 17:28:59

Answer 6

+1 A:

I would say it would be best to look for function calls instead of looking for SQL itself. Possibly modify the PHP parser to look for function calls that result in running an SQL query which is not a prepared query.
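Short of modifying the parser, PHP's built-in tokenizer (token_get_all) can get part of the way there. Below is a minimal sketch that reports the lines where known query-running functions are called. The helper name findQueryCalls and the default function list are assumptions, and it does not try to tell prepared queries from unprepared ones.

```php
<?php
// Hypothetical helper: returns the line numbers of calls to known
// query-running functions. The default list is an assumption; extend it
// for whatever database API the scanned code uses.
function findQueryCalls(string $source, array $functions = ['mysql_query', 'mysqli_query', 'pg_query']): array
{
    $tokens = token_get_all($source);
    $lines = [];

    foreach ($tokens as $i => $token) {
        // Identifiers arrive as [id, text, line]; skip everything else.
        if (!is_array($token) || $token[0] !== T_STRING) {
            continue;
        }
        if (!in_array(strtolower($token[1]), $functions, true)) {
            continue;
        }
        // Look past whitespace for the opening parenthesis of a call.
        $j = $i + 1;
        while (isset($tokens[$j]) && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
            $j++;
        }
        if (isset($tokens[$j]) && $tokens[$j] === '(') {
            $lines[] = $token[2];
        }
    }
    return $lines;
}

// Made-up sample: line 4 only mentions the function inside a string.
$sample = <<<'SAMPLE'
<?php
$a = 1;
$r = mysql_query($sql);
echo "mysql_query(";
$r2 = pg_query ($conn, "SELECT 1");
SAMPLE;

print_r(findQueryCalls($sample));
```

Because the tokenizer distinguishes identifiers from string contents, the mention of mysql_query inside the echo string is ignored, which a plain text search would not do.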

Kibbee 2008-08-21 19:28:46
