Is there a catch-all function somewhere that works well for sanitizing user input against SQL injection and XSS attacks, while still allowing certain types of HTML tags?

A: 

Regular Expressions: http://us2.php.net/manual/en/regex.examples.php

George Strother
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Nathan Strong
+10  A: 

No. You can't generically filter data without any context of what it's for. Sometimes you'd want to take a SQL query as input and sometimes you'd want to take HTML as input.

You need to filter input on a whitelist -- ensure that the data matches some specification of what you expect. Then you need to escape it before you use it, depending on the context in which you are using it.

The process of escaping data for SQL - to prevent SQL injection - is very different from the process of escaping data for (X)HTML, to prevent XSS.
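As a rough sketch of the two separate steps described above (validate against a whitelist first, escape later for the specific context), something like the following could work. The function names validate_username and html_context and the username rule itself are made up for illustration.

```php
<?php
// Hypothetical helper: accept a username only if it matches a strict whitelist pattern.
function validate_username(string $input): ?string
{
    // 3 to 20 characters from letters, digits and underscore; \A and \z anchor the whole string.
    return preg_match('/\A[A-Za-z0-9_]{3,20}\z/', $input) === 1 ? $input : null;
}

// Hypothetical helper: escaping happens at the point of use and depends on the context.
// This one is for HTML output only; SQL needs a different mechanism.
function html_context(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

var_dump(validate_username('alice_01'));               // string(8) "alice_01"
var_dump(validate_username("bob'; DROP TABLE users")); // NULL
echo html_context('<b>bold</b>'), "\n";                // &lt;b&gt;bold&lt;/b&gt;
```

Validation answers "is this the kind of data I expect?", while escaping answers "how do I embed it safely here?", which is why the two are kept apart.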

Daniel Papasian
A: 

There is the filter extension (howto-link, manual), which works pretty well with all GPC variables. It's not a magic do-it-all thing, though; you will still have to use it.
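A minimal sketch of what using the filter extension might look like. The helper name read_age and the 0 to 150 range are assumptions for illustration; filter_var() and FILTER_VALIDATE_INT are part of the extension.

```php
<?php
// Hypothetical helper: returns an int between 0 and 150, or null if the input is not one.
function read_age(string $raw): ?int
{
    $age = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 0, 'max_range' => 150],
    ]);
    // Validation filters return false on failure, so compare strictly.
    return $age === false ? null : $age;
}

var_dump(read_age('42'));    // int(42)
var_dump(read_age('42abc')); // NULL
var_dump(read_age('200'));   // NULL
```

For GET, POST and cookie values, filter_input(INPUT_GET, 'age', FILTER_VALIDATE_INT) applies the same filter directly to the request variable.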

Till
+5  A: 

To address the XSS issue, take a look at HTML Purifier. It is fairly configurable and has a decent track record.

As for the SQL injection attacks, make sure you check the user input, and then run it through mysql_real_escape_string(). The function won't defeat all injection attacks, though, so it is important that you check the data before dumping it into your query string.

A better solution is to use prepared statements. The PDO library and mysqli extension support these.
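For illustration, a prepared statement with PDO could look roughly like this. The in-memory SQLite database (which needs the pdo_sqlite extension), the users table and the find_user helper are assumptions made so the sketch runs on its own; with MySQL only the DSN would differ.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the real one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice'), ('bob')");

// Hypothetical helper: look a user up by name using a bound parameter.
function find_user(PDO $pdo, string $name): array
{
    // The SQL text is fixed; the value travels separately and is never parsed as SQL.
    $stmt = $pdo->prepare('SELECT id, name FROM users WHERE name = :name');
    $stmt->execute([':name' => $name]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$hostile = "alice' OR '1'='1";
echo count(find_user($pdo, 'alice')), "\n";  // 1
echo count(find_user($pdo, $hostile)), "\n"; // 0
```

The hostile string is simply treated as a name that does not exist, so no escaping function is needed at all.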

jasonbar
There is no "best way" to do something like sanitizing input. Use a library; HTML Purifier is good. These libraries have been pounded on many times, so they are much more bulletproof than anything you can come up with yourself.
paan
See also http://www.bioinformatics.org/phplabware/internal_utilities/htmLawed/ . From my understanding WordPress uses an older version, http://core.trac.wordpress.org/browser/tags/2.9.2/wp-includes/kses.php
mrclay
+4  A: 

No, there is not.

First of all, SQL injection is an input filtering problem, and XSS is an output escaping one - so you wouldn't even execute these two operations at the same time in the code lifecycle.

Basic rules of thumb

  • For SQL query, bind parameters (as with PDO) or use a driver-native escaping function for query variables (such as mysql_real_escape_string())
  • Use strip_tags() to filter out unwanted HTML
  • Escape all other output with htmlspecialchars() and be mindful of the 2nd and 3rd parameters here.
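A short sketch of the last two rules, with made-up sample strings. It also shows a limitation worth knowing: strip_tags() removes disallowed tags but keeps the text inside them, and it leaves attributes on allowed tags untouched.

```php
<?php
$comment = '<b>Hi</b> <script>alert(1)</script> <a href="#" onclick="evil()">link</a>';

// Second argument: tags to keep. Attributes on kept tags are NOT filtered.
$stripped = strip_tags($comment, '<b><a>');
echo $stripped, "\n";
// <b>Hi</b> alert(1) <a href="#" onclick="evil()">link</a>

// 2nd parameter: ENT_QUOTES also escapes single quotes.
// 3rd parameter: the character set the string is encoded in.
$escaped = htmlspecialchars("O'Reilly <b>", ENT_QUOTES, 'UTF-8');
echo $escaped, "\n";
// O&#039;Reilly &lt;b&gt;
```

Because the onclick attribute survives, strip_tags() on its own is not enough when users are allowed to keep some tags.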
Peter Bailey
+73  A: 

It's a common misconception that user input can be filtered. PHP even has a (now deprecated) "feature", called magic-quotes, that builds on this idea. It's nonsense. Forget about filtering (Or cleaning, or whatever people call it).

What you should do to avoid problems is quite simple: whenever you embed a string within foreign code, you must escape it according to the rules of that language. For example, if you embed a string in some SQL targeting MySQL, you must escape the string with MySQL's function for this purpose (mysql_real_escape_string).

Another example is HTML: if you embed strings within HTML markup, you must escape them with htmlspecialchars. This means that every single echo or print statement should use htmlspecialchars.

A third example could be shell commands; If you are going to embed strings (Such as arguments) to external commands, and call them with exec, then you must use escapeshellcmd and escapeshellarg.
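A minimal sketch of the shell case, assuming a Unix-like system (quoting rules differ on Windows). The file name is a made-up hostile value, and the command is only built and printed here, not executed.

```php
<?php
$filename = 'notes.txt; echo pwned';

// escapeshellarg() wraps the value in single quotes so the shell sees exactly one argument.
$command = 'ls -l ' . escapeshellarg($filename);
echo $command, "\n";
// On a Unix-like system: ls -l 'notes.txt; echo pwned'
```

Without the call, the semicolon would end the ls command and start a second one.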

And so on and so forth ...

The only case where you need to actively filter data is if you're accepting preformatted input, e.g. if you let your users post HTML markup that you plan to display on the site. However, you would be wise to avoid this at all costs, since no matter how well you filter it, it will always be a potential security hole.

troelskn
Thanks, that was a really clear explanation. A lot better than just "no!" :)
UltimateBrent
"This means that every single echo or print statement should use htmlspecialchars" - of course, you mean "every ... statement outputting user input"; htmlspecialchars()-ifying "echo 'Hello, world!';" would be crazy ;)
Bobby Jack
Good explanation.
Philip Morton
Excellent concise answer! I cringe when I hear about sanitizing input with no regard to the context.
Cory House
There's one case where I think filtering is the right solution: UTF-8. You don't want invalid UTF-8 sequences all over your application (you might get different error recovery depending on code path), and UTF-8 can be filtered (or rejected) easily.
porneL
@porneL: Yes, and it can also be worthwhile to filter out control characters other than newline at this point. However given that most PHP apps can't even get the HTML-escaping right yet I'm not going to push the overlong UTF-8 sequence issue (they're only really an issue in IE6 pre-Service-Pack-2 and old Operas).
bobince
Love the detail and clear explanation
ggfan
Although your answer is helpful, HTML can be, and is, successfully filtered for XSS in numerous applications, e.g. comment systems in blog software such as WordPress.
mrclay
I realize this is an old question, but as of PHP 5.2.0 PHP has introduced Filters (http://www.php.net/manual/en/book.filter.php) and the function filter_var(), which when passed a value and an appropriate filter will either sanitize or validate the supplied user input.
David O.
+4  A: 
SchizoDuckie
+6  A: 

Do not try to prevent SQL injection by sanitizing input data.

Instead, do not allow data to be used in creating your SQL code. Use parameterized SQL that uses bound variables. It is the only way to be guaranteed against SQL injection.

Please see my website http://bobby-tables.com/ for more about preventing SQL injection.

Andy Lester
A: 

One trick that can help in the specific circumstance where you have a page like /mypage?id=53 and you use the id in a WHERE clause is to ensure that id definitely is an integer, like so:

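if (isset($_GET['id'])) {
  // Force the value to an integer: "53" becomes 53, "1 OR 1=1" becomes 1, "abc" becomes 0.
  $id = (int) $_GET['id'];
  // $id is now a plain integer, so it can be placed in the query without quotes.
  $result = mysql_query("SELECT * FROM mytable WHERE id = $id");
  // now use the result
}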

But of course that only cuts out one specific attack, so read all the other answers. (And yes I know that the code above isn't great, but it shows the specific defence.)

Hamish Downer