Hey,

I'm taking a look at how to properly escape data that comes from the outside world before it gets used for application control, storage, logic, and so on.

Obviously, with the magic quotes directive soon to be deprecated in PHP 5.3.0+ and removed in PHP 6, this becomes more pressing for anyone looking to upgrade and use the new language features while maintaining legacy code (don't we love it...).

However, one thing I haven't seen much discussion of is theory and best practice for what to do once you have protected your data. For example, should it be stored with or without slashes? I personally think keeping escaped data in the DB is a bad move, but I want to hear discussion and preferably read some case studies.

Some links from the PHP manual just for reference:

PHP Manual - mysql_real_escape_string

PHP Manual - htmlspecialchars

etc etc.

Any tips?

A: 

It's simple. ALL incoming data should be run through mysql_real_escape_string() before it is inserted into the database. If you know something needs to be an integer, for example, cast it to an integer before inserting it, and so on. Remember, this is just to stop SQL injection. XSS and data validation are different concerns.

If you want something to be an email, you obviously need to validate that before you insert it into the database.

htmlentities() sanitizes data, meaning it modifies the data. I think you should always store raw data in the database and, when you grab that data, choose how you want to sanitize it then.

I like to use the following function as a "wrapper" for the mysql_real_escape_string() function.

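// Quote a value for use in a MySQL query.
// Note: mysql_real_escape_string() needs the old mysql extension (removed in PHP 7).
function someFunction( $value )
{
    // Integers and floats can be embedded in the query as-is.
    if ( is_int( $value ) || is_float( $value ) ) {
        return $value;
    }
    // Cast first, because the value might not be a string,
    // then escape it and wrap it in single quotes.
    return "'" . mysql_real_escape_string( (string) $value ) . "'";
}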

If the value is a float or an integer, then there is no point in running mysql_real_escape_string(). The reason I cast the value to a string before passing it to mysql_real_escape_string(), is because sometimes the value might not be a string.

An example of the value not being a string:

http://localhost/test.php?hello[]=test

Inside test.php, you run mysql_real_escape_string() on $_GET['hello'], expecting hello to be a string. Since the person set the value to an array, it will actually cause a notice, because hello is not a string.
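One way to guard against the array case described above is to check the type before escaping. This is only a minimal sketch: getStringParam is a hypothetical helper (it is not part of PHP or of the wrapper above), and it takes the input array as an argument so it can be tried without a real request.

```php
<?php
// Hypothetical helper: return the parameter only if it really is a string.
// Arrays (e.g. from ?hello[]=test) and missing keys yield null, so the
// caller can reject the request instead of escaping a non-string.
function getStringParam(array $input, $key)
{
    if (!isset($input[$key]) || !is_string($input[$key])) {
        return null;
    }
    return $input[$key];
}

// Simulates $_GET for http://localhost/test.php?hello[]=test
$arrayCase = array('hello' => array('test'));
var_dump(getStringParam($arrayCase, 'hello')); // NULL

// Simulates $_GET for http://localhost/test.php?hello=test
$stringCase = array('hello' => 'test');
var_dump(getStringParam($stringCase, 'hello')); // string(4) "test"
```

In a real script the call would presumably be getStringParam($_GET, 'hello'), with a null result treated as invalid input.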

William
You are not totally clear in your answer. Is your suggestion not to use htmlentities?
tharkun
His suggestion is to always use mysql_real_escape_string before DB storage
Neil Aitken
When storing it in the database? Yes, my suggestion is not to use htmlentities. If I'm outputting user data to the browser, for instance, then yes, I'll run it through htmlentities. In certain cases I might not want to; the point is that I have that option when I access the data.
William
+6  A: 

Take a look at prepared statements. I know that in MySQL this works very well and is a secure way of getting data into your database. It has a few performance benefits too.

http://dev.mysql.com/tech-resources/articles/4.1/prepared-statements.html
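As a rough illustration of the idea, here is a minimal prepared-statement sketch using PDO. It assumes the pdo_sqlite driver is available and uses an in-memory SQLite database so that it runs without a MySQL server; with MySQL, only the connection string would differ. The users table and its columns are made up for the example.

```php
<?php
// In-memory SQLite stands in for MySQL here (an assumption for the demo).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');

// Input that would break a naively concatenated query.
$name = "Robert'); DROP TABLE users; --";

// The value is bound to the placeholder and never spliced into the SQL text.
$insert = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');
$insert->execute(array(':name' => $name));

// Read it back the same way.
$select = $pdo->prepare('SELECT name FROM users WHERE name = :name');
$select->execute(array(':name' => $name));
$stored = $select->fetchColumn();

// The data comes back unaltered, with no slashes added.
var_dump($stored === $name); // bool(true)
```

Because the value travels separately from the query text, there is no manual escaping step to forget.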

I have some more resources if you are interested.

Hope this is what you are looking for, tc.

Edit:

One thing I can add is using filters in combination with prepared statements. For example, to sanitize a string value you use FILTER_SANITIZE_STRING, and for an email address you use FILTER_SANITIZE_EMAIL.

This saves some amount of code and works very well. You can always check the data using your own methods afterwards, but there are a lot of filters you can use.
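A small sketch of what the filter functions look like in use. The sample values are invented for illustration. FILTER_SANITIZE_STRING is deprecated as of PHP 8.1, so this sketch sticks to the email and integer filters.

```php
<?php
// FILTER_VALIDATE_* returns the (converted) value on success, false on failure.
$email    = filter_var('someone@example.com', FILTER_VALIDATE_EMAIL);
$notEmail = filter_var('not an email', FILTER_VALIDATE_EMAIL);

// FILTER_SANITIZE_EMAIL strips characters that are not allowed in an address.
$cleaned = filter_var('some(one)@example.com', FILTER_SANITIZE_EMAIL);

// FILTER_VALIDATE_INT accepts integer strings and converts them to int.
$age    = filter_var('42', FILTER_VALIDATE_INT);
$notAge = filter_var('42abc', FILTER_VALIDATE_INT);

var_dump($email);    // string(19) "someone@example.com"
var_dump($notEmail); // bool(false)
var_dump($cleaned);  // string(19) "someone@example.com"
var_dump($age);      // int(42)
var_dump($notAge);   // bool(false)
```

Validation filters reject bad input, while sanitizing filters change it, so the two are not interchangeable.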

Saif Bechan
Thanks, will read through!
danp
I've been looking into this - you mentioned a few more resources, if you have a chance, I'd appreciate some links! Thanks again for the tip.
danp
Check out these two links: http://net.tutsplus.com/tutorials/php/getting-clean-with-php/ and http://net.tutsplus.com/tutorials/other/top-20-mysql-best-practices/ One is for PHP and one for MySQL, just some best practices, enjoy!
Saif Bechan
+2  A: 
  • Use correct method of escaping data when running queries: mysql_real_escape_string, prepared queries, etc...

  • Store data in database unaltered

  • Use correct method of escaping data on output: htmlspecialchars, etc..
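The output side of the list above might look like the following minimal sketch. The comment text is an invented example of a value stored unaltered in the database.

```php
<?php
// Value as it would come back from the database: stored unaltered.
$storedComment = '<script>alert("xss")</script> Tom & Jerry';

// Escape at output time, for the HTML context specifically.
$safeHtml = htmlspecialchars($storedComment, ENT_QUOTES, 'UTF-8');

echo $safeHtml, "\n";
// &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt; Tom &amp; Jerry
```

Because the stored value is raw, the same data can still be escaped differently for other outputs, such as JSON or plain-text email.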

Galen
Thank you, not enough people realize that data needs to be escaped for different reasons on both sides. SQL injection has become less of a concern, it seems many people understand it, but XSS has become increasingly common and I'm glad to see people like Galen talking about it and how to prevent it.
Chuck Vose
+1  A: 

For database inserts the solution is to use bind variables.

In general, any time you find yourself escaping anything (argument to a shell command, db command piece, user-supplied html, etc.), it indicates that you're not using the right function call (e.g., using system when you could use a multi-arg form of exec), or that your framework is deficient. The standard approach to working in a deficient framework is to enhance it so that you can return to not thinking about quoting.
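For the shell-command case, PHP 7.4 and later allow proc_open() to take the command as an array, which runs the program directly without a shell. The sketch below assumes PHP 7.4+ on the command line and uses the PHP binary itself as the child program, purely so that the example is portable.

```php
<?php
// Requires PHP 7.4+: an array command is executed directly, without a
// shell, so shell metacharacters in arguments have no special meaning.
$userInput = 'hello; echo injected $(whoami)';

// The child simply prints its first argument back.
$command     = array(PHP_BINARY, '-r', 'echo $argv[1];', '--', $userInput);
$descriptors = array(1 => array('pipe', 'w')); // capture the child's stdout

$process = proc_open($command, $descriptors, $pipes);
$output  = stream_get_contents($pipes[1]);
fclose($pipes[1]);
proc_close($process);

// The argument arrives verbatim; nothing was interpreted by a shell.
var_dump($output === $userInput);
```

No escapeshellarg() call is needed here, because no shell is ever involved.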

Thinking about levels of escaping and levels of quoting can be fun, but if you really enjoy that go play with Tcl in your spare time. For real work, you shouldn't be thinking about quoting unless you're designing a library for other people to use, in which case you should quote properly and let your users avoid thinking about quoting. (And you should document very carefully exactly what kind of quoting you do and don't do)

Daniel Martin
+2  A: 

For database work, check parameterized queries and prepared statements. PDO and mysqli are good for that.

htmlspecialchars is the right tool for displaying text in HTML documents.

And, as you mentioned PHP 5.3, you have access to the filter functions, which are a must-use when handling user data.

Arkh