ansaurus

Question

What else should I be doing to sanitize user input?

Answer 1

+5 A:

It looks like there's too much going on in that function. mysql_real_escape_string() already escapes everything you need to escape, so there's no need to run addslashes() on that. In fact, it could do more harm than good by escaping the backslashes mysql_real_escape_string() creates.

Kaivosukeltaja 2009-10-29 10:04:19

Just for hint, there exist a function like mysql_real_escape_string() but for postgresql? pg_escape_string or pg_escape_bytea dont seem to works against injection

DaNieL 2009-10-29 13:30:40

How doesn't `pg_escape_string` work? It should work just fine. `pg_query_params` is a better way to run queries though.

Lukáš Lalinský 2009-10-29 14:45:27

Answer 2

+1 A:

Depends on where the "secured" string will be used. If it's going to be used in the database, you only need mysql_real_escape_string(), nothing more. If it's going to be displayed in html, you only need htmlentities(), nothing more. In short: your code is doing way too much, which could even be harmful.

If you want to store it in the database for displaying it in html lateron (like a comment, for example), you should be using mysql_real_escape_string() when storing the string and htmlentities() when displaying it.

soulmerge 2009-10-29 10:04:50

Answer 3

+2 A:

mysql_real_escape_string is the last step, you shouldn't use it your application logic. It's sole purpose is to pass strings to the database, so use it only when constructing queries. You can pass anything to mysql_real_escape_string and it will make sure you can safely store it in the database.

For the rest, it depends what do you want. If you want to strip tags, use strip_tags, etc.

Lukáš Lalinský 2009-10-29 10:05:05

The string gets outputted to an xml file which in turn is read by a flash application. Thanks

Drew 2009-10-29 10:10:07

If you are adding it to an XML file, you should escape it for XML *when* adding it to an XML file. Here you are dealing with adding the string to a database, so you escape it for the database. You need to use different validation/escaping functions for different purposes.

Lukáš Lalinský 2009-10-29 10:13:41

thanks man! i'm not to worried about sanity when i extract! :)

Drew 2009-10-29 14:14:42

Answer 4

+1 A:

"santizing input" is a nonsense. Input data cannot be "dirty" as such, the meaning of characters depends on the device you're sending them to and this is where you should "sanitize" the data (NB: not only user input, all data). For example, a newline symbol is completely harmless when outputting to browser, but can lead to injection if sent per email. Therefore, don't waste your time checking input for "invalid" characters, instead, think about output escaping (html, email, sql statements, shell commands etc).

// edit

because i obviously did a bad job explaining myself, here's an illustration (for the benefit of those who happen to google this topic out)

 Fig.1. Data flow in a typical web script.

 A--------------------------B------------C--------------D------E

 -------------------------         ------------------
 :  user input           :-------> :                :-------> http browser
 -------------------------         :                :-------> database
                                   : your script    :-------> mail server
 -------------------------         :                :-------> file
 :  other data sources   :-------> :                :-------> shell
 -------------------------         ------------------

we obtain data from user and/or external source (A), do something with the data (C) and send it to an extenal facility (E). Some people insist that "sanitizing" should happen at step B (receiving data), my point is that should happen on step D (sending data).

Why? Every facility has its own escaping rules, and on step B you normally don't know which one you're going to use. So which rules should you take?

If you arbitrary choose rules for X (say, database), you will run into big problems when trying to send data to Y instead (say, browser). The classical example of this wrong approach is notorious "magic_quotes". If you ever seen slashed quotes in your html forms, you probably understand what i'm talking about.

stereofrog 2009-10-29 10:39:50

you also need to sanitize your _input_. if you don't do that, malicious persons could execute any code on your database like `DROP TABLES`. <http://xkcd.com/327/>

knittl 2009-10-29 10:54:37

well, you're totally wrong. try to read my post once againhttp://xkcd.com/386 ;))

stereofrog 2009-10-29 11:15:50

actually, if you take sterofrog's comments about output escaping, meaning the SQL you `output`to the DB, then he could be right. . . but I still say it is a bass ackwards way to look @ it

andrewWinn 2009-10-29 14:58:08

@knittl - No, you need to escape your output - not sanitize input. The difference is subtle and crucial.

troelskn 2009-10-29 15:40:50

@andrewWinn: well, but you do not output sql to the db server, it’s just another input to the server. speaking of output is totally confusing in this case

knittl 2009-10-29 16:09:25

+1 accurate, doesn't deserve the downvoting

bobince 2009-11-27 20:25:42

Answer 5

+4 A:

Better use the new PHP function filter_var() for cleaning input. New and better.

Elzo Valugi 2009-10-29 13:04:12

thanks dude. filter var seems like the best option so far.

Drew 2009-10-29 14:16:29

Answer 6

+1 A:

If your server uses php 5.2 or better, you should use filter_var for the XML part.

$output = filter_var($input, FILTER_SANITIZE_STRING);

To store something into your database, use PDO and parameterized queries.

Arkh 2009-10-29 14:06:07

Answer 7

+1 A:

It's a misnomer to try and fix the problem at input time, since the problem happens at output time. See my answer over here:

What’s the best method for sanitizing user input with PHP?

troelskn 2009-10-29 15:42:41

ansaurus

tags:

views:

answers:

What else should I be doing to sanitize user input?

related questions