views:

571

answers:

5

I have one "go" script that fetches any other script requested and this is what I wrote to sanitize user input:

foreach ($_REQUEST as $key => $value){
 if (get_magic_quotes_gpc()) 
 $_REQUEST[$key] = mysql_real_escape_string(stripslashes($value));  
 else
 $_REQUEST[$key] = mysql_real_escape_string($value); 
}

I haven't seen anyone else use this approach. Is there any reason not to?

EDIT - amended for to work for arrays:

function mysql_escape($thing) {
  if (is_array($thing)) {
 $escaped = array();
 foreach ($thing as $key => $value) {
   $escaped[$key] = mysql_escape($value);
 }     
 return $escaped;
  }
  // else
  if (get_magic_quotes_gpc()) $thing = stripslashes($thing);
  return mysql_real_escape_string($thing);
}

foreach ($_REQUEST as $key => $value){
 $_REQUEST[$key] = mysql_escape($value); 
}
+5  A: 

I find it much better to escape the data at the time it is used, not on the way in. You might want to use that data in JSON, XML, Shell, MySQL, Curl, or HTML and each will have it's own way of escaping the data.


Lets have a quick review of WHY escaping is needed in different contexts:

If you are in a quote delimited string, you need to be able to escape the quotes. If you are in xml, then you need to separate "content" from "markup" If you are in SQL, you need to separate "commands" from "data" If you are on the command line, you need to separate "commands" from "data"

This is a really basic aspect of computing in general. Because the syntax that delimits data can occur IN THE DATA, there needs to be a way to differentiate the DATA from the SYNTAX, hence, escaping.

In web programming, the common escaping cases are: 1. Outputting text into HTML 2. Outputting data into HTML attributes 3. Outputting HTML into HTML 4. Inserting data into Javascript 5. Inserting data into SQL 6. Inserting data into a shell command

Each one has a different security implications if handled incorrectly. THIS IS REALLY IMPORTANT! Let's review this in the context of PHP:

  1. Text into HTML: htmlspecialchars(...)

  2. Data into HTML attributes htmlspecialchars(..., ENT_QUOTES)

  3. HTML into HTML Use a library such as HTMLPurifier to ENSURE that only valid tags are present.

  4. Data into Javascript I prefer json_encode. If you are placing it in an attribute, you still need to use #2, such as

  5. Inserting data into SQL Each driver has an escape() function of some sort. It is best. If you are running in a normal latin1 character set, addslashes(...) is suitable. Don't forget the quotes AROUND the addslashes() call:

    "INSERT INTO table1 SET field1 = '" . addslashes($data) . "'"

  6. Data on the command line escapeshellarg() and escapeshellcmd() -- read the manual

-- Take these to heart, and you will eliminate 95%* of common web security risks! (* a guess)

gahooa
Unfortunately, my character set is UTF-8 and not limited to latin characters. I've got it down so that everything is escaped properly, but I guess I'm just overly paranoid that I'll leave something open of SQL injection.
Greg
mysql_real_escape_characters is the better option for sanitizing user input that go into the database.
Residuum
@Residuum: I can't seem to find that in the manual.
Greg
this is good advice, store the RAW representation and then escape it in a way that's appropriate for the medium. Or, you can store both representations as we do on SO. But whatever you do, keep the RAW version around or you *will* be sorry -- trust me on that.
Jeff Atwood
+2  A: 

Your approach tries to sanitize all the request data for insertion into the database, but what if you just wanted to output it? You will have unnecessary backslashes in your output. Also, escaping is not a good strategy to protect from SQL exceptions anyway. By using parametrized queries (e.g. in PDO or in MySQLi) you "pass" the problem of escaping to the abstraction layer.

Ignas R
+3  A: 

If you have arrays in your $_REQUEST their values won't be sanitized.

yjerem
+1 for agreement. Recursive sanitizing is required.
thephpdeveloper
Thanks for pointing that out. I edited my question.
Greg
+1  A: 

Apart from the lack of recursion into arrays and the unnecessary escaping of, say, integers, this approach encodes data for use in an SQL statement before sanitization. mysql_real_escape_string() escapes data, it doesn't sanitize it -- escaping and sanitizing aren't the same thing.

Sanitization is the task many PHP scripts have of scrutinizing input data for acceptability before using it. I think this is better done on data that hasn't been escaped. I generally don't escape data until it goes into the SQL. Those who prefer to use Prepared Statements achieve the same that way.

One more thing: If input data can include utf8 strings, it seems these ought to be validated before escaping. I often use a recursive utf8 cleaner on $_POST before sanitization.

fsb
A: 

I've made and use this one:

<?php
function _clean($var){
    $pattern = array("/0x27/","/%0a/","/%0A/","/%0d/","/%0D/","/0x3a/",
                     "/union/i","/concat/i","/delete/i","/truncate/i","/alter/i","/information_schema/i",
                     "/unhex/i","/load_file/i","/outfile/i","/0xbf27/");
    $value = addslashes(preg_replace($pattern, "", $var));
    return $value;
}

if(isset($_GET)){
    foreach($_GET as $k => $v){
        $_GET[$k] = _clean($v);
    }
}

if(isset($_POST)){
    foreach($_POST as $k => $v){
        $_POST[$k] = _clean($v);
    }
}
?>
Mbarry