How do you sanitize data in $_GET variables in PHP?

I sanitize only one variable in $_GET, using strip_tags. I am not sure whether I should sanitize everything, because the last time I put data into Postgres, the problem was most easily solved by using pg_prepare.

+3  A: 

If you're talking about sanitizing output, I would recommend storing content in your database in its full, unescaped form, and then escaping it (with htmlspecialchars or similar) when you echo the data out. That way you have more options for outputting. See this question for a discussion of sanitising/escaping database content.

In terms of storing in postgres, use pg_escape_string on each variable in the query, to escape quotes, and generally protect against SQL injection.

Edit:

My usual steps for storing data in a database, and then retrieving it, are:

  1. Call the database data escaping function (pg_escape_string, mysql_escape_string, etc), to escape each incoming $_GET variable used in your query. Note that using these functions instead of addslashes results in not having extra slashes in the text when stored in the database.

  2. When you get the data back out of the database, you can just use htmlspecialchars on any outputted data, no need to use stripslashes, since there should be no extra slashes.

Kazar
If you give the escaped data to the user, he then sees the slashes. **Which function do you use to make the data readable again for the user?**
Masi
You can use stripslashes (http://uk2.php.net/stripslashes) to remove slashes from the string.
Kazar
**Is there any system-wide option which would sanitize the data in the variables `$_GET` and `$_POST` without my needing to add `htmlspecialchars` and `stripslashes` calls to every statement?**
Masi
No, I'm afraid not. Why do you need to do stripslashes on incoming data? Do you have magic quotes turned on?
Kazar
@Kazar: I do not have magic quotes. **Which function do you use to make the data readable by users?** I want to know that before sanitizing data with the `htmlspecialchars` function.
Masi
@Masi: I've added something to the answer, since the comment box is not very flexible. I hope I understand what you mean by 'readable' - if you refer more to the concept of converting machine-readable values (boolean values, numbers, etc), to human-readable ones, then that is a whole other matter.
Kazar
**How do you sanitize arrays?**
Masi
As in, store an array in a database? Serialize the array (http://us2.php.net/manual/en/function.serialize.php), and then call pg_escape_string on it. That will give you a string that represents the array, which you can safely store in a database. When retrieving the value, use the unserialize function on the string to turn it back into an array.
Kazar
If you mean, how do you sanitize each element of an array, use array_map combined with pg_escape_string.
Kazar
**Why do you need to serialize an array?** My tags for questions seem to be correct when I apply only `pg_escape_string` to them. I did not apply `htmlentities` to them because that gives me an unexpected error saying that an array was passed where a string was expected.
Masi
You don't need to serialize an array in your case: that would be a way to put the array inside the database, which is not what you're trying to do. Please forget the array; you can't do anything useful on the array or all the strings in it at once. Instead, every single string variable needs to be escaped, individually, at the point you output it inside SQL or HTML. That applies whether you got those strings from the GET array, output from the database, or somewhere else entirely.
bobince
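To illustrate the last comment above, here is a minimal sketch of escaping each element of an array individually at the point of HTML output. The `$tags` values are made up for the example, and the list markup is only an assumption about how the tags might be displayed.

```php
<?php
// Hypothetical list of tags, kept in plain-text form until output.
$tags = ['php', 'c++ & <templates>', "o'reilly"];

// htmlspecialchars() works on one string at a time, so escape each
// element separately, right where it is written into HTML.
$items = array_map(
    function ($tag) {
        return '<li>' . htmlspecialchars($tag, ENT_QUOTES, 'UTF-8') . '</li>';
    },
    $tags
);

echo "<ul>\n", implode("\n", $items), "\n</ul>\n";
```

Passing the whole array to a string function is what triggers the "array passed to a string" kind of error mentioned in the comments; mapping over the elements avoids it.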
A: 

You must sanitize all requests, not only POST but also GET.

You can use the function htmlentities(), the function preg_replace() with regex, or filter by cast:

<?php
// Casting to int keeps only a leading integer value (non-numeric input becomes 0).
$id = (int)$_GET['id'];
?>
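A cast silently accepts input such as `42abc`. If rejecting malformed values matters, one possible alternative is PHP's `filter_var` with `FILTER_VALIDATE_INT`. The sketch below assumes ids are positive integers; `read_id` is a hypothetical helper name, not something from the answer.

```php
<?php
// Hypothetical helper: returns the id as an int, or null when the
// input is missing or is not a valid positive integer.
function read_id(array $query): ?int
{
    $id = filter_var(
        $query['id'] ?? null,
        FILTER_VALIDATE_INT,
        ['options' => ['min_range' => 1]]
    );
    return $id === false ? null : $id;
}

var_dump(read_id(['id' => '42']));    // int(42)
var_dump(read_id(['id' => '42abc'])); // NULL
var_dump((int) '42abc');              // int(42): the cast keeps the leading digits
```

In a real script the argument would be `$_GET`; an array parameter is used here so the sketch runs on its own.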

[]'s

fvox
I agree with you that all requests must be escaped. However, if you use `pg_prepare`, then you do not need the function `htmlentities` in my opinion, since `pg_prepare` sanitizes the data.
Masi
Don't use a regex to sanitise output: you are bound to miss something and accidentally expose an XSS vulnerability. There are good libraries for doing custom output sanitisation (http://htmlpurifier.org/).
Kazar
Masi: that's wrong and dangerous. pg_prepare deals with SQL escaping. It has nothing whatsoever to do with HTML escaping, and will not protect you in any way from XSS attacks caused by a stray ‘<’ or ‘"’ character.
bobince
+1  A: 

Sanitize your inputs according to where they are going.

  • If you display it (on a page or as an input field's value), use htmlspecialchars and/or str_replace.
  • If you use it as another type, cast it.
  • If you include it in an SQL query, escape it using the appropriate function, and strip HTML tags only if you want them removed entirely (which is not the same as escaping them).

Same for POST or even data from your DB, since the data inside your DB should generally not be escaped.
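As a rough sketch of "according to where it is going", the same raw values can be treated differently per destination. The `$query` array below stands in for `$_GET`, and the page size of 20 is an arbitrary assumption.

```php
<?php
// Hypothetical raw request values, kept in plain-text form until used.
$query = ['name' => 'Ann "the <b>boss</b>"', 'page' => '3'];

// Destination: an HTML attribute value, so HTML-escape it
// (ENT_QUOTES covers both single and double quotes).
$field = '<input type="text" name="name" value="'
    . htmlspecialchars($query['name'], ENT_QUOTES, 'UTF-8') . '">';

// Destination: integer arithmetic, so cast it.
$offset = ((int) $query['page'] - 1) * 20;

echo $field, "\n";
echo $offset, "\n";
```

The original `$query['name']` stays untouched, so it can still be passed unmodified to a parameterised SQL query or any other destination with its own escaping rules.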

Two things you should check:

  1. Encoding of your input vs. your PHP scripts / output / DB table
  2. If you have magic_quotes_gpc enabled, you should either disable it (whenever you can) or call stripslashes() on GET, POST and COOKIE values. magic_quotes_gpc is deprecated; instead, sanitize the data you manipulate according to how that data is used.
streetpc
+5  A: 

How do you sanitize data in $_GET variables in PHP?

You do not sanitize data in $_GET. This is a common approach in PHP scripts, but it's completely wrong*.

All your variables should stay in plain text form until the point when you embed them in another type of string. There is no one form of escaping or ‘sanitization’ that can cover all possible types of string you might be embedding your values into.

So if you're embedding a string into an SQL query, you need to escape it on the way out:

$sql= "SELECT * FROM accounts WHERE username='".pg_escape_string($_GET['username'])."'";

And if you're spitting the string out into HTML, you need to escape it then:

Cannot log in as <?php echo(htmlspecialchars($_GET['username'], ENT_QUOTES)) ?>.

If you did both of these escaping steps on the $_GET array at the start, as recommended by people who don't know what they're doing:

$_GET['username']= htmlspecialchars(pg_escape_string($_GET['username']));

Then when you had a ‘&’ in your username, it would mysteriously turn into ‘&amp;’ in your database, and if you had an apostrophe in your username, it would turn into two apostrophes on the page. Then, when you have a form with these characters in it, it is easy to end up double-escaping things when they're edited, which is why so many bad PHP CMSs end up with broken article titles like “New books from O\\\\\\\\\\\\\\\\\\\'Reilly”.
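The pile-up is easy to reproduce. In the sketch below, `addslashes` stands in for whatever escaping a CMS re-applies on each save; it is chosen only because it runs without a database connection, and the sample strings are made up.

```php
<?php
// Simulate a title being escaped again on every edit/save cycle.
$title = "O'Reilly";
$once  = addslashes($title);
$twice = addslashes($once);

echo $once, "\n";   // O\'Reilly
echo $twice, "\n";  // O\\\'Reilly

// The same thing happens with HTML escaping applied twice.
$name = 'AT&T';
$encodedOnce  = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
$encodedTwice = htmlspecialchars($encodedOnce, ENT_QUOTES, 'UTF-8');

echo $encodedOnce, "\n";   // AT&amp;T
echo $encodedTwice, "\n";  // AT&amp;amp;T
```

Each extra round escapes the escape characters added by the previous round, which is why the damage grows with every edit.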

Naturally, remembering to pg_escape_string or mysql_real_escape_string, and htmlspecialchars every time you send a variable out is a bit tedious, which is why everyone wants to do it (incorrectly) in one place at the start of the script. For HTML output, you can at least save some typing by defining a function with a short name that does echo(htmlspecialchars(...)).
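Such a helper might look like the following sketch. The name `h` and the choice of UTF-8 are assumptions made for the example.

```php
<?php
// Hypothetical short-named helper: HTML-escape a value and echo it.
function h($text)
{
    echo htmlspecialchars((string) $text, ENT_QUOTES, 'UTF-8');
}

h("<b>O'Reilly</b> & friends");
echo "\n";
```

In a template the earlier line would then shrink to `Cannot log in as <?php h($_GET['username']) ?>.`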

For SQL, you're better off using parameterised queries. For Postgres there's pg_query_params. Or indeed, prepared statements as you mentioned (though I personally find them less manageable). Either way, you can then forget about ‘sanitizing’ or escaping for SQL, but you must still escape if you embed in other types of string including HTML.
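A minimal sketch of the earlier accounts query using pg_query_params is shown below. It assumes `$conn` is an open connection obtained from pg_connect(); `find_account` is a hypothetical helper name, and the error handling is deliberately simplistic.

```php
<?php
// Sketch only: $conn is assumed to be an open connection from pg_connect().
// Returns the first matching row as an associative array, or null.
function find_account($conn, string $username)
{
    // $1 is a placeholder; the value travels separately from the SQL
    // text, so no manual SQL escaping is applied to $username.
    $result = pg_query_params(
        $conn,
        'SELECT * FROM accounts WHERE username = $1',
        [$username]
    );
    if ($result === false) {
        return null; // query failed
    }
    $row = pg_fetch_assoc($result);
    return $row === false ? null : $row;
}
```

The SQL string is single-quoted so that PHP does not try to interpolate `$1`. Any value taken from the returned row still needs htmlspecialchars when it is written into HTML.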

strip_tags() is not a good way of treating input for HTML display. In the past it has had security problems, as browser parsers are actually much more complicated in their interpretation of what a tag can be than you might think. htmlspecialchars() is almost always the right thing to use instead, so that if someone types a less-than sign they'll actually get a literal less-than sign and not find half their text mysteriously vanishing.
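The "vanishing text" effect can be seen with a made-up sentence that merely contains comparison signs. This is a sketch of the behaviour, not a complete account of how strip_tags parses input.

```php
<?php
// Ordinary text that happens to contain '<' and '>' characters.
$input = 'I think 1 <2 and 3> 2';

// strip_tags() treats "<2 and 3>" as a tag and drops it.
$stripped = strip_tags($input);

// htmlspecialchars() keeps every character, encoded for safe display.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $stripped, "\n";
echo $escaped, "\n"; // I think 1 &lt;2 and 3&gt; 2
```

A browser renders the escaped version as exactly what the user typed, whereas the stripped version has lost part of the sentence.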

(*: as a general approach to solving injection problems, anyway. Naturally there are domain-specific checks it is worth doing on particular fields, and there are useful cleanup tasks you can do like removing all control characters from submitted values. But this is not what most PHP coders mean by sanitization.)

bobince
Could you please explain more what you mean by this: *you must still escape if you embed in other types of string including HTML*. **Do you mean that I do not need to escape usernames and emails which I put into my db when I use prepared statements?** I only show the username, taken from the db, to the user inside HTML. Your answer suggests to me that I need to apply `pg_escape_string` to usernames.
Masi
No: for outputting to an SQL string you use SQL escaping (or let parameterised/prepared statements do it for you automatically — but not both). For outputting to HTML you use HTML escaping (with htmlspecialchars). Never put HTML-escaped text in an SQL string, or SQL-escaped text on an HTML page.
bobince