There is no "one and only" way of filtering input like you describe, since no input is inherently invalid or even necessarily malicious. It's entirely what you do with the input that matters.
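For example, suppose you have some text in $_GET['field'] and you are about to compose a SQL query. You need to escape the value using mysql_real_escape_string() (for MySQL, of course; on PHP 7 and later, where the old mysql extension no longer exists, the equivalent is mysqli_real_escape_string()) like so: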
$sql = "INSERT INTO some_table (some_field) VALUES ('" . mysql_real_escape_string($_GET['field']) . "')";
This escaping is absolutely crucial for any input you use in a SQL query. Once it's applied as you see here, with the escaped value inside a quoted string literal, even malicious input from a hacker cannot break out of the string and change the query.
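The same "handle the value for the SQL context" idea can also be expressed with a prepared statement, which is a different technique from the escaping function above: the value travels separately from the SQL text instead of being escaped into it. The following is only a minimal sketch under assumptions. It uses PDO with an in-memory SQLite database as a stand-in for MySQL (so it needs the pdo_sqlite extension), and $field is a hypothetical hostile value standing in for $_GET['field'].

```php
<?php
// Sketch only: an in-memory SQLite database stands in for the real MySQL server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE some_table (some_field TEXT)');

// Hypothetical hostile value standing in for $_GET['field'].
$field = "x'); DROP TABLE some_table; --";

// The ? placeholder keeps the value out of the SQL text entirely,
// so no escaping function is needed for this context.
$stmt = $pdo->prepare('INSERT INTO some_table (some_field) VALUES (?)');
$stmt->execute([$field]);

// Read the row back: the value is stored verbatim and the table still exists.
$stored = $pdo->query('SELECT some_field FROM some_table')->fetchColumn();
var_dump($stored === $field);
```

Against a real MySQL server only the connection string would change; the prepare/execute calls stay the same.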
However, this function is both useless and outright wrong to use if you're including $_GET['field'] in some HTML output from your page. In that case, the function htmlspecialchars() is the right tool. You might do something like:
echo "<p>Your comments were: " . htmlspecialchars($_GET['field']) . "</p>";
Both these examples are quite safe from "hacker-like inputs." You will not be inserting malicious data into your database or into your HTML. Notice, though, that the two forms of escaping are completely different functions, each suited to its own context.
By contrast, imagine if you tried to "validate" input for these two uses at the same time. You certainly couldn't allow < or > characters, since those could be part of a malicious HTML attack like Cross-Site Scripting. So, visitors who want to write "I think 1 < 3" would be stymied. Likewise, you couldn't allow quote marks for fear of malicious SQL injection attacks, so poor "Miles O'Brien" could never fill out your form!
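To make the contrast concrete, here is a small sketch of what happens when those two inputs are simply escaped at output time instead of being rejected. The html_escape() helper and the $inputs array are hypothetical names introduced for illustration; the only real API used is htmlspecialchars(), with ENT_QUOTES passed explicitly so that single quotes are encoded regardless of the PHP version's default flags.

```php
<?php
// Hypothetical inputs that a blanket "no < > or quotes" filter would reject.
$inputs = ["I think 1 < 3", "Miles O'Brien"];

// Hypothetical helper: escape for the HTML context only, at the moment of output.
// ENT_QUOTES encodes both double and single quotes.
function html_escape(string $raw): string
{
    return htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
}

foreach ($inputs as $raw) {
    // The browser renders the original text; the markup cannot be altered by it.
    echo "<p>Your comments were: " . html_escape($raw) . "</p>\n";
}
```

Both visitors get through: the less-than sign is emitted as &lt; and the apostrophe as &#039;, which the browser displays as the characters they typed.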
Proper input escaping is very easy to do once you apply it at the point of use, separately for each context (it's often even easier than validating input!), and the results are so much better.