"Sanitizing input" is nonsense. Input data cannot be "dirty" as such: the meaning of a character depends on the device you're sending it to, and that is where you should "sanitize" the data (NB: not only user input, all data). For example, a newline is completely harmless when output to a browser, but can lead to injection if it ends up in an email header. Therefore, don't waste your time checking input for "invalid" characters; instead, think about output escaping (HTML, email, SQL statements, shell commands, etc.).
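To make that concrete, here is a minimal Python sketch of the same raw values being prepared separately for each destination. The sample strings, the `authors` table and the `mail_header_value` helper are made up for illustration; the library calls (`html.escape`, `shlex.quote`, `sqlite3` parameter binding) are standard library.

```python
import html
import shlex
import sqlite3

# Raw values, kept exactly as received (sample data, purely illustrative).
name = "O'Reilly <b>& sons</b>"
subject = "Hello\r\nBcc: victim@example.com"


def mail_header_value(value: str) -> str:
    """Hypothetical helper: refuse CR/LF, which would start a new mail header."""
    if "\r" in value or "\n" in value:
        raise ValueError("line breaks are not allowed in a mail header value")
    return value


# Browser: escape markup characters at the moment of rendering.
as_html = html.escape(name)

# The newline in `subject` is harmless here and passes through untouched.
subject_as_html = html.escape(subject)

# Shell: quote the value so it is passed as exactly one argument.
as_shell_arg = shlex.quote(name)

# Database: let the driver bind the value instead of splicing it into the SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (name TEXT)")
conn.execute("INSERT INTO authors (name) VALUES (?)", (name,))
stored = conn.execute("SELECT name FROM authors").fetchone()[0]

# Mail: the same newline that the browser ignored is rejected here.
try:
    mail_header_value(subject)
    header_accepted = True
except ValueError:
    header_accepted = False
```

Note that `name` and `subject` are never modified; each destination gets its own escaped copy, and the value that is fine for one destination is refused by another.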
// edit
Because I obviously did a bad job of explaining myself, here's an illustration (for the benefit of those who find this topic through Google).
Fig.1. Data flow in a typical web script.
A--------------------------B------------C--------------D------E
-------------------------         ------------------
: user input            :-------> :                :-------> http browser
-------------------------         :                :-------> database
                                  : your script    :-------> mail server
-------------------------         :                :-------> file
: other data sources    :-------> :                :-------> shell
-------------------------         ------------------
We obtain data from the user and/or an external source (A), do something with the data (C), and send it to an external facility (E).
Some people insist that "sanitizing" should happen at step B (receiving data); my point is that it should happen at step D (sending data).
Why? Every facility has its own escaping rules, and at step B you normally don't know which one you're going to use. So which rules should you pick?
If you arbitrarily choose the rules for X (say, the database), you will run into big problems when you send the data to Y instead (say, the browser).
The classic example of this wrong approach is PHP's notorious "magic_quotes". If you have ever seen backslash-escaped quotes in your HTML forms, you probably understand what I'm talking about.
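The slashed-quotes problem can be reproduced in a few lines. This is only a sketch in Python, not PHP: `add_slashes` is a hypothetical stand-in for input-time quote escaping in the spirit of magic_quotes, and `render_input` and the field name `author` are invented for the example.

```python
import html


def add_slashes(value: str) -> str:
    """Hypothetical step-B escaping: backslash-escape quotes on the way in."""
    return value.replace("\\", "\\\\").replace("'", "\\'").replace('"', '\\"')


def render_input(value: str) -> str:
    """Step-D escaping for the browser: HTML-escape while building the form."""
    return f'<input name="author" value="{html.escape(value)}">'


submitted = "O'Reilly"

# Escaped at step B with database-style rules, then sent to the browser:
# the backslash is now part of the data and shows up in the form field.
early = add_slashes(submitted)
broken_form = render_input(early)

# Raw value kept through steps B and C, escaped only at step D for HTML.
good_form = render_input(submitted)
```

In `broken_form` the user sees a stray backslash in front of the quote, because rules meant for one destination were baked into the data before it reached another. In `good_form` the value round-trips unchanged.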