Hi guys, I'm wondering how to set up a clever way to keep all my input 'clean': a procedure to run at the beginning of every script. I thought of creating a class to do that, and then adding a two-letter prefix at the beginning of every input name to identify the kind of input, for example:
in-mynumber
tx-name
ph-phone
em-email
So, at the top of my scripts I just run a function (for example):
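function cleanInputs(){
    $cGet = array();
    // Whitelist for 'tx' inputs: only a placeholder, adjust it to the symbols you allow
    $regExp = '/^[a-zA-Z0-9 .,\'-]+$/';
    foreach($_GET as $taintedKey => $taintedValue){
        // $_GET values can be arrays (e.g. ?tx-name[]=a); treat anything that is not a string as invalid
        if(!is_string($taintedValue)){
            $cGet[$taintedKey] = false;
            continue;
        }
        $prefix = substr($taintedKey, 0, 2);
        switch($prefix){
            case 'in':
                // I assume this input is an integer
                $cGet[$taintedKey] = intval($taintedValue);
                break;
            case 'tx':
                // I assume this input is normal text:
                // it can contain only letters, numbers and a few symbols
                if(preg_match($regExp, $taintedValue)){
                    $cGet[$taintedKey] = $taintedValue;
                }else{
                    $cGet[$taintedKey] = false;
                }
                break;
            case 'em':
                // I assume this input is a valid email (note the escaped dot before the TLD)
                if(preg_match('/^[a-zA-Z0-9_.-]+@[a-zA-Z0-9_.-]+\.[a-zA-Z]{2,4}$/', $taintedValue)){
                    $cGet[$taintedKey] = $taintedValue;
                }else{
                    $cGet[$taintedKey] = false;
                }
                break;
            // inputs with an unknown prefix are left out of $cGet
        }
    }
    return $cGet;
}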
...so I'll create two other arrays, $cGet and $cPost, holding the clean data from $_GET and $_POST respectively, and in my scripts I'll use only those arrays and completely forget about $_GET/$_POST. I'm even thinking about adding a second prefix to determine the input's maximum length, for example: tx-25-name, but I'm not really sure about that. And if I go this way, maybe an OOP approach would be better.
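To make the length-prefix idea concrete, here is a minimal sketch of what the OOP version could look like. The class name InputCleaner, its get() method and the 'tx' whitelist are hypothetical, and it swaps intval and the hand-written email regex for PHP's built-in filter_var, so take it as one possible shape rather than a finished design:

```php
<?php
// Hypothetical sketch: class name, method names and the 'tx' pattern are assumptions.
class InputCleaner
{
    // Cleaned values, keyed by the original input name
    private $clean = array();

    public function __construct(array $source)
    {
        foreach ($source as $key => $value) {
            // Expected key shapes: "tx-name" or "tx-25-name"
            // (type prefix, optional maximum length, field name)
            if (!is_string($value)
                || !preg_match('/^([a-z]{2})-(?:(\d+)-)?(.+)$/', (string) $key, $m)) {
                continue; // not a string or unknown key shape: leave it out
            }
            $maxLength = ($m[2] !== '') ? (int) $m[2] : null;
            $this->clean[$key] = $this->validate($m[1], $value, $maxLength);
        }
    }

    private function validate($type, $value, $maxLength)
    {
        if ($maxLength !== null && strlen($value) > $maxLength) {
            return false; // too long (strlen counts bytes, not characters)
        }
        switch ($type) {
            case 'in':
                return filter_var($value, FILTER_VALIDATE_INT);   // int or false
            case 'tx':
                return preg_match('/^[a-zA-Z0-9 .,\'-]+$/', $value) ? $value : false;
            case 'em':
                return filter_var($value, FILTER_VALIDATE_EMAIL); // string or false
            default:
                return false; // unknown type prefix
        }
    }

    // Returns the cleaned value, or false if the input is missing or invalid
    public function get($key)
    {
        return array_key_exists($key, $this->clean) ? $this->clean[$key] : false;
    }
}

// In a real script the source would be $_GET or $_POST; a literal array keeps the sketch runnable.
$cGet = new InputCleaner(array(
    'in-age'    => '42',
    'tx-5-name' => 'Robert',          // 6 characters, over the limit of 5
    'em-email'  => 'bob@example.com',
));

var_dump($cGet->get('in-age'));    // int(42)
var_dump($cGet->get('tx-5-name')); // bool(false)
var_dump($cGet->get('em-email'));  // string(15) "bob@example.com"
```

One thing this makes visible: the maximum length comes from the field name, which the client controls, so it would still have to be checked against limits defined on the server.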
What do you think about that? Does it seem like a good approach?
The downsides I can see so far (I haven't actually used this yet, it's just an idea from this morning):
1. The prefixes, and therefore the procedures, have to be numerous if I don't want my application to be too restrictive;
2. The names of my submitted variables will become a little longer (but we are talking about 3-6 characters, which shouldn't be a problem).
Any suggestion is really appreciated!
EDIT:
I'm not trying to reinvent the wheel; my post wasn't about the system for sanitizing input, but about the procedure for doing it. I use htmlpurifier to clean possible XSS injection in HTML data, and of course I use parameterized queries. I'm just wondering whether it is better to handle inputs one by one, or to sanitize them all at the beginning and consider them clean in the rest of the script. The method I thought of is not miraculous and nothing new under the sun, but I think that truncating the input if it is not in the format I expect can be useful...
Why check for SQL injection in the 'name' field, which must contain just letters and the apostrophe character? Just remove everything that is not a letter or an apostrophe, add slashes for the latter, and pass it into a parameterized query. Then, if you expect an email, just delete everything that is not an email...
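As a rough sketch of the "remove everything that doesn't belong" idea: the helper names stripName and cleanEmail are made up for illustration, and the whitelist is ASCII-only, so accented letters would be dropped too. For the email, deleting characters cannot guarantee a valid address, so the sketch sanitizes with filter_var first and then validates the result:

```php
<?php
// Hypothetical helper names; the whitelists are assumptions, not a standard.

// Keep only ASCII letters and the apostrophe, as described for the 'name' field.
function stripName($tainted)
{
    return preg_replace("/[^a-zA-Z']/", '', (string) $tainted);
}

// Remove characters that are not legal in an address, then validate what is left.
function cleanEmail($tainted)
{
    $candidate = filter_var((string) $tainted, FILTER_SANITIZE_EMAIL);
    return filter_var($candidate, FILTER_VALIDATE_EMAIL); // string on success, false otherwise
}

var_dump(stripName("O'Brien 123!"));         // string(7) "O'Brien"
var_dump(cleanEmail(" bob@example.com\n"));  // string(15) "bob@example.com"
var_dump(cleanEmail('not an email'));        // bool(false)
```

The sketch leaves out the slashes on purpose: with a parameterized query the value travels separately from the SQL text, so a backslash added in front of the apostrophe would be stored literally in the name.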