views:

215

answers:

3

Hi all, I am currently working on a building community website in PHP. This contains forms that a user can fill right from registration to lot of other functionality. I am not an Object-oriented guy, so I am using functions most of the time to handle my application. I know I have to learn OOPS, but currently need to develop this website and get it running soon.

Anyway, here's a sample of what I let my app. do: Consider a page (register.php) that has a form where a user has 3 fields to fill up, say: First Name, Last Name and Email. Upon submission of this form, I want to validate the form and show the corresponding errors to the users:

<form id="form1" name="form1" method="post" action="<?php echo $_SERVER['PHP_SELF']; ?>"> 

<label for="name">Name:</label>  
<input type="text" name="name" id="name" /><br />

<label for="lname">Last Name:</label>   
<input type="text" name="lname" id="lname" /><br />

<label for="email">Email:</label>   
<input type="text" name="email" id="email" /><br />


<input type="submit" name="submit" id="submit" value="Submit" />
</form>

This form will POST the info to the same page. So here's the code that will process the POST'ed info:

<?php

require("functions.php");

if( isset($_POST['submit']) )
    {
     $errors = fn_register();

     if( count($errors) )
     {
      //Show error messages
     }
     else
     {
      //Send welcome mail to the user or do database stuff...
     }

    }

?>




<?php

//functions.php page:

function sql_quote( $value )
{
     if( get_magic_quotes_gpc() )
    {

          $value = stripslashes( $value );
    }
    else
    { 
          $value = addslashes( $value );
    } 
    if( function_exists( "mysql_real_escape_string" ) )
    {
          $value = mysql_real_escape_string( $value );
    }

    return $value;

}


function clean($str) {
$str = strip_tags($str, '<br>,<br />');
$str = trim($str);
$str = sql_quote($str); 

return $str;

}


 foreach ($_POST as &$value)  
    {
               if (!is_array($value)) 
     { 
                       $value = clean($value); 

               }
               else 
     { 
                       clean($value);
               }
       }

 foreach ($_GET as &$value)  
    {
               if (!is_array($value)) 
     { 
                       $value = clean($value); 
               }
               else 
     { 
                       clean($value);
               }
       } 


function validate_name( $fld, $min, $max, $rule, $label ) {

    if( $rule == 'required' ) 
    {    
        if ( trim($fld) == '' ) 
        {
            $str = "$label: Cannot be left blank.";
            return $str;
        }        
    }


    if ( isset($fld) && trim($fld) != '' ) 
    {    
        if ( isset($fld) && $fld != '' && !preg_match("/^[a-zA-Z\ ]+$/", $fld)) 
    {
            $str = "$label: Invalid characters used! Only Lowercase, Uppercase alphabets and Spaces are allowed";
        }

    else if ( strlen($fld) < $min or strlen($fld) > $max )  
    {
            $curr_char = strlen($fld);
            $str = "$label: Must be atleast $min character &amp; less than $max char. Entered characters: $curr_char";
        }
        else    
        {
            $str = 0;
        }        
    }
    else
    {
        $str = 0;
    }

    return $str;   
}


function validate_email( $fld, $min, $max, $rule, $label ) {

    if( $rule == 'required' ) 
    {    
        if ( trim($fld) == '' ) 
        {
            $str = "$label: Cannot be left blank.";
            return $str;
        }        
    }


    if ( isset($fld) && trim($fld) != '' ) 
    {    
        if ( !eregi('^[a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.([a-zA-Z]{2,4})$', $fld) ) 
    {
            $str = "$label: Invalid format. Please check.";
        }
    else if ( strlen($fld) < $min or strlen($fld) > $max )  
    {
            $curr_char = strlen($fld);
            $str = "$label: Must be atleast $min character &amp; less than $max char. Entered characters: $curr_char";
        }
        else    
        {
            $str = 0;
        }        
    }
    else
    {
        $str = 0;
    }

    return $str;   
}

function val_rules( $str, $val_type, $rule='required' ){

    switch ($val_type) 
    {
     case 'name':
       $val = validate_name( $str, 3, 20, $rule, 'First Name');
     break;

     case 'lname':
       $val = validate_name( $str, 10, 20, $rule, 'Last Name');
     break;        

     case 'email':
       $val = validate_email( $str, 10, 60, $rule, 'Email');
     break;


    }

    return $val;
}


function fn_register() {

    $errors = array();

    $val_name        = val_rules( $_POST['name'], 'name' );
    $val_lname           = val_rules( $_POST['lname'], 'lname', 'optional' );
    $val_email       = val_rules( $_POST['email'], 'email' );

    if ( $val_name != '0' )      { $errors['name']   = $val_name;  }
    if ( $val_lname != '0' )     { $errors['lname']  = $val_lname; }
    if ( $val_email != '0' )     { $errors['email']  = $val_email; }

     return $errors;
}

//END of functions.php page
?>

OK, now it might look like there's a lot, but lemme break it down target wise: 1. I wanted the foreach ($_POST as &$value) and foreach ($_GET as &$value) loops to loop through the received info from the user submission and strip/remove all malicious input.

  1. I am calling a function called clean on the input first to achieve the objective as stated above. This function will process each of the input, whether individual field values or even arrays and allow only
    tags and remove everything else. The rest of it is obvious.

  2. Once this happens, the new/cleaned values will be processed by the fn_register() function and based on the values returned after the validation, we get the corresponding errors or NULL values (as applicable).

So here's my questions: 1. This pretty much makes me feel secure as I am forcing the user to correct malicious data and won't process the final data unless the errors are corrected. Am I correct?

  1. Does the method that I follow guarantee the speed (as I am using lots of functions and their corresponding calls)? The fields of a form differ and the minimum number of fields I may have at any given point of time in any form may be 3 and can go upto as high as 100 (or even more, I am not sure as the website is still being developed). Will having 100's of fields and their validation in the above way, reduce the speed of application (say upto half a million users are accessing the website at the same time?). What can I do to improve the speed and reduce function calls (if possible)?

3, Can I do something to improve the current ways of validation?

I am holding off object oriented approach and using FILTERS in PHP for the later. So please, I request you all to suggest me way to improve/tweak the current ways and suggest me if the script is vulnerable or safe enough to be used in a Live production environment. If not, what I can do to be able to use it live?

Thank you all in advance.

A: 

To answer your questions:

  1. On cursory inspection, it does look like your code will strip out malicious data. If you make sure that your code is broken into small functions that operate only on values passed into them as arguments, you can write yourself a command-line-driven test suite that lets you create lots and lots of test cases that you can re-run any time you make changes to the system so that you are confident that no vulnerabilities exist.
  2. In general, the slowest part of any web application is the database layer. Poorly crafted queries and/or too many queries will overwhelmingly overshadow PHP code execution in terms of performance issues. In other words, don't worry about over-optimizing your PHP code.
  3. Personally, I would improve your code by wrapping everything in a function(). I would also clean $_POST and $_GET values as needed rather than all at once. In other words, I would call clean() on $str in val_rules(). Also, I find it really confusing when using include() or require() to pull in a script and then realizing that that script executes and performs some function "behind the scenes". I think you'll find that using include() or require() works best when the files that are being pulled in are simply used to declare functions, constants, and classes. It is important to be able to clearly follow all execution "out in the open", that is, on the page that is actually doing some work. Similarly, I would not allow the fn_register() function to directly access the $_POST array without passing this in as an argument instead. It's important to be able to look at a function when reading some code and see immediately what data it is acting on without having to look at the function definition. In other words, all data that a function is acting on should ideally be passed in as a value. Finally, and this is a major issue, your name validation is too strict. It will not tolerate hyphens, apostrophes, accented characters, etc. I think you'll find that being liberal in name validation is the best policy. Some culture's naming conventions don't even have what we consider to be first and last names, for example. Another issue: instead of using 'required' and 'optional', use a boolean value for required. If it is set to TRUE, then the field is required.
jkndrkn
Hi, Thank you for your reply. Here's my response for your 3 points.1. Are there any tools to write the command-line-driven test suite? Can you please list them out?2. Got it.3. a)I got the part where you mentioned about applying the functions on $str in val_rules() and the strict name validation part. I failed to understand the include() and require() part. Can you please cite an example?
Devner
3 b) I allowed fn_register() function to directly access the $_POST array as $_POST is 'superglobal' and my target was to keep my register.php page separated from validation code and instead include it in functions.php page. I only return the results of the validation. Can you please suggest an alternative way if I should not allow the function to access $_POST variables directly and still keep the validation in a different page? I might even have 100 fields(or more), so passing them as argument will not be an option. Look forward to your reply. Thanks again.
Devner
You are welcome :] Please understand that some of my suggestions are biased and informed by a personal style influenced by functional programming and test-driven design. 1. I've used phpunit and it has served me well. It does, however, require that your code be organized into classes. http://www.phpunit.de/
jkndrkn
3. a) Your functions.php modifies the contents of $_POST and $_GET. A programmer that has never worked with your code will not be able to know that functions.php is making changes to $_POST and $_GET when this file is include()ed. The code could be made clearer if that functionality was wrapped in a function perhaps named clean_input() that accepts an array or an array of arrays as input and might be called as follows: clean_input(array($_POST, $_GET)).
jkndrkn
3. b) I would call fn_register() by explicitly passing in the $_POST variable: fn_register($_POST). This will allow you to integrate fn_register() into a test suite if you are so inclined and it will make it very clear to anyone reading the page of your form that it is extracting information from the $_POST superglobal.
jkndrkn
Devner
jkndrkn
My solution is just a collection of advice born of experience. If you find it useful, then go ahead and accept it. If not, then don't. Good luck :]
jkndrkn
Awesome! Your responses have been really helpful. Glad to know that some good people with real knowledge are around.Is there any specific way that I should follow in case I come across any particular question about the same topic or should I just just ask the question in a totally different post?Appreciate all your help. Thank you.
Devner
You are welcome :] I would recommend opening up a new question as it will likely get more visibility than simply editing an answered question.
jkndrkn
Thanks very much for all your valuable advices. Appreciate that.Also I happened to post another question in another post, as per your advice. It involves form-submission and jQuery(Ajax stuff). So if that's something that interests you, please feel free to reply to that question. The URL to the question is:http://stackoverflow.com/questions/1775625/jquery-multiple-form-submission-trigger-unrelated-multiple-error-messagesThanks again.
Devner
Sorry, but I had to down-vote. The very first line of code is vulnerable to XSS attacks, and you did not mention a word about XSS. Nothing personal.
Kai Sellgren
A: 

few days ago I wrote an article about validation functionality.

article is here http://vitana-group.com/article/php/validation

may be it helps somebody yet

Universal
@Universal Thanks for your article.
Devner
+1  A: 

Your script is vulnerable to cross-site scripting attacks in the first line. The value of $_SERVER['PHP_SELF'] can not be trusted. You must filter it. Here's a simple fix:

<form id="form1" name="form1" method="post" action="<?php echo htmlspecialchars($_SERVER['PHP_SELF'], ENT_QUOTES, 'utf-8'); ?>"> 

replace UTF-8 with other encoding if needed.

You can try input like

http://site.com/page.php?a=%22onload%3D%22javascript%3Aalert('XSS')%22

against your site to see if it pop ups an alert.

Update: it seems that at least in PHP 5.3, your code snippet is not vulnerable to XSS. However, I can not guarantee (neither can you) what PHP_SELF will produce in older versions of PHP, or in the future. The value of PHP_SELF must be encoded.

Kai Sellgren
@Kai, Thank you very much for indicating the same and for the fix. I was thinking over your fix as well and I am wondering how it would help the situation. Could you please tell me how? If I hard code the page name to which the form should post to, will that be a better option? How can I try to exploit my existing form for a better understanding? I am asking this so that I can try that on my own forms and then use your solution to see the difference. Your help is appreciated. Look forward for your reply. Thank you.
Devner
Hard coding would be a better choice, of course, but the point is that you can't allow a user to place arbitrary code on your site. The PHP function will encode HTML entities so that no one can break out of your action="" attribute, and execute JavaScript, for instance. See my edited answer.
Kai Sellgren
@Kai, thanks for the updated code. I tried it against few pages in my website and it did NOT pop up an alert. I entered something like: http://localhost/reviews/index.php?a=%22onload%3D%22javascript:alert('XSS')%22 and it loaded the page normally. Does that mean that I am safe?
Devner
After some research, I found out that it was REQUEST_URI that would have made you vulnerable, and not PHP_SELF. However, you still need to encode it, because you can't guarantee what comes from the PHP_SELF.
Kai Sellgren