I currently have a fairly robust server-side validation system in place, but I'm looking for some feedback to make sure I've covered all angles. Here is a brief outline of what I'm doing at the moment:

  • Ensuring the input is neither empty nor too long

  • Escaping query strings to prevent SQL injection

  • Using regular expressions to reject invalid characters (this depends on what's being submitted)

  • Encoding certain HTML tags, like <script> (all tags are encoded when stored in a database, with some being decoded when queried to render in the page)

Is there anything I'm missing? Code samples or regular expressions welcome.

+2  A: 

You should encode every HTML tag, not only the 'invalid' ones. This is hotly debated, but it boils down to this: there will always be some invalid HTML combination that you forget to handle correctly (nested tags, mismatched tags that some browsers interpret 'correctly', and so on). So the safest option, in my opinion, is to store everything as htmlentities and then, on output, print a validated HTML-safe-subset tree (as entities) from the content.
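
A rough sketch of the "encode everything, re-enable a safe subset" idea, in Python since no platform was specified. The `ALLOWED` set and the `render_safe_subset` name are made up for illustration, and this only restores attribute-less tags; it does not check nesting or balance, so a real implementation would want a proper HTML parser or sanitizer.

```python
import html
import re

# Hypothetical allowlist: only these bare tags are ever re-enabled.
ALLOWED = {"b", "i", "em", "strong"}

# Matches an escaped tag with no attributes, e.g. "&lt;b&gt;" or "&lt;/b&gt;".
_ESCAPED_TAG = re.compile(r"&lt;(/?)([a-zA-Z]+)&gt;")

def render_safe_subset(text):
    # Step 1: escape everything, so nothing is markup by default.
    escaped = html.escape(text)

    # Step 2: turn back only the allowlisted, attribute-less tags.
    def restore(match):
        slash, name = match.group(1), match.group(2).lower()
        if name in ALLOWED:
            return "<" + slash + name + ">"
        return match.group(0)  # leave everything else escaped

    return _ESCAPED_TAG.sub(restore, escaped)
```

Because tags with attributes never match the pattern, something like `<b onclick=...>` stays escaped.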

Vinko Vrsalovic
All tags are encoded when stored in a database, but I want some to render in the page so these are decoded when queried, with the exception of some like the script tag. I've amended the question to reflect this.
conmulligan
+1  A: 

Run all server-side validation in a library dedicated to the task so that improvements in one area affect all of your application.

Additionally, include defenses against known attacks, such as directory traversal and attempts to access the shell.
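
As one example of a directory traversal check, here is a minimal Python sketch. The `safe_join` helper and the `/srv/uploads` base directory are hypothetical; the idea is just to resolve the requested path and refuse anything that ends up outside the base.

```python
from pathlib import Path

def safe_join(base_dir, user_path):
    """Join a user-supplied relative path onto base_dir, refusing escapes."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments (and symlinks) before we compare.
    candidate = (base / user_path).resolve()
    try:
        # relative_to raises ValueError if candidate is not under base.
        candidate.relative_to(base)
    except ValueError:
        raise ValueError("path escapes the base directory")
    return candidate
```

A request for `a/b.txt` resolves under the base directory, while `../../etc/passwd` or an absolute path such as `/etc/passwd` raises `ValueError`.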

+8  A: 

You shouldn't need to "Escape" query strings to prevent SQL injection - you should be using prepared statements instead.
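
A small illustration of the difference, using Python's built-in `sqlite3` module (the `users` table and the `find_user` function are invented for the example). The value is passed separately from the SQL text, so it is never parsed as SQL.

```python
import sqlite3

def find_user(conn, username):
    # The "?" placeholder binds username as data, not as part of the query.
    cur = conn.execute(
        "SELECT id, name FROM users WHERE name = ?",
        (username,),
    )
    return cur.fetchall()

# Throwaway in-memory database for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
```

With this, an input like `alice' OR '1'='1` is simply treated as an odd username that matches no rows.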

Ideally your input filtering will happen before any other processing, so you know it will always be applied. Otherwise, missing a single spot is enough to leave you vulnerable.

Don't forget to encode HTML entities on output to prevent XSS attacks.
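
For instance, in Python the standard library's `html.escape` does this; the `render_comment` wrapper is just a hypothetical stand-in for whatever your templating layer does at output time.

```python
import html

def render_comment(text):
    # Escape at output time: &, <, > and both quote characters become entities.
    return "<p>" + html.escape(text, quote=True) + "</p>"
```

Most template engines can do this automatically, which is safer than remembering to call it by hand everywhere.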

Steve Kemp
+1  A: 

This question has some good answers of the kind you're looking for (PHP-oriented, but then again you didn't specify a language or platform, and some of it applies beyond the PHP world):

http://stackoverflow.com/questions/129677/whats-the-best-method-for-sanitizing-user-input-with-php

micahwittman
+1  A: 

You might check out the Filter Extension for data filtering. It won't guarantee that you're completely airtight, but personally I feel a lot better using it because that code has a whole lot of eyeballs looking over it.

Also, consider prepared statements seconded. Escaping data in your SQL queries is a thing of the past.

Bob Somers