I am trying to put together a checklist of things I need to keep in mind when creating forms. I know I need to filter input content. I am already filtering for errant HTML and scripts, escaping for MySQL, restricting input to data types (phone numbers are 10+ digits with trailing extension digits, email has to be a valid email address, strings cannot contain HTML or code, etc.), and enforcing word or character limits (names max out at 4 words separated by whitespace, etc.). But what else should I be doing, and what are good ways of doing it?

This validation will take place on the server, but I am looking for best practices across platforms. The data will come in via POST, so I don't have to worry too much about people mucking about with the URL. The form presentation, including hinting and JS input masking, is handled, and pretty much all the client-side stuff is in place.

+3  A: 

Validation, reduced to its simplest terms, means only accepting what you want.

For example, if your telephone field should only contain digits (in no particular phone number format) and be no longer than 20 digits, you can check it against a regular expression to make sure it is what you want to accept, e.g. ^[0-9]{7,20}$ (anchored so that the whole value must match, with at least 7 and at most 20 digits).

Another example is Twitter. It only accepts usernames of up to 15 characters, made up of alphanumeric characters and underscores. So the validation regex might be something like: ^[a-zA-Z0-9][a-zA-Z0-9_]{0,14}$
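
The two patterns above can be sketched in Python as follows. This is only an illustration, not a definitive implementation: the function names are made up for the example, and `re.fullmatch` is used so that the whole value has to match rather than just a substring of it.

```python
import re

# Patterns from the examples above. fullmatch() requires the entire
# string to match, so no explicit ^ and $ anchors are needed here.
PHONE_RE = re.compile(r"[0-9]{7,20}")
USERNAME_RE = re.compile(r"[a-zA-Z0-9][a-zA-Z0-9_]{0,14}")


def is_valid_phone(value: str) -> bool:
    """Accept 7 to 20 ASCII digits and nothing else."""
    return PHONE_RE.fullmatch(value) is not None


def is_valid_username(value: str) -> bool:
    """Accept 1 to 15 characters: letters, digits and underscores,
    starting with a letter or digit (as in the regex above)."""
    return USERNAME_RE.fullmatch(value) is not None
```

Anything that does not match is rejected outright, which is the "only accept what you want" idea expressed as a whitelist.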

Form validation can also take the form of security checks, such as honeypotting, form validity checks and so on.

Form honeypotting: preventing automated/spam form submissions, typically with a hidden field that real users leave empty but bots tend to fill in.
Form validity: check the time between when the form was loaded and when it was submitted. If it is too short, the form might have been submitted by a bot. If it is too long, the data might be old and expired.
CAPTCHA: another level of bot prevention / human-only form validation.
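
A minimal sketch of the honeypot and timing checks in Python, under some loudly stated assumptions: the hidden field name `website`, both time thresholds, and the function name `check_submission` are all invented for the example, and `rendered_at` is assumed to be a timestamp you stored on the server (for example in the session) when the form was displayed.

```python
import time
from typing import Optional

HONEYPOT_FIELD = "website"  # hypothetical hidden field; humans never see it
MIN_SECONDS = 3             # assumed: anything faster looks automated
MAX_SECONDS = 60 * 60       # assumed: anything older is treated as expired


def check_submission(form: dict, rendered_at: float,
                     now: Optional[float] = None) -> str:
    """Return "ok", "honeypot", "too_fast" or "expired"."""
    if now is None:
        now = time.time()
    # A filled-in honeypot field suggests an automated submission.
    if str(form.get(HONEYPOT_FIELD, "")).strip():
        return "honeypot"
    elapsed = now - rendered_at
    if elapsed < MIN_SECONDS:
        return "too_fast"
    if elapsed > MAX_SECONDS:
        return "expired"
    return "ok"
```

The thresholds would need tuning for a real form: a long form needs a higher minimum, and a very short one may legitimately be submitted within a couple of seconds.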

thephpdeveloper
So I should somehow be tracking when the form is displayed and checking that against when I process the form. Storing the time and a uid in a session variable and checking against it on form submit, something like that?
Tyson of the Northwest
yep something like that
thephpdeveloper
A: 

The always great Smashing Magazine has some useful tips: http://www.smashingmagazine.com/2009/07/07/web-form-validation-best-practices-and-tutorials/

But if I could offer my own:

  1. Make it secure but usable.
  2. Use client side validation along with server side validation
  3. If you post back with errors, make sure the users' information is still populated in the form
  4. Limit the field size in HTML forms.

Of course, all this is assuming you're using web forms.

Dave
A: 

Commenter S. Lott is correct: escaping should be taken care of automatically by the framework. If you're not working with an explicit framework, then at the very least, the utility functions you use to access the database and display data on the page should escape for SQL and HTML, respectively. If you have to worry about escaping in your validation code, sooner or later you'll make a mistake, and some twelve-year-old script kiddie will replace the contents of your web site with horse porn.
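
To illustrate the idea of pushing escaping into the utility functions, here is a rough Python sketch. It uses the standard library's `sqlite3` as a stand-in for MySQL (an assumption made purely so the example is self-contained), and the `people` table and all function names are invented for illustration. The database layer uses parameterized queries and the display layer uses `html.escape`, so calling code never escapes anything by hand.

```python
import html
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical 'people' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT)")
    return conn


def save_name(conn: sqlite3.Connection, name: str) -> None:
    # The ? placeholder lets the driver handle quoting; the value is
    # never spliced into the SQL string.
    conn.execute("INSERT INTO people (name) VALUES (?)", (name,))


def load_names(conn: sqlite3.Connection) -> list:
    rows = conn.execute("SELECT name FROM people ORDER BY rowid")
    return [row[0] for row in rows]


def render_name(name: str) -> str:
    # Escape at output time, so stored data stays exactly as entered.
    return "<li>" + html.escape(name) + "</li>"
```

With this split, a name containing an apostrophe or an angle bracket is stored unchanged and only transformed at the point where it is written into HTML.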

Thom Smith
A: 

Stuff that makes sense in the context is good, stuff that doesn't make sense is bad.

If this site filtered out HTML, then we couldn't give HTML examples. Instead, it escapes the HTML on output, so that it is displayed as text rather than rendered as HTML.

Beware of over-validating. < is not necessarily bad; there are all sorts of reasons people will use <, > and especially &.

Likewise, while Robert '); DROP TABLE Students;-- isn't someone you want signing up at your school, if preventing that means that O'Brien, O'Tierney, O'Donovan and O'Flanagan can't sign up, then by the time O'Donnell is refused he's going to think it's anti-Irish racism and sue you! (More realistically, I do know people here in Ireland who go off to find a competitor when a SQL-injection prevention script blocks or mangles their surname, though more often they've just found yet another site that isn't preventing injection, as either will fail on their name in some way.)

Validation, as opposed to security-checking, is about making sure something plausibly reflects reality. In reality, personal names have ' in them and company and town names have & in them all the time, and "validation" that blocks those has turned valid data into invalid data. In reality, credit card numbers are typically 16 digits long (some debit cards 19 digits) and pass a Luhn check, and email addresses have a user info part, an @ and a host name with an MX record. People's names are never zero characters long. That's validation. Only reject (rather than escape) if the input genuinely is invalid.
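
As a concrete instance of "plausibly reflects reality", here is a small Python sketch of a Luhn check and a deliberately permissive name check. The function names are made up for the example, and card length is intentionally not enforced here, since it varies between card types.

```python
def passes_luhn(card_number: str) -> bool:
    """Return True if the digits pass the Luhn checksum.

    Spaces and hyphens are ignored, since people type them in.
    """
    cleaned = card_number.replace(" ", "").replace("-", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, ch in enumerate(reversed(cleaned)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def is_plausible_name(name: str) -> bool:
    """Only reject what cannot be a name: empty or whitespace-only input.

    Apostrophes, ampersands and the like are deliberately allowed.
    """
    return len(name.strip()) > 0
```

Passing the Luhn check only shows that the number is not an obvious typo; it says nothing about whether the card exists.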

Jon Hanna
A: 

You may want to check out OWASP http://www.owasp.org/index.php/OWASP:About. Especially if you're planning on handling credit cards.

Dave