views:

302

answers:

9

Currently I have some theoretical background in regular expressions, but I have almost never used them.

I am trying to develop some classes for general input validation, and I have been writing methods without any use of regular expressions. I recently read this article by Jeff, and now I am wondering if I should refactor some of the methods to use regular expressions.

I thought that regular expressions were mainly used to build front-ends for applications such as parsers, but apparently they are used for much more than that.

I realize that not all validations can or should be done with regular expressions, but are they a good practice to validate inputs?

+2  A: 

I realize that not all validations can or should be done with regular expressions

You're exactly right on that one, but for things that need to be exactly a certain way (for example social security numbers, phone numbers (to some extent) and emails), using regular expressions can be helpful.

But do not rely on them alone. For example, I have a pretty good email regular expression check, but I also have a list of obviously bogus domains (example.com and some others I've seen in our database, mostly local stuff) to match against.
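A minimal sketch of that two-step check in Python. The pattern is deliberately loose and the blocklist entries are made up for illustration; neither is the actual expression or list described above.

```python
import re

# Loose format check (an assumption for illustration, not a full
# RFC-compliant matcher): something@something.something, no spaces.
EMAIL_RE = re.compile(r"[^@\s]+@([^@\s]+\.[^@\s]+)")

# Hypothetical blocklist of domains known to be bogus.
BOGUS_DOMAINS = {"example.com", "localhost.localdomain"}


def is_plausible_email(value: str) -> bool:
    """Return True if value looks like an email and its domain is not blocklisted."""
    match = EMAIL_RE.fullmatch(value)
    if match is None:
        return False  # Fails the format check.
    domain = match.group(1).lower()
    return domain not in BOGUS_DOMAINS  # Passes format; check the blocklist.
```

The regex only answers "is this shaped like an address?"; the set lookup catches addresses that are well formed but known to be useless.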

Ólafur Waage
+1  A: 

Yes - regular expressions work very well for input validation. However, it's often a very good idea to abstract these checks away as much as possible into separate methods, or sometimes even dedicated validator objects.
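One way that abstraction might look in Python, as a rough sketch. `PatternValidator` and `US_ZIP` are hypothetical names chosen for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PatternValidator:
    """Wraps a compiled pattern so callers never touch the regex directly."""
    name: str
    pattern: re.Pattern

    def is_valid(self, value: str) -> bool:
        # fullmatch requires the entire string to match the pattern.
        return self.pattern.fullmatch(value) is not None


# Hypothetical validator: five digits, optionally followed by -dddd.
US_ZIP = PatternValidator("us_zip", re.compile(r"[0-9]{5}(-[0-9]{4})?"))
```

Calling code then reads `US_ZIP.is_valid(text)`, and the pattern can later be swapped for non-regex logic without touching the callers.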

Remember that regular expressions can often introduce a lot of trouble, but on the whole, input validation is a case where they fairly unconditionally shine.

Tony k
A: 

I don't know that I'd call it a best practice, but I certainly use regex for validation of things like email addresses and ops, among other things. If not a best practice, it's certainly a common practice.

dustyburwell
+4  A: 

Regular expressions are just one way to match text against a pattern. There are other ways to do the same thing without using a regex. You shouldn't think of regular expressions as a buzzword that you must include in your code. Use whatever tool works the best.

For input validation, just be sure that whatever tool you're using lets you specify exactly what kind of text you want to accept, and rejects everything else by default. Regular expressions let you do this easily and concisely for certain kinds of input, which is why many people use them.
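A small Python sketch of "accept exactly this, reject everything else". The username rule here is an assumption made up for the example.

```python
import re

# Hypothetical rule: 3 to 16 ASCII letters, digits, or underscores.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,16}")


def is_valid_username(value: str) -> bool:
    """Accept only strings that match the pattern from start to end."""
    # fullmatch anchors both ends; search or match would let extra
    # text before or after the valid part slip through.
    return USERNAME_RE.fullmatch(value) is not None
```

The important choice is `fullmatch` over `search`: `USERNAME_RE.search("bob; rm -rf")` finds `bob` and succeeds, which is the opposite of rejecting by default.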

Brian Carper
+2  A: 

Yes!

Regular expressions usually let you build a pretty solid input validation that's fairly readable in a very short space of time.

Something that does the right job, is maintainable and lets you get onto other things is good in my books.

As always, apply common sense and if a regex is a bad tool for the job, don't use it.

Artelius
+3  A: 

Using regexp validation is a good idea, provided that you don't branch off into applying more than basic regular expressions.

If you find yourself validating potentially complex structures, as Michael Ash does in his attempt to verify a date, you are off the beaten path and asking for trouble:

^(?:(?:(?:0?[13578]|1[02])(\/|-|\.)31)\1|(?:(?:0?[13-9]|1[0-2])(\/|-|\.)(?:29|30)\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:0?2(\/|-|\.)29\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|\.)(?:0?[1-9]|1\d|2[0-8])\4(?:(?:1[6-9]|[2-9]\d)?\d{2})$

Your code will suffer maintenance problems.
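For comparison, a hedged Python sketch that keeps the regex basic and hands the calendar rules to the standard library. It assumes month/day/year order and four-digit years only, so it does not accept exactly the same set of strings as the expression above.

```python
import re
from datetime import date

# Basic shape only: m/d/yyyy with '/', '-' or '.' used consistently (\2).
DATE_SHAPE = re.compile(r"([0-9]{1,2})([/.-])([0-9]{1,2})\2([0-9]{4})")


def is_valid_date(value: str) -> bool:
    """Check the shape with a regex, then the calendar with datetime.date."""
    m = DATE_SHAPE.fullmatch(value)
    if m is None:
        return False
    month, day, year = int(m.group(1)), int(m.group(3)), int(m.group(4))
    try:
        date(year, month, day)  # Raises ValueError for impossible dates.
    except ValueError:
        return False
    return True
```

Leap years and month lengths are handled by `date`, so the pattern stays short enough to read at a glance.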

ojblass
+1  A: 

The answer to your question really depends upon the purpose of your code. Yes, regular expressions are great, and I agree with all the previous answers (that I have read).

Using regular expressions is a handy, quick and sleek way to validate certain inputs, but you need to remember what makes good code and apply regular expressions in the correct places. I read the article you posted, and I thought the subject matter was more about using regular expressions in the correct manner, i.e. don't just use them for a solution because you know it will be quick to type up and it will work, if it in turn produces unreadable, lengthy and horrible-looking code.

I wouldn't take it from what is written that regular expressions are "bad practice." I guess he just wanted to put across that sometimes you can spend a few more minutes considering design and come up with a better concept to implement, or just conclude that regular expressions are that concept!

Graham
A: 

You should validate on both the client and server sides. Regular expressions are very good for making sure that a string has a valid format (e.g., e-mail addresses, phone numbers, etc.), but the server should not depend solely on that. The server should check on its own and also validate business correctness (e.g., as in the answer above that checked for bogus addresses in a database).

Once is not enough. There are different degrees of "valid".

duffymo
A: 

If the input you are validating is in a regular language, then a regular expression is the right tool to validate it.
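To illustrate the distinction, a short Python sketch with two made-up examples: a hex color code is a regular language, while balanced parentheses are the textbook example of one that is not, so the second check uses a counter instead of a pattern.

```python
import re

# Regular language: '#' followed by exactly six hex digits.
HEX_COLOR_RE = re.compile(r"#[0-9A-Fa-f]{6}")


def is_hex_color(value: str) -> bool:
    return HEX_COLOR_RE.fullmatch(value) is not None


def is_balanced(value: str) -> bool:
    """Balanced parentheses need unbounded counting, so use a counter."""
    depth = 0
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # Closed more than were opened.
    return depth == 0
```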

Svante