I'm building up a library of filters for a validation class in PHP, some of them using regular expressions. I have a lot of filters in mind, but I also don't want to potentially miss any. What do you most often use regular expressions to check? What are some of the not-so-common things that you've had to check that would still be useful in a library? Note: I'm not looking for the actual regex code, just what you use it for.

+4  A: 

Regexes should be thoroughly tested against their expected use cases, so it may be difficult to develop a complete, general-purpose library. I would aim for a library of functions you know you need now, then add to it later, once you have proper test cases.

That said, here are some common use cases:

Numeric Data
Phone numbers
Dates
Zip codes
SSN
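
To make the "filter plus test cases" idea concrete, here is a minimal sketch for one item on the list, a US ZIP code filter. It is written in Python for brevity; the names `ZIP_RE`, `is_us_zip`, `CASES`, and `run_cases` are hypothetical, and the pattern is one plausible choice, not a definitive one.

```python
import re

# Hypothetical filter: five digits with an optional "-NNNN" suffix (ZIP+4).
ZIP_RE = re.compile(r"[0-9]{5}(?:-[0-9]{4})?")

def is_us_zip(value: str) -> bool:
    """Return True when the whole string is a ZIP or ZIP+4 code."""
    return ZIP_RE.fullmatch(value) is not None

# Expected behaviour written down next to the filter, so a new filter
# only joins the library once its cases exist.
CASES = [
    ("12345", True),
    ("12345-6789", True),
    ("1234", False),
    ("12345-678", False),
    ("12345 6789", False),
    ("abcde", False),
]

def run_cases() -> list:
    """Return the inputs whose result differs from the expectation."""
    return [text for text, expected in CASES if is_us_zip(text) != expected]
```

An empty list from `run_cases()` means every recorded case behaves as expected. The same table-of-cases approach should carry over to a PHP validation class.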

Nescio
When you are making your library don't forget that at least a couple of the things on this list vary based on locale.
EBGreen
Good point, EBGreen. Thanks.
VirtuosiMedia
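
One way to account for the locale point raised in the comments is to key each pattern by country. The sketch below is in Python and is only illustrative: the three patterns are simplified assumptions about US, Canadian, and Dutch postal codes, and `POSTAL_PATTERNS` and `is_postal_code` are hypothetical names.

```python
import re

# Simplified, illustrative patterns; real postal rules have more exceptions.
POSTAL_PATTERNS = {
    "US": re.compile(r"[0-9]{5}(?:-[0-9]{4})?"),
    "CA": re.compile(r"[A-Za-z][0-9][A-Za-z] ?[0-9][A-Za-z][0-9]"),
    "NL": re.compile(r"[1-9][0-9]{3} ?[A-Za-z]{2}"),
}

def is_postal_code(value: str, country: str) -> bool:
    """Validate value against the pattern registered for the country code."""
    pattern = POSTAL_PATTERNS.get(country.upper())
    if pattern is None:
        # Refuse to guess when no pattern is registered for this locale.
        raise ValueError(f"no postal code pattern for {country!r}")
    return pattern.fullmatch(value.strip()) is not None
```

Raising on an unknown country, instead of silently returning false, keeps a missing locale from looking like bad user input.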
+1  A: 

So you're looking for the types of regular expressions we use for validation?

Telephone numbers (various international formats), postal codes, ZIP codes, credit card numbers, email addresses, dates, digits, SSNs, URLs (http, ftp, ...)

Karl Seguin
+1  A: 

In addition to Nescio's answers...

  • Passwords
  • Email addresses
  • Disallowing certain characters in text fields, such as non-alphanumeric characters
Loscas
+1  A: 

SQL injection attack patterns

 '[\s]*--

Password Strength

 ((?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,255})
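
The password pattern above reads as three lookaheads (at least one digit, one lowercase letter, one uppercase letter) followed by a length check of 8 to 255 characters. Here is a minimal Python sketch of how it might be applied; `is_strong_password` is a hypothetical name, and matching the whole string is an assumption about the intent.

```python
import re

# The pattern from the answer, unchanged.
PASSWORD_RE = re.compile(r"((?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,255})")

def is_strong_password(value: str) -> bool:
    """True when the whole string satisfies every lookahead and the length rule."""
    # fullmatch anchors both ends, so the 255-character cap is actually enforced.
    return PASSWORD_RE.fullmatch(value) is not None
```

Without anchoring, a longer password would still match on its first 255 characters, so in PHP the pattern would likely need `^` and `$` around it when used with `preg_match`.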
tom.dietrich
SQL injection should be countered by bound parameters, not attempting to clean up the data.
Andy Lester
Bound parameters won't help you catch the attempt and nuke the jackass who tried it; the query will just error out. If you want to catch a person trying to hack you, you've got to check your input. You can then use bound parameters for extra safety.
tom.dietrich
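
The two positions in these comments can be combined: flag suspicious input with the pattern from the answer, and still run the query with a bound parameter. The sketch below uses Python's standard `sqlite3` module; the `users` table and the names `looks_like_injection` and `find_user` are hypothetical.

```python
import re
import sqlite3

# Pattern quoted in the answer: a single quote, optional whitespace, then "--".
SUSPICIOUS = re.compile(r"'[\s]*--")

def looks_like_injection(value: str) -> bool:
    """Flag input for logging or review; this is detection, not protection."""
    return SUSPICIOUS.search(value) is not None

def find_user(conn, name):
    """Return (row, flagged). The name is bound, never spliced into the SQL."""
    flagged = looks_like_injection(name)
    row = conn.execute(
        "SELECT id FROM users WHERE name = ?", (name,)
    ).fetchone()
    return row, flagged

# Throwaway in-memory database for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
```

In this SQLite sketch a flagged value is simply treated as a literal name and finds no row, so the regex only adds the ability to notice the attempt.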
A: 

The majority of my RE use is fixing up data given to me by various sources into a standardized format. That means a lot of exporting Excel docs as CSV or tab-delimited text and then running them through a bunch of RE transformations in TextPad.
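
As a rough illustration of that kind of cleanup, here is a Python sketch that normalizes two common column types. The target formats (NNN-NNN-NNNN phone numbers, YYYY-MM-DD dates) and the function names are assumptions for the example, not something the original workflow specifies.

```python
import re

def normalize_phone(raw: str) -> str:
    """Rewrite a 10-digit number as NNN-NNN-NNNN; leave anything else alone."""
    digits = re.sub(r"\D", "", raw)  # drop everything that is not a digit
    if len(digits) != 10:
        return raw.strip()
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"

def normalize_date(raw: str) -> str:
    """Rewrite M/D/YYYY as YYYY-MM-DD; leave anything else alone."""
    m = re.fullmatch(r"\s*(\d{1,2})/(\d{1,2})/(\d{4})\s*", raw)
    if m is None:
        return raw.strip()
    month, day, year = m.groups()
    return f"{year}-{int(month):02d}-{int(day):02d}"
```

Values that do not fit the expected shape are returned trimmed but otherwise unchanged, so a bad row stays visible instead of being mangled.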

Instantsoup
A: 

Please see Abigail's canonical Regexp::Common.

http://search.cpan.org/dist/Regexp-Common

Andy Lester
+1  A: 

My main uses for regular expressions are:

  • pulling apart text
  • selecting lines in input
  • validating formats
  • analyzing/sanitizing input
  • parsing
  • providing expansive customization (allowing "configurable configurations", shortcuts,...)

A number of these things overlap, but it all has to do with human input. Machine-readable and human-readable are two different things. Regular expressions help us deal with human-oriented input that we know something about, without needing a complete grammar.
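
A small Python sketch of two of these uses together, selecting lines and pulling them apart, on loosely formatted "key = value" text. The input format and the names `LINE_RE` and `parse_settings` are hypothetical.

```python
import re

# One setting per line: an identifier, "=", then the value with edges trimmed.
LINE_RE = re.compile(
    r"^\s*(?P<key>[A-Za-z_][A-Za-z0-9_]*)\s*=\s*(?P<value>.*?)\s*$"
)

def parse_settings(text: str) -> dict:
    """Collect key/value pairs, skipping blanks, comments, and stray lines."""
    settings = {}
    for line in text.splitlines():
        if re.match(r"^\s*(#|$)", line):  # select: ignore comments and blanks
            continue
        m = LINE_RE.match(line)           # pull apart: named groups
        if m:
            settings[m.group("key")] = m.group("value")
    return settings
```

No grammar is defined here; the regex only encodes the little that is known about how people tend to type such lines.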

Axeman