views:

658

answers:

6

Hey everybody,

I'm trying to find a good way to validate a US address. I know there might not be a proper way of doing this, but I'm going for the basic format: number, street name, city, state, and ZIP code.

Any ideas would be appreciated. Thanks.

+2  A: 

There are way too many variations in addresses to be able to do this using regular expressions. You're better off finding a web service that can validate addresses. USPS has one, but you'll have to request permission to use it.

jimyi
Actually, it's not the variations, but the fact that a mailing address is not regular. At least, I don't think it is. If someone can prove that it is, please do so - I'd be interested.
Thomas Owens
+4  A: 

Don't try. Somebody is likely to have a post office box, or an apartment number etc., and they will be really irate with you. Even a "normal" street name can have numbers, like 125th Street (and many others) in New York City. Even a suburb can have some numbered streets.

And city names can have spaces.

Robert L
I lived on a street named "East South Boulder Road". That is, the eastern portion of the street named "South Boulder Road". This was great fun to explain to people asking for my address.
Commodore Jaeger
In Australia there is a *town* called 1770. And to answer your next question: no, the postcode is 4677.
too much php
http://en.wikipedia.org/wiki/1770,_Queensland
Pekka
+1  A: 

Ask the user to enter the parts of the address in separate fields (street name, city, state, and ZIP code) and use whatever validation is appropriate for each field. This is the general practice.

Alternatively, if you want the simplest regex that matches four strings separated by three commas, try this:

/^(.+),([^,]+),([^,]+),([^,]+)$/

If it matches, you can use additional pattern matching to check the components of the address. There is no way to check the validity of the street address itself, but you might be able to test postal codes and state codes.
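As a rough sketch of that two-step idea in Python: split with the regex above, then test the state and ZIP parts. The function name `check_components` and the state set are my own illustration; the set covers the 50 states plus DC only and leaves out territories and military codes. The ZIP check only tests the shape (5 digits, optionally followed by a hyphen and 4 more), not whether the code actually exists.

```python
import re

# The four-part pattern from above: street, city, state, ZIP.
ADDRESS_RE = re.compile(r"^(.+),([^,]+),([^,]+),([^,]+)$")

# Shape check only: 12345 or 12345-6789.
ZIP_RE = re.compile(r"\d{5}(?:-\d{4})?")

# Assumption: 50 states plus DC; territories and military codes are omitted.
STATE_CODES = set(
    "AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD "
    "MA MI MN MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC "
    "SD TN TX UT VT VA WA WV WI WY DC".split()
)


def check_components(address: str) -> list[str]:
    """Return a list of problems found; an empty list means nothing was flagged."""
    match = ADDRESS_RE.match(address.strip())
    if match is None:
        return ["expected four comma-separated parts"]

    # The street and city parts are not checked at all here.
    _street, _city, state, zip_code = (part.strip() for part in match.groups())

    problems = []
    if state.upper() not in STATE_CODES:
        problems.append(f"unknown state code: {state}")
    if ZIP_RE.fullmatch(zip_code) is None:
        problems.append(f"ZIP code has the wrong shape: {zip_code}")
    return problems
```

An empty result only means the state and ZIP parts look plausible. It says nothing about whether the address is real.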

Salman A
A: 

This is not a bulletproof solution. It assumes that an address begins with a street number and ends with a ZIP code of either 5 or 9 digits (ZIP+4, written as 12345-6789).

([0-9]{1,} [\s\S]*? [0-9]{5}(?:-[0-9]{4})?)

Like I said, it's not bulletproof, but I've used it with marginal success in the past.

JasonBartholme
A: 

Over here in New Zealand, you can license the official list of postal addresses from New Zealand Post - giving you the data needed to populate a table with every valid postal address in New Zealand.

Validating against this list is a whole lot easier than trying to come up with a Regex - and the accuracy is much much higher as well, as you end up with three cases:

  • The address you're validating is in the list, so you know it is a real address
  • The address you're validating is very similar to one in the list, so you know it is probably a real address
  • The address you're validating is not similar to anything in the list, so it may or may not be real.

The best you'll get with a RegEx is

  • The address you're validating matches the regex, so it might be a real address
  • The address you're validating does not match the regex, so it might not be a real address
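For comparison, the three list-based cases could be sketched like this in Python. Everything here is illustrative: the addresses are invented placeholders rather than real postal data, `classify` and `normalize` are names I made up, and the 0.85 similarity cutoff is an arbitrary guess. A real setup would query a licensed address table rather than an in-memory list.

```python
from difflib import get_close_matches

# Placeholder data standing in for a licensed address list.
KNOWN_ADDRESSES = [
    "12 EXAMPLE STREET, SAMPLETOWN 1234",
    "7 DEMO ROAD, SAMPLETOWN 1234",
    "3 PLACEHOLDER AVENUE, TESTVILLE 5678",
]


def normalize(address: str) -> str:
    """Upper-case and collapse runs of whitespace so trivial differences vanish."""
    return " ".join(address.upper().split())


def classify(address: str, known=KNOWN_ADDRESSES, cutoff: float = 0.85):
    """Return ("exact" | "similar" | "unknown", matching list entry or None)."""
    candidate = normalize(address)
    if candidate in known:
        return ("exact", candidate)
    # difflib ranks the list entries by similarity ratio against the candidate.
    close = get_close_matches(candidate, known, n=1, cutoff=cutoff)
    if close:
        return ("similar", close[0])
    return ("unknown", None)
```

The "similar" case is the useful one: it lets you suggest the list entry back to the user ("did you mean ...?") instead of just rejecting the input.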

Needing to know postal addresses is a pretty common situation for many businesses, so I believe that licensing a list will be possible in most areas.

The only sticky bit will be pricing.

Bevan
A: 

Use CASS software like that at http://semaphorecorp.com/cgi/zp4.html

The standard format is documented in USPS Publication 28 at usps.com

If you're validating for eventual dupe detection, read http://semaphorecorp.com/mpdd/mpdd.html

joe snyder