I'm looking for the ultimate postal code and zip code regex. I'm looking for something that will cover most (hopefully all) of the world.

+6  A: 

This looks like a good reference although it's not in Regex.

Really, unless you're actually shipping something to your users, I don't think it's worth the effort. And if you are shipping it, there are address cleaning tools/services you can look into to make it way easier on yourself.

Tom Ritter
Also, even if it is the correct zip code today it very well might change in the future. USPS is constantly adding new ones and splitting areas. The only way you can keep up is to validate at the time you are actually shipping something. Some towns even elect to change their own zip code for a variety of reasons.
Chris Lively
A: 

Given that there are so many edge cases for each country (eg. London addresses may use a slightly different format to the rest of the UK) I don't think that there is an ultimate regex other than maybe:

[0-9a-zA-Z]+

Best off going with a fairly broad pattern (well, not quite as broad as the above), or treating each country/region with a specific pattern of its own!

UPDATE: However, it may be possible to dynamically construct a regex based upon lots of smaller, region specific rules - not sure about performance though!
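A minimal sketch of what that dynamic construction could look like in Python. The `REGION_RULES` table, the region keys and both helper names are assumptions made up for illustration, and the two patterns are examples rather than complete rules for either country.

```python
import re

# Hypothetical table of region-specific rules (illustrative, far from complete).
REGION_RULES = {
    "US": r"[0-9]{5}(?:-[0-9]{4})?",
    "CA": r"(?:[A-Z][0-9]){3}",
}

def build_combined_pattern(rules):
    """Join every regional rule into one anchored alternation of named groups."""
    alternatives = [f"(?P<{region}>{pattern})" for region, pattern in rules.items()]
    return re.compile(r"\A(?:" + "|".join(alternatives) + r")\Z")

# Compile once and reuse, so the cost of building the big pattern is paid one time.
COMBINED = build_combined_pattern(REGION_RULES)

def matching_region(code):
    """Return the name of the regional rule that matched, or None."""
    match = COMBINED.match(code.strip().upper())
    return match.lastgroup if match else None
```

Using named groups means the combined regex can also report which region's rule matched. How well this scales to a couple of hundred rules is something you would have to measure.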

Lots of country specific patterns can be found on the RegExLib site.

Macka
might want to throw a set of brackets around your character classes...
Gavin Miller
Good point - cheers!
Macka
+17  A: 

There is none.

Postal/zip codes around the world don't follow a common pattern. In some countries they are made up of digits, in others they can be combinations of digits and letters, some can contain spaces, others dots, and the number of characters can vary from two to at least six...

What you could do (theoretically) is create a separate regex for every country in the world, which I wouldn't recommend. But you would still be missing the validation part: zip code 12345 may exist while 12346 does not, and maybe 12344 doesn't exist either. How do you check for that with a regex?

You can't.

Treb
12345 does exist. Schenectady, NY. (No, I don't live there, but a lot of web sites think I do...)
Dave Sherohman
I suspect that a regex could be compiled, but a task like this would be much better suited to a database. The regex would look something like 10000|10001|10002|10003|.......
Kibbee
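To illustrate the database idea, here is a rough Python sketch that checks existence with a plain set lookup instead of a giant alternation. The `KNOWN_ZIP_CODES` sample and the `zip_exists` helper are hypothetical. A real list would have to be loaded from an authoritative data source and refreshed regularly.

```python
# Hypothetical sample; in practice this would come from a database table
# or a data file that is kept up to date.
KNOWN_ZIP_CODES = {"10001", "10002", "10003", "12345"}

def zip_exists(code, known_codes=KNOWN_ZIP_CODES):
    """Return True if the code is in the known set (constant-time lookup)."""
    return code.strip() in known_codes
```

Updating the data then means replacing rows or set entries rather than regenerating and recompiling a pattern.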
+6  A: 

Depending on your application, you might want to implement regex matching for the countries where most of your visitors originate and no validation for the rest (accept anything).

GoodEnough
A: 

Why are you doing this and why do you care? As Tom Ritter pointed out, it doesn't matter whether you even have a ZIP/postal code at all, much less whether it's valid or not, until and unless you are actually going to be sending something to that address. Even if you expect that you will be sending them something someday, that doesn't mean you need a postal code today.

Dave Sherohman
Yeah but if they're going to be entering one, might as well make sure it's correct at that point. However, I agree with one of the other answers that basically says, make it validate for the countries that you think will be the majority of your customers.
cdmckay
+1  A: 

As noted elsewhere the variation around the world is huge. And even if something matches the pattern, that does not mean it exists.

Then, of course, there are many places where postcodes are not used (e.g. much of Ireland).

Richard
Actually, probably all of Ireland, as I don't think D1, D2, etc. are considered proper post codes as you can't identify an address using just this code and a street number.
Don
+1  A: 

You have a problem. You write a regex for it.

Now you have two problems.

Robert S.
-1 :: Overused quote on SO
Gavin Miller
I think the most overused quote on SO is "not programming related."
Robert S.
Which of those problems does this answer solve?
Rob Kennedy
All three of them.
Robert S.
At least credit JWZ: http://regex.info/blog/2006-09-15/247 and the quote is 'Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.'
Chas. Owens
It has long since entered the collective consciousness.
Robert S.
+2  A: 

We use the following:

Canada

([A-Z]{1}[0-9]{1}){3}   //We raise to upper first

America

[0-9]{5}                //-or-
[0-9]{5}-[0-9]{4}       //9 digit ZIP+4

Other

Accept as is
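Roughly how that fits together, sketched in Python. The function name and the "CA"/"US" country keys are assumptions for illustration, not our actual code, and the patterns are anchored here so the whole input has to match.

```python
import re

# Same patterns as listed above, anchored to the full string.
CANADA = re.compile(r"\A([A-Z][0-9]){3}\Z")
UNITED_STATES = re.compile(r"\A[0-9]{5}(-[0-9]{4})?\Z")

def is_acceptable(code, country):
    """Check Canadian and US codes against a pattern; accept anything else as is."""
    code = code.strip()
    if country == "CA":
        return CANADA.match(code.upper()) is not None  # raise to upper first
    if country == "US":
        return UNITED_STATES.match(code) is not None
    return True  # other countries: no validation
```

Note that the Canadian pattern as written rejects the common "K1A 0B1" spelling with a space in the middle, so you may want to strip inner whitespace before matching.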

Gavin Miller
I'd suggest adding an optional -[0-9]{4} to the US one. Some people do use their ZIP+4.
David Thornley
/[0-9]{5}(?:-[0-9]{4})?/ lets you validate both styles from the US at the same time.
Chas. Owens
+1  A: 

Trying to cover the whole world with one regular expression is not completely possible, and certainly not feasible or recommended.

Not to toot my own horn, but I've written some pretty thorough regular expressions which you may find helpful.

It is not possible to guarantee accuracy without actually mailing something to an address and having the person let you know when they receive it, but we can narrow things down by eliminating cases that we know are bad.

Scott
+2  A: 

The problem is going to be that you probably have no good means of keeping up with the changing postal code requirements of countries on the other side of the globe with which you share no common language. Unless you have a large enough budget to track this, you are almost certainly better off giving the responsibility of validating addresses to Google or Yahoo.

Both companies provide address lookup facilities through a programmable API.

TokenMacGuy
+1  A: 
It is only a good idea until the code starts rejecting valid zipcodes either because it is buggy or the zipcodes have changed. Validation is something that must either be right or not there at all. At the very least there should be an override option.
Chas. Owens
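One way an override option might look, as a Python sketch. `CheckResult`, `check_zip` and the `user_confirmed` flag are made-up names for illustration, and the US pattern stands in for whatever rule is being applied.

```python
import re
from dataclasses import dataclass

US_ZIP = re.compile(r"\A[0-9]{5}(?:-[0-9]{4})?\Z")

@dataclass
class CheckResult:
    accepted: bool
    warning: str = ""

def check_zip(code, user_confirmed=False):
    """Warn on an unrecognised format, but accept it once the user confirms."""
    if US_ZIP.match(code.strip()):
        return CheckResult(accepted=True)
    if user_confirmed:
        return CheckResult(accepted=True, warning="format not recognised; kept at user's request")
    return CheckResult(accepted=False, warning="format not recognised; confirm to keep it")
```

The first failed check only produces a warning the form can show. Resubmitting with `user_confirmed=True` lets the value through, so a buggy or outdated pattern never locks anyone out.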