I expect the column to be a VARCHAR2, in my Oracle Database.

US ZIP codes are 9 characters.

Canadian postal codes are 7.

I am thinking 32 characters would be a reasonable upper limit.

What am I missing?

+1  A: 

Canadian postal codes are only 6 characters, in the form of letters and numbers (LNLNLN).

tegbains
Canadian postal codes have a blank in the middle: "ANA NAN". That's 7 characters.
EvilTeach
But the space is always in the middle so you don't need to store it.
Graeme Perrow
@EvilTeach - yes, but you can expect data to be normalized before being stored.
ysth
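To make the "normalize before storing" idea concrete, here is a minimal Python sketch. The helper names `normalize_ca` and `display_ca` are made up for illustration, and it assumes the 6-character LNLNLN form is what gets stored, with the space re-inserted only for display.

```python
import re

# Letter-digit-letter-digit-letter-digit, as described in the thread (LNLNLN).
_CA_PATTERN = re.compile(r"[A-Z][0-9][A-Z][0-9][A-Z][0-9]")


def normalize_ca(raw: str) -> str:
    """Return the 6-character storage form, e.g. 'k1a 0b1' -> 'K1A0B1'."""
    # Drop whitespace and hyphens, then upper-case before checking the shape.
    compact = re.sub(r"[\s-]", "", raw).upper()
    if not _CA_PATTERN.fullmatch(compact):
        raise ValueError(f"not a Canadian postal code: {raw!r}")
    return compact


def display_ca(stored: str) -> str:
    """Re-insert the middle space for presentation: 'K1A0B1' -> 'K1A 0B1'."""
    return f"{stored[:3]} {stored[3:]}"
```

With this approach the column only needs 6 characters for Canadian codes, and the 7-character presentation form is derived when needed.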
The space could be used to distinguish it from other postal code types. It'd also be faster to store it in its presentation form, so it is consistent with all other postal codes in the table. No need for a regexp to "denormalize" the code.
strager
The space doesn't seem to be a part of the data: "Note: Canadian postal codes are always formatted in the same sequence: alphabetic character / numeral / alpha /numeral / alpha / numeral (e.g. K1A0B1)." That's from the Canada Post website.
tegbains
@strager: I think it would be better to base the Postal Code type on the Country instead of what is entered by the user as the Postal Code. You could use a regex based on the Country to verify the Postal Code entry by the user.
tegbains
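A rough sketch of the country-keyed regex check suggested above, in Python. The `PATTERNS` table and `is_valid_postal_code` are hypothetical names, only the three formats mentioned in this thread are covered, and accepting codes for unlisted countries is an assumption rather than a recommendation.

```python
import re

# Hypothetical lookup table: one pattern per country code.
PATTERNS = {
    "US": re.compile(r"[0-9]{5}(-?[0-9]{4})?"),           # ZIP or ZIP+4
    "CA": re.compile(r"[A-Z][0-9][A-Z] ?[0-9][A-Z][0-9]"),  # LNL NLN, space optional
    "AU": re.compile(r"[0-9]{4}"),                         # 4-digit postcode
}


def is_valid_postal_code(country: str, code: str) -> bool:
    """Check a postal code against the pattern registered for its country."""
    pattern = PATTERNS.get(country.upper())
    if pattern is None:
        # Assumption: no pattern on file means we accept the input as-is.
        return True
    return pattern.fullmatch(code.strip().upper()) is not None
```

The country, not the shape of the user's input, selects the pattern, so "K1A 0B1" entered for a US address is rejected rather than silently treated as Canadian.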
+8  A: 

Skimming through Wikipedia's Postal Codes page, 32 characters should be more than enough. I would say even 16 characters is good.

strager
Good link. Even allowing for the punctuation in US ZIP+4, 10 characters would be enough for any country as far as I could tell.
Jonathan Leffler
+10  A: 

Here you can find a list of International Postal Codes formats.

CMS
Useful link, however its accuracy may be a bit out. E.g., it lists Australian postcodes as being 7 characters, when in fact they're 4. Ref: http://en.wikipedia.org/wiki/Postcodes_in_Australia and the postcode list available at http://www1.auspost.com.au/postcodes/.
rossp
re: my previous comment - that doesn't mean this list isn't useful as a guide. Assuming the list errs on the side of longer postcodes, the longest length is 9 characters so 16 characters or thereabouts should give you plenty of room to breathe.
rossp
+1 for the nice find.
strager
+2  A: 

What you're missing is a reason why you need the postal code to be handled specially.

If you don't really need to WORK with a postal code, I would suggest not worrying about it. By work, I mean do special processing on it, rather than just using it to print address labels and so on.

Simply create three or four address fields of VARCHAR2(50) [for example] and let the user input whatever they want.

Do you really need to group your orders or transactions by postcode? I think not, since different countries have vastly different schemes for this field.

paxdiablo
I agree. With a VARCHAR2 field, the reality is that for something like a postcode the exact size really doesn't matter. Slightly too big is better than annoying one customer because they can't input their details.
Toby Allen
And varchars are handy since databases (at least DB2) may optimize storage of them, so as to not waste storage space.
paxdiablo
One would point out that sorting by country and postal code will result in cheaper postal rates in some places.
EvilTeach
@EvilTeach, how so? Surely you have to send stuff from A to B, I can't immediately see how sort order from a database affects this (maybe that's just me, of course).
paxdiablo
Disagree. Sometime down the line you'll decide you need to validate the addresses in your database (e.g. to correct typographical and data entry errors), and that's when you'll find the benefit of properly constructing your data model rather than just shoving everything in buckets.
Gary
@Igor: Then that would be the missing reason to do it - but that reason doesn't exist NOW, so you're wasting time and money catering for the requirement.
paxdiablo
@igor here is a link about discount rates. http://bulkmail.info/presort.html
EvilTeach
@Pax If you hand over bulk mail to the Royal Mail presorted by the head district (first letter/two letters) of the postcode, then you can have it delivered by MailSort, which is cheaper than regular second class mail. That's just one example.
Richard Gadsden
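As a sketch of what presorting by the leading letters of a UK postcode could look like in Python: `postcode_area` and `presort` are illustrative names, and the real MailSort requirements are not modelled here, only the "first letter/two letters" grouping mentioned above.

```python
import re


def postcode_area(postcode: str) -> str:
    """Return the leading one or two letters, e.g. 'SW1A 1AA' -> 'SW'."""
    match = re.match(r"[A-Z]{1,2}", postcode.strip().upper())
    return match.group(0) if match else ""


def presort(postcodes):
    """Sort postcodes so that items sharing the same leading letters are adjacent."""
    return sorted(postcodes, key=lambda pc: (postcode_area(pc), pc.strip().upper()))
```

This only works if the postcode is held in its own column (or at least can be reliably extracted), which is the point of the counter-argument.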
A: 

Normalization? Postal codes might be used more than once, and might be related to street names or town names. Separate table(s).

Stephan Eggermont
+1  A: 

Why would you declare a field size larger than the actual data you are expecting to store in it?

If the initial version of your application is going to support US and Canadian addresses (which I'm inferring from the fact that you call out those sizes in your question), I'd declare the field as VARCHAR2(9) (or VARCHAR2(10) if you intend to store the hyphen in ZIP+4 fields). Even looking at what others have posted about postal codes across countries, VARCHAR2(9) or VARCHAR2(10) would be sufficient for most, if not all, other countries.

Down the line, you can always ALTER the column to increase the length should the need arise. But it is generally hard to prevent someone, somewhere from deciding to get "creative" and stuff 50 characters into a VARCHAR2(50) field for one reason or another (e.g. because they want another line on a shipping label). You also have to deal with testing the boundary cases (will every application that displays a ZIP handle 50 characters?), and with the fact that clients retrieving data from the database generally allocate memory based on the maximum size of the data that will be fetched, not the actual length of a given row. Probably not a huge deal in this specific case, but 40 bytes per row could be a decent chunk of RAM in some situations.

As an aside, you might also consider storing (at least for US addresses) the ZIP code and the +4 extension separately. It is generally useful to be able to generate reports by geographical region, and you may frequently want to put everything in a ZIP code together rather than breaking it down by the +4 extension. At that point, it's useful to not have to try to SUBSTR out the first 5 characters for the ZIP code.
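For illustration, splitting on the way in could look like the following Python sketch. `split_zip` is a hypothetical helper; it assumes input is either a 5-digit ZIP or a ZIP+4 with an optional hyphen.

```python
import re
from typing import Optional, Tuple

# Five digits, optionally followed by a hyphen and four more digits.
_ZIP = re.compile(r"([0-9]{5})(?:-?([0-9]{4}))?")


def split_zip(raw: str) -> Tuple[str, Optional[str]]:
    """Split a US ZIP into (zip5, plus4); plus4 is None when absent."""
    match = _ZIP.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"not a US ZIP code: {raw!r}")
    return match.group(1), match.group(2)
```

The two parts can then go into separate columns, so grouping by the 5-digit ZIP is a plain column reference instead of a SUBSTR on every row.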

Justin Cave
Well, assuming we are coding in something silly like Pro*C, having the field large enough for growth means the code won't need to be touched should the usage increase.
EvilTeach
Yes, breaking the US ZIP code into 5 and 4 digits can make sense, depending on what you plan on using it for. For example, if you are doing some sort of address matching, you might want to match on the zip5 first, and resolve ambiguous situations with the zip9. It also helps to use a country code.
EvilTeach