Let's say you are dealing with a typical contact database (you know: name, phone number, address, email, etc.). If you only care about local data, it's generally not a big issue, but once you look at international data sets it is.

Looking at the phone number system, you would think it's simple, but it's really not. In North America, we generally use the 1-222-333-4444 format for calling people. This is, of course, divided into the international dialing code, area code, exchange prefix and line number. Problem: real phone numbers are limited. There are around 220 area codes in the US out of a potential 1000, each area code has only a limited number of exchanges, and the line numbers are restricted to specific uses within the country (for example, patterns with 911 are restricted, and only about three quarters of the 10,000 are in use). Take this over to the UK and they have their own set of rules for line numbers, such as reserving most of the 0300-0399 block for specific uses, plus other restrictions. International codes are also limited. Normalizing area codes and exchanges and putting data validation checks on phone numbers just got complicated. I'm not going into detail about places that are not part of the North American Numbering Plan (NANP), but let's just agree that we can't adopt the North American template, kick back, and call it a day.
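To make the North American case concrete, here is a rough sketch of the kind of check I mean. It assumes only the basic NANP rules (area code and exchange start with 2-9, and N11 codes such as 911 are service codes); `NanpNumber` and `parse_nanp` are names made up for this example, and it does not check whether an area code or exchange is actually assigned.

```python
import re
from typing import NamedTuple, Optional


class NanpNumber(NamedTuple):
    """Components of a North American number (hypothetical helper type)."""
    country_code: str
    area_code: str
    exchange: str
    line: str


# Optional leading 1, then area code, exchange, and line number.
# Area code and exchange must both start with a digit from 2 to 9.
_NANP = re.compile(r"^1?([2-9]\d{2})([2-9]\d{2})(\d{4})$")


def parse_nanp(raw: str) -> Optional[NanpNumber]:
    """Split a NANP number into its parts, or return None if it does not fit."""
    digits = re.sub(r"\D", "", raw)  # drop spaces, dashes, parentheses, '+'
    match = _NANP.match(digits)
    if match is None:
        return None
    area, exchange, line = match.groups()
    # N11 patterns (211, 411, 911, ...) are service codes, not real prefixes.
    if area[1:] == "11" or exchange[1:] == "11":
        return None
    return NanpNumber("1", area, exchange, line)
```

Even this much already rejects a UK number like "+44 20 7946 0958" outright, which is exactly the problem: every numbering plan would need its own version of this.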

How do we normalize for things like this? How do we validate data? How do we deal with these seemingly ad-hoc extension codes or instructions for internal dialing?

International addresses are not much better: both the data retained and the output formats differ from country to country. How do we deal with international postal codes, when in Canada the format is A1A 1A1 and the USA uses a system such as 55555[-4444]?
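For the two postal code formats mentioned above, a per-country pattern table might look something like this. It is only a sketch: `POSTAL_PATTERNS` and `is_valid_postal_code` are hypothetical names, the patterns check shape only (not whether the code exists), and accepting any non-empty value for unknown countries is an assumption I made to keep the example short.

```python
import re

# Shape-only patterns keyed by ISO 3166-1 alpha-2 country code.
POSTAL_PATTERNS = {
    "CA": re.compile(r"^[A-Z]\d[A-Z] ?\d[A-Z]\d$"),  # A1A 1A1, space optional
    "US": re.compile(r"^\d{5}(-\d{4})?$"),           # 55555 or 55555-4444
}


def is_valid_postal_code(country: str, code: str) -> bool:
    """Check a postal code against the pattern registered for its country."""
    normalized = code.strip().upper()
    pattern = POSTAL_PATTERNS.get(country.upper())
    if pattern is None:
        # Assumption: no rule known for this country, so accept any non-empty value.
        return bool(normalized)
    return pattern.match(normalized) is not None
```

Adding a country is then a one-line change to the table, but it says nothing yet about how the rest of the address is stored or formatted.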

I'm tempted to just write classes for each of these situations as I encounter them and store the data in the database as XML/JSON/similar, but then how do I relate fields and easily search my content? I don't want to end up creating a table monster with thousands of tables, one set for each country. I want an easily scalable solution where I can normalize my addresses and validate the content. Is this too much to ask?
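To show what I mean by the XML/JSON idea, here is a rough sketch of a hybrid layout: one table, a few common columns that stay searchable, and a JSON text column for the country-specific leftovers. The table name, column names, and helper functions are all invented for this example, and it uses SQLite through Python's `sqlite3` module purely for illustration.

```python
import json
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a single address table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE address (
            id          INTEGER PRIMARY KEY,
            country     TEXT NOT NULL,           -- e.g. 'CA', 'US'
            postal_code TEXT,
            locality    TEXT,
            extra       TEXT NOT NULL DEFAULT '{}'  -- country-specific fields as JSON
        )
        """
    )
    # Common fields are indexed so lookups do not need to parse the JSON.
    conn.execute(
        "CREATE INDEX idx_address_country_postal ON address (country, postal_code)"
    )
    return conn


def add_address(conn, country, postal_code, locality, **extra) -> int:
    """Insert one address; unknown keyword fields go into the JSON column."""
    cur = conn.execute(
        "INSERT INTO address (country, postal_code, locality, extra) "
        "VALUES (?, ?, ?, ?)",
        (country, postal_code, locality, json.dumps(extra)),
    )
    return cur.lastrowid


def find_by_postal_code(conn, country, postal_code):
    """Return (locality, extra_fields) pairs for a country and postal code."""
    rows = conn.execute(
        "SELECT locality, extra FROM address WHERE country = ? AND postal_code = ?",
        (country, postal_code),
    ).fetchall()
    return [(locality, json.loads(extra)) for locality, extra in rows]
```

This keeps the table count at one, but anything that lands in `extra` is exactly the content I can no longer relate or search easily, which is the part I'm stuck on.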