Storing a telephone number in some kind of canonical format has several advantages from a programmer's point of view, but it might confuse users if the numbers they entered suddenly look very different.
What's the way to go?
Store it however you prefer, but turn it into a human-readable format before you show it to the user. And please don't force your users to enter phone numbers in a format of your choosing; let them type it in however they like.
That's how I do it.
The main difficulty in canonicalizing phone numbers is determining the correct canonical format. Different countries have different ways of grouping numbers, and within a country, different numbers can be grouped differently. It used to be the case (a decade or more ago) that in the UK you had 01-123-2345, 021-123-1234, 0334-123123, even 0913224-213; things are different in the UK now, generally with more digits, and I'm not sure about the groupings any more (absence makes one's knowledge less current). Dealing with country prefixes and indicating the internal country dialling prefix is fun: +44 (0)1394-726629 is a UK number, country code 44; when dialling from outside the UK, drop the 0; when dialling inside the UK, do not include the international prefix but do include the 0.
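As a rough illustration of the +44 (0) convention described above, here is a minimal sketch that derives both dial strings from the written form. The function name `uk_dial_strings` is made up for this example, and it assumes the number is written with the +44 country code and an optional "(0)" trunk prefix; real-world input is messier than this.

```python
import re
from typing import Tuple


def uk_dial_strings(written: str) -> Tuple[str, str]:
    """Return (international, national) dial strings for a number
    written like '+44 (0)1394-726629'. Hypothetical helper."""
    # "(0)" is the trunk prefix: dialled inside the UK, dropped from abroad.
    without_trunk = re.sub(r"\(0\)", "", written)
    # Keep digits only; punctuation and spaces are presentation.
    digits = re.sub(r"\D", "", without_trunk)
    if not digits.startswith("44"):
        raise ValueError("expected a number written with the +44 country code")
    subscriber = digits[2:]
    return "+44" + subscriber, "0" + subscriber


international, national = uk_dial_strings("+44 (0)1394-726629")
print(international)  # +441394726629
print(national)       # 01394726629
```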
This is similar to the problem of canonicalizing mail addresses - not as complex, but still bad.
Also, as noted in my comment to HeavyWave's answer, forcing people to enter the phone number as a digit string with no punctuation is nasty. It's fine to store it that way; just present the data in a human readable format. There's far too much lazy web form programming out there.
The UK is a special case in that, while the whole number is a consistent 11 digits long, both the STD (area) code and the local number itself vary in length. The longer the STD code, the shorter the local number.
Additionally some people like to group their numbers 123 123 while others use 12 12 12 (depending on the actual digits).
There are arguments for storing as entered and storing in a single form:
If you store the number as just the sequence of digits, then you can output it in any way you want, either by taking into account user preferences or their locale and splitting the number up according to "rules" (whatever they may be).
If you store it as entered, then you'll always display it as the user expects, but you'll need to strip out non-numeric characters before using it, which could be expensive if you do it often.
What's your userbase?
If they're going to be limited geographically (e.g., US-only) and you're going to validate numbers strictly, then format the number canonically for them: strip out any formatting they used (like periods between numbers) and put in the dashes. Do not fail validation if they don't stick to your formatting; that's just mean. I'd store that cleaned-up version in the DB as well, not a stripped number; it makes your life a bit easier when generating custom reports, etc.
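For the US-only case, a lenient normalizer might look like the sketch below. The name `format_us` and the choice of `NNN-NNN-NNNN` as the canonical form are assumptions for illustration; it only checks the digit count, not whether the area code or exchange is actually valid.

```python
import re
from typing import Optional


def format_us(entered: str) -> Optional[str]:
    """Accept any punctuation the user typed and return NNN-NNN-NNNN,
    or None if the input does not contain a 10-digit US number."""
    digits = re.sub(r"\D", "", entered)
    # Tolerate a leading country code "1".
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return None
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


print(format_us("555.867.5309"))    # 555-867-5309
print(format_us("(555) 867-5309"))  # 555-867-5309
```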
If you might have users/numbers from all over the world, it might be better to save the formatting they used. Also don't forget the case that sometimes US residents are currently traveling and using a foreign number: don't block them unintentionally.
Either way: make sure you DON'T define the column as numeric, or make it too small. International numbers with formatting can easily be over 16 chars long.
I would keep the original entered mess but would also insert a cleaned-up form in the database, one that keeps only the digits, without punctuation and spaces. Using the cleaned form would allow easy lookups without worrying about the different styles in which a number might have been entered.
Validate the input but allow a wide array of formats. Store it as the user typed it and then reformat the output as needed.
Let's say a user typed his number during registration for a public phonebook application. I would display it 'as the user typed it' in the text field on his 'edit my profile' page, for example, but I would display it reformatted to a standard format on the public phonebook list.
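A minimal sketch of keeping both forms, using SQLite through Python's standard `sqlite3` module. The table and column names (`contact`, `phone_entered`, `phone_digits`) are invented for this example. Both columns are text, not numeric, so leading zeros and long international numbers survive.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
# Text columns: one for what the user typed, one for the digits-only lookup key.
conn.execute(
    "CREATE TABLE contact (name TEXT, phone_entered TEXT, phone_digits TEXT)"
)
conn.execute("CREATE INDEX idx_contact_phone_digits ON contact (phone_digits)")


def clean(entered: str) -> str:
    """Reduce an entered number to its digits for matching."""
    return re.sub(r"\D", "", entered)


def add_contact(name: str, entered: str) -> None:
    conn.execute(
        "INSERT INTO contact VALUES (?, ?, ?)", (name, entered, clean(entered))
    )


def find_by_phone(entered: str):
    """Look up contacts regardless of how the search number is punctuated."""
    rows = conn.execute(
        "SELECT name, phone_entered FROM contact WHERE phone_digits = ?",
        (clean(entered),),
    )
    return rows.fetchall()


add_contact("Alice", "(01394) 726-629")
print(find_by_phone("01394 726629"))  # [('Alice', '(01394) 726-629')]
```

The lookup matches on the cleaned column, while the original entry is still there to show back to the user who typed it.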
My gut instinct is to canonicalize according to the local standards of the entity, then render in the canonical representation modulo usability.
I usually like to store the number stripped and then format it for display. Since I don't usually build applications for worldwide use, I don't generally have to worry about the format. But in the case of an application for use all around the world, I would probably build a formatting module that formats according to the phone number's locale.
Store the number in a canonical format and the display format mask.
Gains:
Pains:
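The idea of storing a canonical number together with a display format mask could be sketched like this. The mask syntax (a `#` for each digit, everything else copied literally) and the name `apply_mask` are assumptions made for the example, not an established convention.

```python
def apply_mask(digits: str, mask: str, placeholder: str = "#") -> str:
    """Fill each placeholder in the mask with the next stored digit;
    all other mask characters are copied through unchanged."""
    if mask.count(placeholder) != len(digits):
        raise ValueError("mask does not fit the number")
    remaining = iter(digits)
    return "".join(next(remaining) if ch == placeholder else ch for ch in mask)


# The canonical digits and the mask would be stored side by side.
print(apply_mask("5558675309", "(###) ###-####"))  # (555) 867-5309
print(apply_mask("01394726629", "##### ######"))   # 01394 726629
```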