Storing a telephone number in some kind of canonical format has several advantages from a programmer's point of view, but it might confuse users if the numbers they entered suddenly look a lot different.

What's the way to go?

+16  A: 

Store it however you prefer, but turn it into a human-readable format before you show it to the user. And please don't force your users to enter phone numbers in a format of your choosing; let them type it in however they like.

That's how I do it.

HeavyWave
+1: same thing for credit card numbers. I should be allowed to enter '1234 5678 9012 3456' or '1234-5678-9012-3456', and you should validate for consistent punctuation (and checksum and ...) and store it how the hell you like (without the punctuation). But forcing me to type it with no punctuation ('1234567890123456') is cruel: it makes me behave like a computer instead of making the computer behave like a human.
Jonathan Leffler
@MiseryIndex: I think Jonathan meant a single input box where both formats can be entered, i.e. not forcing any format at all.
DR
Agreed. Also, this is trivial with regular expressions. A [^0-9]+ replacement (to empty string) and subsequent cast to unsigned long will sort out ANY telephone or credit card number input formatting.
Robert Venables
(*Provided the user inputs in base 10, does not elect to use mnemonic phone number, and is not living in 1935)
Robert Venables
+4  A: 

The main difficulty in canonicalizing phone numbers is determining the correct canonical format. Different countries have different ways of grouping numbers, and within a country, different numbers can be grouped differently. It used to be the case (a decade or more ago) that in the UK you had 01-123-2345, 021-123-1234, 0334-123123, even 0913224-213; things are different in the UK now, generally with more digits, and I'm not sure about the groupings any more (absence makes one's knowledge less current). Dealing with country prefixes and indicating the in-country dialling prefix is fun: +44 (0)1394-726629 is a UK number, country code 44; dialling from outside the UK, drop the 0; dialling inside the UK, do not include the international prefix but do include the 0.

This is similar to the problem of canonicalizing mail addresses - not as complex, but still bad.

Also, as noted in my comment to HeavyWave's answer, forcing people to enter the phone number as a digit string with no punctuation is nasty. It's fine to store it that way; just present the data in a human readable format. There's far too much lazy web form programming out there.
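
To make the "+44 (0)" rule concrete, here is a minimal Python sketch. The function names `uk_to_international` and `uk_to_national` are made up for illustration, and the code assumes the input is already known to be a UK number; a real application would need per-country rules.

```python
import re

def uk_to_international(raw: str) -> str:
    """Return '+44' plus the national number without its leading 0.

    Hypothetical helper: assumes `raw` is a UK number, written either
    nationally (01394 726629) or internationally (+44 (0)1394-726629).
    """
    s = raw.strip().replace("(0)", "")   # drop the optional "(0)" notation
    international = s.startswith("+")
    digits = re.sub(r"[^0-9]", "", s)    # strip all punctuation and spaces
    if international:
        if not digits.startswith("44"):
            raise ValueError("not a UK (+44) number: %r" % raw)
        digits = digits[2:]              # remove the country code
    if digits.startswith("0"):
        digits = digits[1:]              # remove the in-country prefix
    return "+44" + digits

def uk_to_national(raw: str) -> str:
    """Return the form dialled inside the UK: leading 0, no country code."""
    return "0" + uk_to_international(raw)[3:]

print(uk_to_international("+44 (0)1394-726629"))
print(uk_to_national("+44 (0)1394-726629"))
```

Both forms come from the same stored digits, so which one to show can be decided at display time.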

Jonathan Leffler
+3  A: 

The UK is a special case in that, while the whole number is a consistent 11 digits long, both the STD (area) code and the local number vary in length. The longer the STD code, the shorter the number.

  • 020 1234 5678 (London and some other areas)
  • 0115 123 4567 (Nottingham; Sheffield, Leicester, Reading and Leeds also have this format)
  • 01332 123 456 (Derby; most other areas)
  • 07812 123 456 (mobile numbers)

Additionally some people like to group their numbers 123 123 while others use 12 12 12 (depending on the actual digits).

There are arguments for storing as entered and storing in a single form:

If you store the number as just the sequence of digits then you can output it in any way you want, either by taking into account user preferences or their locale and splitting the number up according to "rules" (whatever they may be).

If you store as entered then you'll always display it as the user expects, but you'll need to strip out non-numeric characters before using it, which could be expensive if you do it often.
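
As a rough Python sketch of the "store digits, split by rules" option: the rule table below is a simplification taken only from the four examples above (02x gets a 3-digit code, 011x a 4-digit code, everything else 5 digits), and `format_uk` is a hypothetical name, so treat it as an illustration and not as the full UK numbering plan.

```python
def format_uk(digits: str) -> str:
    """Group an 11-digit UK number according to its area-code length.

    Assumed, simplified rules (taken from the examples in this answer):
      02...   -> 3-digit code, then 4 + 4
      011...  -> 4-digit code, then 3 + 4
      others  -> 5-digit code, then 3 + 3
    """
    if len(digits) != 11 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected exactly 11 digits, got %r" % digits)

    if digits.startswith("02"):
        sizes = (3, 4, 4)
    elif digits.startswith("011"):
        sizes = (4, 3, 4)
    else:
        sizes = (5, 3, 3)

    groups, pos = [], 0
    for size in sizes:
        groups.append(digits[pos:pos + size])
        pos += size
    return " ".join(groups)

for stored in ("02012345678", "01151234567", "01332123456", "07812123456"):
    print(format_uk(stored))
```

Users who prefer 12 12 12 style grouping would need a per-user preference on top of this.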

ChrisF
+4  A: 

What's your userbase?

If they're going to be limited geographically (i.e., US-only) and you're going to validate numbers strictly, then format the number canonically for them -- i.e., strip out any formatting they used (like periods between numbers...) and put in the dashes (do not fail validation if they don't stick to your formatting... that's just mean). I'd store that cleaned-up version in the DB as well, not a stripped number; it makes your life a bit easier when generating custom reports, etc.

If you might have users/numbers from all over the world, it might be better to save the formatting they used. Also don't forget the case that sometimes US residents are currently traveling and using a foreign number: don't block them unintentionally.

Either way: make sure you DON'T define the column as numeric, or make it too small. International numbers with formatting can easily be over 16 chars long.
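
A quick Python illustration of why a numeric type is the wrong fit, using the UK number from another answer in this thread (the variable names are only for the example):

```python
# The same number, as a user might type it and as stripped national digits.
formatted = "+44 (0)1394-726629"
national = "01394726629"

# Treating the number as an integer silently drops the leading 0,
# so the value can no longer be dialled as stored.
as_number = int(national)
round_tripped = str(as_number)

print(round_tripped)    # the leading 0 is gone
print(len(formatted))   # longer than a 16-character column allows
```

A text column with some headroom avoids both problems.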

Rob Whelan
+2  A: 

I would keep the original entered mess but would also insert a cleaned-up form in the database, one that keeps only the digits, without punctuation or spaces. Using the cleaned form allows easy lookups without worrying about the different styles people might enter.
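
Something along these lines, sketched in Python with an in-memory SQLite database. The table and column names (`contacts`, `phone_raw`, `phone_digits`) and the helper functions are invented for the example.

```python
import re
import sqlite3

def digits_only(raw: str) -> str:
    """Cleaned form: keep the digits, drop punctuation and spaces."""
    return re.sub(r"[^0-9]", "", raw)

conn = sqlite3.connect(":memory:")
# Both columns are TEXT so leading zeros survive.
conn.execute(
    "CREATE TABLE contacts (name TEXT, phone_raw TEXT, phone_digits TEXT)"
)
conn.execute("CREATE INDEX idx_contacts_digits ON contacts (phone_digits)")

def add_contact(name: str, raw: str) -> None:
    """Store the number as typed, plus the cleaned form for lookups."""
    conn.execute(
        "INSERT INTO contacts VALUES (?, ?, ?)",
        (name, raw, digits_only(raw)),
    )

def find_by_phone(raw: str) -> list:
    """Look up by cleaned digits, whatever style the searcher typed."""
    cur = conn.execute(
        "SELECT name, phone_raw FROM contacts WHERE phone_digits = ?",
        (digits_only(raw),),
    )
    return cur.fetchall()

add_contact("Alice", "0115 123 4567")
print(find_by_phone("0115-1234567"))
```

The search matches even though the punctuation differs, and the row still carries the number exactly as Alice entered it.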

mP
A: 

Validate the input but allow a wide array of formats. Store it as the user typed it and then reformat the output as needed.

Let's say a user typed his number while registering for a public phonebook application. I would display it as the user typed it in the text field on his 'edit my profile' page, for example, but I would display it reformatted to a standard format on the public phonebook list.

Josef Sábl
+1  A: 

My gut instinct is to canonicalize according to the local standards of the entity, then render in the canonical representation, modulo usability.

Paul Nathan
A: 

I usually like to store the number stripped and then format it for display. Since I don't usually build applications for worldwide use, I don't generally have to worry about the format. But for an application used all around the world, I would probably build a formatting module that formats according to the phone number's locale.

Jeff Hornby
+2  A: 

Separation of Duties - Content and Rendering

Store both the number in a canonical format and its display format mask.

Gains:

  • Canonical format for consistency, quality, and ease of analysis
  • Format retained from end-user perspective
  • Format re-usable to display other phone numbers in end-user preferred method
  • Other format masks can be used to display canonical number to other users with a need to see the phone number

Pains:

  • Parsing the phone number to the canonical format
  • Parsing out the display format mask (not too painful in combination with above bullet)
  • Storing the display format as an end-user preference
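
One possible way to do the two parsing steps, sketched in Python. The `#` placeholder convention and the names `split_number` and `apply_mask` are assumptions for this example; input containing a literal `#` would need a different placeholder.

```python
DIGITS = "0123456789"

def split_number(raw: str) -> tuple:
    """Split user input into (canonical digits, display format mask).

    Each digit becomes '#' in the mask; everything else is kept as typed.
    """
    digits = "".join(ch for ch in raw if ch in DIGITS)
    mask = "".join("#" if ch in DIGITS else ch for ch in raw)
    return digits, mask

def apply_mask(digits: str, mask: str) -> str:
    """Render canonical digits through a mask.

    Falls back to the bare digits when the mask has a different number
    of '#' slots than there are digits.
    """
    if mask.count("#") != len(digits):
        return digits
    remaining = iter(digits)
    return "".join(next(remaining) if ch == "#" else ch for ch in mask)

digits, mask = split_number("0115 123 4567")
print(digits, "|", mask)
# Re-use the same user's mask to display somebody else's number.
print(apply_mask("01169876543", mask))
```

The mask would be stored as the end-user preference and the digits as the canonical value.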
SetProcessor