views:

98

answers:

4

Almost every time that I build an application, I find myself creating an 'address' table to contain addresses for the system.

What columns do you create within your 'address' table (or tables if you normalize addresses over more than one) and what is your reasoning behind it?

+4  A: 
  • line 1
  • line 2
  • line 3
  • locality
  • region
  • postcode
  • country (being a foreign key into a country table)

Excluding country, all of these fields are text (varchar).

This assumes that the recipient is stored elsewhere.

It can also be appropriate to, say, lose one of the lines and replace it with:

  • street number
  • street name

as that can help with validation, finding the address via Web Services and so on.

cletus
Have you ever considered aggregating the 'lines' into a single text column and including newlines? Is there a reason that you favor this design?
Josiah
The reason for multiple lines is that the user will put in their address 99.9% of the time how to correctly format it if you were going to mail them something, which is usually the format you want it in.
cletus
@Josiah: cletus recommendation is sound. An address can compromise of up to 4 optional lines: c/o line, RR (rural route), street, suite for example.
OMG Ponies
@rexem: I agree that this is a sound recommendation, however I think that by having a single 'text' column that contains the lines separated by newlines allows for greater flexibility (for example should a user require to have 4 lines of address). How this information is displayed to the user is not important, and can be displayed as 3 lines. Its a lot easier to change a form than to change the database and all the refactoring associated with that.
Josiah
+2  A: 

I'm going to toss in an "it depends" answer. If your application is just concerned with layout of properly printed address labels, a set of lines (line1, line2, line3, etc.) could be sufficient. A text blob might work as well, depending on how the data is entered. Give the user a box and let them type? Make them fit it in 3, 4, 5 lines? Whatever.

However, if you want to be able to "do stuff" with the data, such as sort by zipcode, analyze distribution by city, state, and/or country, or track how many digits in in your street addresses (10 Main St. vs. 54321 Main St.), then you'd want separate columns for each significant piece of information.

Seems likely the requirements wil be "include space for an address", and decisions regarding what actually gets done with the addresses will come up later... at which time they'll want to be able to sort/count/exponentiate/whatever, even if they will never actually do it. As per the link(s) referenced in other posts, once you go international it can get very complex indeed. I'd say try and keep it as simple as reasonable, where "reasonable" depends on the business reasoning behind the requirements.

Philip Kelley
A: 

My Two Cents

I have completed the design of the table that I was creating for this particular project, and went with an approach that combines simplicity with support for internationalization and coordinates.

  • lines: Contains the detailed lines of the address, lines are separated by newlines
  • locality: The locality (suburb/town/city) in which this address is located
  • region: The region (state/provence) in which this address is located
  • postal_code: The postal code in which this address is located
  • country: The country in which this address is located
  • accuracy: Indicates the accuracy of the coordinates within this address
  • lattitude: The lattitude coordinate for this address
  • longitude: The longitude coordinate for this address

As Philip Kelly pointed out, the design of the table is not just about input but what operations you want to perform on the data.

I plan to use the coordinates at a later time to pinpoint address locations on a map, additionally this provides location data in a more 'computer friendly' format. The address lines are designed to be how the address would appear for an end user or on a mailing label and I don't need to "do stuff" with this data.

Josiah
+1  A: 

In an ideal world every application would store addresses in a canonical format such as UPU or BS7666. It is considerably easier to amalgamate a structured address into a single string for printing than it is to subsequently extract address elements from a single blob of text. Because sooner or later somebody, maybe not you, will want to "do stuff" with the address information. Data warehouses are very fashionable these days.

Unfortunately, implementing something like BS7666 usually requires address validation software. It is unreasonable to ask a regular user to fit a address to a procrustean format: we cannot expect they will understand the difference between a locality, a town and a post town. And it is misleading to tout something as being in a standard format when it hasn't been validated as such.

But go for some form of structure. Also, at least validate the postcode/zip with regex to ensure it is in the right format.

APC