views:

451

answers:

9

For a while, I've been told by a number of people that a listing of U.S. States (and territories) should be stored in a database table and cached for applications that use the information. The only reasons they give me for this is to promote normalization and because "it's how we've always done it".

Now if the list changes often because the scope of the application grows internationally (say to include Canadian Provinces), I can understand abstracting the list to a data table that would also indicate a country identifier as well. However, if the list is pretty much locked and only used on 1 screen of the application, is it worth doing the query and caching? Is the difference between storing an SMALLINT foreign key that much better than a CHAR(2)? Is it always practical?

Was just pondering this trend that I've seen with the companies that I've worked with.

A: 

You get to use foreign keys in the database.

Jason Punyon
How is this a benefit if the states table would have no other information? It would just cause a join when quering related information and that's what I'm wondering if it would be justified. Obviously the join against 50 or so records is minor but still something. Really focusing on the caching and querying aspects at the moment as well.
JamesEggers
@JamesEggers: Because at anytime someone (even you) can come along and insert a record in the database that won't map to your state enumeration (exception). I was arguing the use of foreign keys on a data integrity basis.
Jason Punyon
+4  A: 

If you want to attach more information such as state-population or economic output to the primary information then an FK makes this expansion (and subsequent queries) simpler and more extensible

When you can normalize -- Why not!

Aiden Bell
A: 

We might add a 51st state.

kmarsh
Geez, the other guy mentions NAFTA and gets a +1 for his additional states idea!
kmarsh
+1  A: 

Was there really a benefit to use 4 digits for the year in a date? It did not seem likely in this seventies app in COBOL to survive till y2k.

What if Bush's plans to the North American Free Trade Union (NAFTA) and it's currency union with the Amero is not the last step and BC, Ontario etc.become states?

Ralph Rickenbach
Or New York, Pennsylvania, etc. become provinces? :-)
Graeme Perrow
That for sure would be the better way to go. Just wait another view months and bailouts, and it will be the more probable too.
Ralph Rickenbach
A: 

if later you want to add US territories you just insert into the table and tada!

Also as Aiden said if you want more info attached to those records, it will be easier to build off of when its in the DB than if its hardcoded

ErsatzRyan
A: 

I say no. If it's a single application that you will be maintaining and you aren't doing anything more complex than filtering a query by state, you can just as soon add/remove states from a hard-coded array. However, if you are doing a lot of complex joins involving states as keys, then you probably would want to put them in a table.

jlleblanc
+2  A: 

A few reasons you might want to choose the database solution over the hard-coded one.

Extensibility reasons:

  • Programmatically adding records (although new states are not very likely to appear within the life-span of your application, but users sometimes have strange needs). This might even be left to the users of your application through a dedicated screen. If the values are hard-coded, you'll have to deliver a new version of your software.
  • Adding information to records. For instance, you might be asked to add a user-editable population field.

Integrity reasons:

  • Foreign keys will give you integrity constraints: you're sure to never add a record with a wrong state code if you have a foreign key. Note that you might use the codes as keys, which would save you the joins problem.

Still the choice depends on the size, number of users and life-span of your application. The bigger those figures, the more you should consider the database solution.

Steph
+6  A: 

I put them in the database, for a few reasons:

  1. It's just good normalization practice.

  2. Why should I hardcode any data in my application if I can avoid that by putting it in the database?

  3. Personally, I like the comfort of the guaranteed uniqueness property. When two tables point to the same foreign key, I know that they are referencning the same thing. When two happen to share the same two character code... well, then it's up to the application.

  4. Constraint Checks.... lots of two character codes are illegal, but the database won't be able to help you with that. It's a lot harder to have an illegal foreign key entry.

  5. Speed. I haven't benchmarked, so I may be wrong, but I bet having the database combine entries for you is faster than you doing it yourself. If you are using the state code for anything else (e.g., to get the non-abbreviated name of the state, or to maintain a list of valid zipcodes for that state ), a join is probably faster than whatever you would write.

Chris Arguin
+1  A: 

I completely agree that it is unlikely enough that the states list will change in our lifetime that you can hard code the list.

But a reason to put it in a database would be so you could link the state abbreviation with the full state name. Then you can use the state abbreviation as the unique key and do joins on it. This would be worthwhile normalization.

I've never heard a compelling reason not to use the state abbreviation as a unique key, even though it goes against "proper" database design. Just like I've never seen a database normalized to the fifth level that had any sort of performance. Although fifth normal form is a neat and educational exercise.