ansaurus

Question

Database Design: Composite key vs one column primary key

Answer 1

+1 A:

I recommend adding a new primary key field to the Cities table that will be simply auto-incremental. The KISS methodology (keep it simple).

Any other solution is cumbersome and confusing in my opinion.

Roee Adler 2009-08-12 04:46:20

Thnaks, but this doesn't help for our setup- it is important that the idAreaCode be unique.

Sam 2009-08-13 16:31:15

Answer 2

+2 A:

Having a composite key could be problematic when you want to reference that table, since the referring table would have to have all columns the primary key has.

If that's the case, you might want to have a sequence primary key, and have the idAreaCode and idStates defined in a UNIQUE NOT NULL group.

Aviad Ben Dov 2009-08-12 04:48:02

Thanks! Yours and Walter Mitty's suggestions seems the most practical.

Sam 2009-08-13 16:33:53

Answer 3

A:

Definitely use an identity column:

id int IDENTITY(1,1)

I'm not sure how many cities will eventually be in your database, but if you add a separate primary key, it may make sense to have it NONCLUSTERED for performance reasons.

If your db access pattern is usually through the state, I'd create a clustered index on the idStates so that you don't decrease in performance from your previously well indexed table.

jkelley 2009-08-12 04:54:13

Thanks, I'll look into your suggestion. (We currently have 2000+ cities in the db, which may double by next year).

Sam 2009-08-13 16:54:59

Answer 4

+1 A:

If you go with adding an additional column to the key so that you can add an additional record for a given city, then you're not properly normalizing your data. Given that you've now discovered that a city can be a member of multiple states, I would suggest removing any reference to a state from the Cities table, then adding a StateCity table that allows you to relate states to cities (creating a m:m relationship).

Adam Robinson 2009-08-12 05:19:55

This does appear to be the popular sentiment. Thanks.

Sam 2009-08-13 16:56:31

Answer 5

+3 A:

It sounds like you are gathering data for a telephone directory. Are you? Why are states important to you? The answer to this question will probably determine which database design will work best for you.

You may think that it's obvious what a city is. It's not. It depends on what you are going to do with the data. In the US, there is this unit called MSA (Metropolitan Statistical Area). The Kansas City MSA spans both Kansas City, Kansas and Kansas City, Missouri. Whether the MSA unit makes sense or not depends on the intended use of the data. If you used area codes in US to determine cities, you'd end up with a very different grouping than MSAs. Again, it depends on what you are going to do with the data.

In general whenever hierarchical patterns of political subdivisions break down, the most general solution is to consider the relationship many-to-many. You solve this problem the same way you solve other many-to-many problems. By creating a new table, with two foreign keys. In this case the foreign keys are IdAreacode and IdStates.

Now you can have one arecode in many states and one state spanning many area codes. It seems a shame to accpet this extra overhead to cover just one exception. Do you know whether the exception you have uncovered is just the tip of the iceberg, and there are many such exceptions?

Walter Mitty 2009-08-12 11:57:00

Thanks for posting. FYI - the database stores meteorological info of each city. The state table is important for (1) use in the mapping GUI (2) reducing query overhead when looking up a city (3) identifying a city with the same name in different state / country. So far this is the only unique situation I've come across where a city belongs to more than one state and it does seem to be a shame to add an extra table for this.

Sam 2009-08-13 16:28:37

What makes this exception a single city? Do both parts of the city have the same area code? Where do you get that data from?

Walter Mitty 2009-08-14 11:20:26

Sam 2009-08-14 14:45:59

Answer 6

+1 A:

Imtroduce a surrogate key. What are you going to do when area codes change numbets or get split? Using business keys as a primary key almost always is a mistake.

Your above summary is another example of why.

mP 2009-08-19 22:42:08

Answer 7

A:

I think it is best to add another table, countries. Your problem is an example why database normalization is important. You can't just mix and match different keys to one column.

So, I suggest you to create these table:

countries:

+------------+--------------+
| country_id | country_name |
+------------+--------------+

states:

+------------+----------+------------+
| country_id | state_id | state_name |
+------------+----------+------------+

cities

+------------+----------+---------+-----------+
| country_id | state_id | city_id | city_name |
+------------+----------+---------+-----------+

data

+------------+----------+---------+---------+----------+
| country_id | state_id | city_id | data_id | your_CSV |
+------------+----------+---------+---------+----------+

The bold fields are primary keys. Enter a standard country_id like 1 for US, 91 for india, and so on. city_id should also use their standard id.

You can then find anything belongs to each other pretty fast with minimal overhead. All data can then entered directly to data table, thus serving as one entry point, storing all the data into single spot. I don't know with mysql, but if your database support partitioning, you can partition data tables according to country_id or country_id+state_id to a couple of server arrays, thus it will also speed up your database performance considerably. The first, second, and third table won't take much hit on server load at all, and only serve as reference. You will mainly working on fourth data table. You can add data as much as you wish, without any duplicate ever again.

If you only have one data per city, you can omit data table and move CSV_data to cities table like this:

cities

+------------+----------+---------+-----------+----------+
| country_id | state_id | city_id | city_name | CSV_data |
+------------+----------+---------+-----------+----------+

Magician 2010-07-10 17:29:36

ansaurus

tags:

views:

answers:

Database Design: Composite key vs one column primary key

countries:

states:

cities

data

cities

related questions