tags:

views:

1570

answers:

9

What are the down sides of using a composite/compound primary key?

A: 

Need more specificity.

Taken too far, it can overcomplicate Inserts (Every key MUST exist) and documentation and your joined reads could be suspect if incomplete.

Sometimes it can indicate a flawed data model (is a composite key REALLY what's described by the data?)

I don't believe there is a performance cost...it just can go really wrong really easily.

Tom Carr
well I would say performance costs might go up as you have to keep indexes for X number of elements of the primary key, but I just mention this for completeness.
Arthur Thomas
True, but on the insert/update not the reads.
Tom Carr
Well, the index will be larger, so yes, you will pay more for reads.
0124816
Actually, that reminds me. Some composite indexes only work if the first key is included in the index search (A, B). If you just use B then it has to do a full search without the index. Postgres is like this and it may also depend on index type but I am not sure. So it could affect reads. hehe :)
Arthur Thomas
+1  A: 

I would recommend a generated primary key in those cases with a unique not null constraint on the natural composite key.

If you use the natural key as primary then you will most likely have to reference both values in foreign key references to make sure you are identifying the correct record.

Arthur Thomas
+5  A: 
  1. Could cause more problems for normalisation (2NF, "Note that when a 1NF table has no composite candidate keys (candidate keys consisting of more than one attribute), the table is automatically in 2NF")
  2. More unnecessary data duplication. If your composite key consists of 3 columns, you will need to create the same 3 columns in every table, where it is used as a foreign key.
  3. Generally avoidable with the help of surrogate keys (read about their advantages and disadvantages)
  4. I can imagine a good scenario for composite key -- in a table representing a N:N relation, like Students - Classes, and the key in the intermediate table will be (StudentID, ClassID). But if you need to store more information about each pair (like a history of all marks of a student in a class) then you'll probably introduce a surrogate key.
Yacoder
surrogate key! I couldn't remember the word at the moment, thanks! :)
Arthur Thomas
A: 
  1. when you se it on a diagram are less readable
  2. when you use it on a query join are less readable
  3. when you use it on a foregein key you have to add a check constraint about all the attribute have to be null or not null (if only one is null the key is not checked)
  4. usualy need more storage when use it as foreign key
  5. some tool doesn't manage composite key
+1  A: 

There's nothing wrong with having a compound key per se, but a primary key should ideally be as small as possible (in terms of number of bytes required). If the primary key is long then this will cause non-clustered indexes to be bloated.

Bear in mind that the order of the columns in the primary key is important. The first column should be as selective as possible i.e. as 'unique' as possible. Searches on the first column will be able to seek, but searches just on the second column will have to scan, unless there is also a non-clustered index on the second column.

Matt Howells
+4  A: 

I think this is a specialisation of the synthetic key debate (whether to use meaningful keys or an arbitrary synthetic primary key). I come down almost completely on the synthetic key side of this debate for a number of reasons. These are a few of the more pertinent ones:

  • You have to keep dependent child tables on the end of a foriegn key up to date. If you change the the value of one of the primary key fields (which can happen - see below) you have to somehow change all of the dependent tables where their PK value includes these fields. This is a bit tricky because changing key values will invalidate FK relationships with child tables so you may (depending on the constraint validation options available on your platform) have to resort to tricks like copying the record to a new one and deleting the old records.

  • On a deep schema the keys can get quite wide - I've seen 8 columns once.

  • Changes in primary key values can be troublesome to identify in ETL processes loading off the system. The example I once had occasion to see was an MIS application extracting from an insurance underwriting system. On some occasions a policy entry would be re-used by the customer, changing the policy identifier. This was a part of the primary key of the table. When this happens the warehouse load is not aware of what the old value was so it cannot match the new data to it. The developer had to go searching through audit logs to identify the changed value.

Most of the issues with non-synthetic primary keys revolve around issues when PK values of records change. The most useful applications of non-synthetic values are where a database schema is intended to be used, such as an M.I.S. application where report writers are using the tables directly. In this case short values with fixed domains such as currency codes or dates might reasonably be placed directly on the table for convenience.

ConcernedOfTunbridgeWells
A: 

The main downside of using a compound primary key, is that you will confuse the hell out of typical ORM code generators.

David B
I asked this very question here on SO and a popular answer was that NHibernate plays very nicely with compound PKs (http://stackoverflow.com/questions/218100/orm-support-for-compound-primary-keys).
onedaywhen
A: 

To David B>> that's a major PLUS, not a downside ! ORM is bane and should be avoided like plague.

care to give any reasons?
Maximiliano Guzman
A: 

Take the example of a table with two candidate keys: one simple (single-column) and one compound (multi-column). Your question in that context seems to be, "What disadvantage may I suffer if I choose to promote one key to be 'primary' and I choose the compound key?"

First, consider whether you actually need to promote a key at all: "the very existence of the PRIMARY KEY in SQL seems to be an historical accident of some kind. According to author Chris Date the earliest incarnations of SQL didn't have any key constraints and PRIMARY KEY was only later addded to the SQL standards. The designers of the standard obviously took the term from E.F.Codd who invented it, even though Codd's original notion had been abandoned by that time! (Codd originally proposed that foreign keys must only reference one key - the primary key - but that idea was forgotten and ignored because it was widely recognised as a pointless limitation)." [source: David Portas' Blog: Down with Primary Keys?

Second, what criteria would you apply to choose which key in a table should be 'primary'? In SQL, the choice of key PRIMARY KEY is arbitrary and product specific. In ACE/Jet (a.k.a. MS Access) the two main and often competing factors is whether you want to use PRIMARY KEY to favour clustering on disk or whether you want the columns comprising the key to appears as bold in the 'Relationships' picture in the MS Access user interface; I'm in the minority by thinking that index strategy trumps pretty picture :) In SQL Server, you can specify the clustered index independently of the PRIMARY KEY and there seems to be no product-specific advantage afforded. The only remaining advantage seems to be the fact you can omit the columns of the PRIMARY KEY when creating a foreign key in SQL DDL, being a SQL-92 Standard behaviour and anyhow doesn't seem such a big deal to me (perhaps another one of the things they added to the Standard because it was a feature already widespread in SQL products?) So, it's not a case of looking for drawbacks, rather, you should be looking to see what advantage, if any, your SQL product gives the PRIMARY KEY. Put another way, the only drawback to choosing the wrong key is that you may be missing out on a given advantage.

Third, are you rather alluding to using an artificial/synthetic/surrogate key to implement in your physical model a candidate key from your logical model because you are concerned there will be performance penalties if you use the natural key in foreign keys and table joins? That's an entirely different question and largely depends on your 'religious' stance on the issue of natural keys in SQL.

onedaywhen