views:

457

answers:

9

Why is the rejection of composite keys in favor of all tables using a single primary key named id. Cause generally all ORM follow this..???

EDIT

I just started learning ruby on rails and in the book of agile develepment by pragmatic there is a line:--- Rails really doesn’t work too well unless each table has a numeric primary key. It is less fussy about the name of the column. Same kind of line i read when i was learning Doctrine.

EDIT2 please chk link too.. I m getting more and more confused about this thing:--- http://stackoverflow.com/questions/159087/composite-primary-keys-versus-unique-object-id-field

From the above link:--

*the primary key should be constant and meaningless; non-surrogate keys usually fail one or both requirements, eventually

if the key is not constant, you have a future update issue that can get quite complicated if the key is not meaningless, then it is more likely to change, i.e. not be constant; see above

take a simple, common example: a table of Inventory items. It may be tempting to make the item number (sku number, barcode, part code, or whatever) the primary key, but then a year later all the item numbers change and you're left with a very messy update-the-whole-database problem...

EDIT: there's an additional issue that is more practical than philosophical. In many cases you're going to find a particular row somehow, then later update it or find it again (or both). With composite keys there is more data to keep track of and more contraints in the WHERE clause for the re-find or update (or delete). It is also possible that one of the key segments may have changed in the meantime!. With a surrogate key, there is always only one value to retain (the surrogate ID) and by definition it cannot change, which simplifies the situation significantly.*

+1  A: 
  1. For an ORM a single identifying column with a consistent name like table_id is easier than a composite key. But every good ORM support composite keys.

  2. An simple PK can easily be 'autoincremented' by the database. This doesn't hold for a composite key.

  3. A simple PK is also easier to use in queries. When you need to join, you only have to use one column of both relations.

This is not to say that simple PKs are better than composite ones, though.

Exception e
+3  A: 

You can use both. In some cases when making an association between an entity you can use both entity keys as a composite key.

As a rule of thumb I use generated ids for entities and composite keys for relationships.

James Westgate
+6  A: 

I don't think there is a blanket statement that you should only ever use a single primary key named id.

Most people use a surrogate primary key as an auto generate int, because it isolates the primary key from ever needing to be changed, like if you make the PK the user name and they later changed their legal name. You would have to update the PK and all FK columns to reflect the new name. if you had used a surrogate primary key, you just update the user's name in one spot (because the tables join on the int not the name).

The size of a primary key is important because the PK is duplicated into every index you build on the table. If the PK is large (like a string) you have fewer keys per page in the index and the index will take more cache memory to store it. Ints are small.

Having a auto increment int PK lends itself to being a clustered index well, as rows are stored in this order and there is no need to go back and bump rows out of the way to insert a new row, you allays add to the table's end.

KM
Sequential guids work really well too, especially when you need to merge data sets from multiple databases into a single table.
Juliet
@Juliet, but they are large and waste index space. If possible it is best to combine a auto increment int ID with another column with identifies the database/company/etc (like a 1 or two byte int) where that row came from because it can use much less space than the guid, yet still be unique across all databases.
KM
I wouldnt recommend the use of guids except for edge cases
James Westgate
@KMI just started learning ruby on rails and in the book of agile develepment by pragmatic there is a line:---Rails really doesn’t work too well unless each table has a numeric primary key. It is less fussy about the name of the column.Same kind of line i read when i was learning Doctrine.
piemesons
A: 

Objects in OO programming have identity regardless of their contents. Rows (tuples) in relational databases are identified by their contents only. So when really doing ORM, i.e. mapping objects from an object-oriented programming language to a relational database, an extra ID must be provided as distinct from the fields and/or properties the object has in the program - unless one or more of those are somehow known to identify objects uniquely.

reinierpost
You certainly require a *primary key* on your table, but there is no reason this has to be a single column. Composite primary keys are very very common, especially on junction tables.
Iain Galloway
Please read again. I didn't say primary keys are required in relational database schemas (they aren't), nor that composite keys do not arise in designing relational database schemas (they do). I just wanted to say that if your relational database model is blindly *generated* from an OO class model, all tables will need to have surrogate IDs as their primary keys that do not correspond to any of the properties of the object classes. Junction tables will not be generated unless you treat collection-valued properties in a special way.
reinierpost
Do you have a special way of representing a m:m relationship without using junction tables?
Iain Galloway
Can you give an example of an m:n relationship in a class model?
reinierpost
High-level example: interface IGroupable<T> where T:IGroup { ICollection<T> Groups { get; } } interface IGroup<T> where T:IGroupable { ICollection<T> Items { get; } }
Iain Galloway
OK, thanks. So if you treat collection-valued properties in a special way, you will generate junction tables for them. Now please read again: "Junction tables will not be generated unless you treat collection-valued properties in a special way." Something serious seems to be blocking our communication here.
reinierpost
Sure, I'm missing you somewhere. You say I'm treating collections "specially", but I don't understand what your alternative is. How would you treat "collection-valued properties" such as those I suggested in such a way as to not require a junction table?
Iain Galloway
If you don't treat them specially, and translate them in a generic way, you won't end up with junction tables. The generic translation will translate each class as a table with an ID column plus one column for each field, so the exact columns will depend on the collection's internal implementation, e.g. a linked list will have Current and Next columns, an array Index and Value. The ID is used as the property's value.
reinierpost
+2  A: 

Well, it's basically about keeping JOINs simple - which one is simpler to understand:

SELECT
   p.ID, p.Name, p.City,
   c.ID, c.Country, c.ISOCode
FROM
   dbo.Parent p
INNER JOIN
   dbo.Child c on c.ParentID = p.ID

or

SELECT
   p.ID, p.Name, p.City,
   c.ID, c.Country, c.ISOCode
FROM
   dbo.Parent p
INNER JOIN
   dbo.Child c ON c.ParentName = p.Name
     AND c.ParentCity = p.City
     AND c.ParentCountry = p.Country

If you have composite primary keys, anyone joining to your table from a child table must "drag along" all those columns, and all those columns are also going to be present in the child table and the JOIN statements are pretty messy. Much better to have a single (even surrogate) key for the JOIN!

marc_s
Would you recommend using a surrogate key (rather than a composite key) as the PK for a junction table representing a many-many relationship? If so, why?
Iain Galloway
@Iain Galloway: not necessarily - junction tables for many-to-many relationships are one of the exceptional cases where you'd probably use a composite PK to your advantage. But there's no harm (except for extra columns) in adding a surrogate PK even in this case - it would allow you to more easily manage (e.g. delete) entries from that junction table
marc_s
+1  A: 

The only real limitation that I have run into using composite keys regards using an IN expression with a subquery. This is a problem, because a subquery in an IN expression must return a single column (at least in T-SQL).

SELECT
    emp.Name,
    emp.UserDomain,
    emp.UserID
FROM
    employee emp
WHERE
    ???? IN (SELECT e.UserDomain, e.UserID FROM ... /* some complex 
                                                       non-correlated subquery 
                                                       or CTE */
            )

There are always work-arounds, of course, but sometimes it could be an annoyance.

This is hardly a reason to avoid a composite key in places where it makes sense to use one.

Jeffrey L Whitledge
A: 

Your question is strongly related to the surrogate (or artificial) keys vs natural keys alternative. I think it's not that composite keys are less used, but that natural keys (be them composite or simple) are less favoured than artificial keys.

Traditional relational database theory dealt mostly with "natural" keys, (the ones which have meaning from the business-domain point of view) and in that scenario composite keys are frequently found... naturally.

But in the later years, database design has favoured (though not exclusively) the "artificial" (surrogate) key pattern, typically a sequential number that has no business meaning, only serves to uniquely identifies the record in the table (and perhaps the object in the upper layer).

leonbloy
Composite keys are still found "naturally" when using surrogate keys. Surrogate vs Natural keys is a completely separate debate!
Iain Galloway
Read the question: "Why is the rejection of composite keys in favor of all tables using a single primary key named id."The 'single primary key named id' is precisely a surrogate key.
leonbloy
I read the question carefully. He doesn't mention natural keys at all. The debates (natural vs surrogate and simple vs composite) are completely separate. Composite keys frequently occur in the absence of natural keys. For a relevant example see e.g. http://megocode3.wordpress.com/2008/01/04/understanding-a-sql-junction-table/ Can you think of a good reason to give that junction table a surrogate key of its own?
Iain Galloway
First, the surrogate vs natural "debate" IS certainly relevant for the question (take a look at the accepted answer) and is mandatory reading for the poster. Second, a joint table like that of your example certainly has a "natural" (in both senses of the word) PK, it does not need a surrogate key at all; however, the fact is that nowadays many people have the (good or bad) habit of "universal surrogate PK design", using an "id" PK (surrogate) for ALL tables (as the question notes); so that, even in your example, the "natural" key would not be used as a PK - just a UNIQUE restriction.
leonbloy
Certainly I have seen a movement towards using an autonumber PK for all tables, including junction tables. The OP has seen this too and is asking *why*.
Iain Galloway
A: 

I worked on an app with a 11 column primary key. It was always great fun retyping the list over and over and over every time I wanted to guarantee I was updating one row. It was a driver of bugs, MS-Access couldn't cope with more than 10 columns in a PK, etc.

Large composite keys are design smells that mean the table holds heterogenous entities or the designer wasn't really sure what it is that is unique about each entity. (Like assuming that hair color, eye color and body weight should be enough to unique identify an employee-- which isn't a good key because you'd need more and more and more columns to make it work and eventually that will include fields that are volatile and change a lot, like weight, or for some people hair color or lack there of.)

MatthewMartin
A: 

Although I agree with most of the reasons given by other respondents, my primary reason for preferring a single-column integer key is that it makes writing a user interface much, much easier.

If you are using some kind of list control to represent your data (a list, list view, combo box, etc) you can uniquely relate each entry back to its database representation through a single integer value stored with the item. Most pre-written components already allow you to attach an integer to each item and for those that don't, it's very easy to extend the component to do so.

If you're passing data between a server application and a web page, it's much easier to store the single identifying value in the the id attribute of the widget that represents the data than to have to compose and parse multi-value ids.

Larry Lustig