views:

451

answers:

5

I've been investigating the use of GUIDs as primary keys in databases. So far, the pros seem to outweigh the cons. However, I see one point where GUIDs may not be what I want.

In my application, users should be able to identify objects based on a user-friendly ID. So, for example, if they want to get a specific product without typing in the full name, they can use the product's ID. GUIDs aren't easy to remember for something like that.

The solution I've been thinking about is to use both a GUID and an auto-incrementing integer. The GUID would be the row's primary key, while the auto-incrementing integer would be an index used by the application's filtering functions. All SQL SELECT, UPDATE, DELETE statements would use the GUID, however.

The main reason I want to use GUIDs is to prevent clashes when merging two databases. If Database #1 and Database #2 both have a Product #2, the importer script would have to change the ID and all foreign keys referring to it. With GUIDs, I only have to change the user-friendly ID in the table itself, while foreign keys would use the GUID unique to each imported record and will therefore work without modification.

So, my question is: are there any major problems (besides the size of the GUID field and easy page fragmentation) with having an auto-incrementing integer index and a GUID primary key?

+8  A: 

I always tend to use surrogate primary keys in my database. That is: those primary keys have no actual meaning in the problem domain, and thus, those primary keys are never exposed to users. (If this surrogate primary key is of type GUID or an identity, I don't care; this depends on the requirements).

If you say that users should be able to identify objects based on a user-friendly ID, then, I think that this user-friendly ID is a value that belongs to your 'problem domain'. This means, that this ID should indeed be an attribute in your table, but it should not be used as the primary key in your table.

This also allows you to easily modify the value of such an user-friendly ID (if that should be necessary), without you having to worry about modifying related foreign keys as well.

Frederik Gheysels
A: 

In MySQL, you'll need to set your numeric ID as a PRIMARY KEY, as AUTO_INCREMENT may be only the PRIMARY KEY, which means it should also be NOT NULL.

You can still define a UNIQUE INDEX on your GUID column and use it anywhere, though an InnoDB table will be clustered on the numeric id, not on the GUID.

Quassnoi
Sorry, I forgot to mention that I'm using SQL Server 2008.
David Brown
A: 

There is a school of thought out there that says you should never expose your surrogate ID's to the outside world. So they'd say if you want a business ID, you should use something else for it.

This Wikipedia article, for example, says this:

Disassociation

The values of generated surrogate keys - because they are generated and arbitrary - have no relationship to the real-world meaning of the data held in a row. When inspecting another row holding a foreign key reference to a surrogate key, it is not possible to work out the meaning of it holding that reference simply by looking at the data in the row itself. A layer is added to this indirection for each foreign key join that one must navigate while attempting to make sense of a data item. This can also make auditing more difficult, as incorrect data is not obvious on inspection.

Surrogate keys are also not natural for data that is exported and shared. A particular difficulty is that two instances of a schema can hold records which logically mean the same thing (that is - they are the same in a business sense), but which have a different key owing to the history of how the keys were assigned. An approach to dealing with this is to adopt the rule that surrogate keys are never exported or imported: they are never exposed outside the database except as transient data (most obviously, in executing applications that have a "live" connection to the database).

Charlie Flowers
+1  A: 

"Why do "users should be able to identify objects based on a user-friendly ID" ?

In my opinion, your users should itentify records using codes.

Let's say your database contains products (as you mentionned it in Question). Wouldn't it be better if they had codes to represent products, that the users could enter.

Let's say you have tables and chairs, as a user, i would prefer using tbl and chr than 1 and 2 to identify what I am talking about.

Martin
Look up functions are performed in each module. So, if a user is attempting to look up an ID in the "Products" module, the application already knows they want a product.
David Brown
A: 

To be more specific about your question, yes there are other problems with using GUIDs as primary keys in databases:

http://www.sqlskills.com/BLOGS/KIMBERLY/post/GUIDs-as-PRIMARY-KEYs-andor-the-clustering-key.aspx

The problem is not so much with using a GUID as primary key, its using a non-sequential GUID as the clustered index for a table.

The takeaway here is to either use other fields as the clustered index, or use a sequential GUID to avoid this fragmentation.

BradC