views:

401

answers:

8

What are the disadvantages of using GUIDs? Why not always use them by default?

+11  A: 

Integers join a lot faster, for one. This is especially important when dealing with millions of rows.

For two, GUIDs take up more space than integers do. Again, very important when dealing with millions of rows.

For three, GUIDs sometimes take different formats that can cause hiccups in applications, etc. An integer is an integer, through and through.

A more in depth look can be found here, and on Jeff's blog.

Eric
+8  A: 
  1. GUIDs are four times larger than an int, and twice as large as a bigint.

  2. GUIDs are really difficult to look at if you are trying to troubleshoot tables.

Robert Harvey
+1  A: 

GUIDs are big and slow compared to ints - so use them when they're needed, eschew them when they're NOT needed, it's as simple as that!

Alex Martelli
+3  A: 

GUIDS can simplify generating keys ahead of time, or generating keys offline, or in a cluster, without risk of collision. There may also be a slight security benefit, with all keys being unguessable.

The disadvantage is that it's harder to read/type and on many of your tables you may later realize a need to go back and generate human-friendly keys anyways. They'll also evenly distribute your records in a table, which may make it slower to query multiple records that were inserted at around the same time vs having an autonumber key where your records are in order of time inserted.

David
A: 

The biggest performance hit you'll see with GUIDs as a primary/clustered key is inserting records in large tables. It can be a heavy task to reindex since your key will fall somewhere in the middle

jayrdub
+4  A: 

Kimberly L. Tripp: GUIDs as PRIMARY KEYs and/or the clustering key

Have you read the related links on the right hand side?

gbn
Good article...
Robert Harvey
+5  A: 

GUIDs are great from a programmer's perspective - they're guaranteed to be (almost) unique, so why not use them everywhere, right?

If you look at it from the DBA perspective and from the database standpoint, at least for SQL Server, there are a few things to consider:

  • GUIDs as primary key (which is responsible for uniquely identifying a single row in your table) might be okay - after all, they're unique, right?
  • however, SQL Server also has the concept of the clustering key, which physically orders the data in your table; if you don't know about this, and don't do anything explicitly, your primary key becomes your clustering key.

Kimberly Tripp - world-known expert on SQL Server indexing and performance - has a great many blog posts on why a GUID as your clustering key is a really bad idea - check out her blog on indexes.

Most notably, her best practices for a clustering key are:

  • narrow
  • static
  • unique
  • ever-increasing

GUIDs are typically static and unique - but they're neither narrow (16 byte vs. 4 byte for a INT) nor ever-increasing. Due to their nature, they're unique and (pseudo-)random.

The narrow part is important because the clustering key will be added to each and every index page for each and every non-clustered index on your table - and if you have a few of those, and a few million rows in your table, this amounts to a massive waste of space - and not just on your disk, but also in your SQL Server's RAM.

The ever-increasing part is immportant, because the randomness of the GUIDs causes a lot of fragmentation in your indices, which negatively affects your performance all around. Even the newsequentialid() of SQL Server 2005 and up doesn't really create sequential GUIDs all around - they're sequential for a while and then there's a jump again, causing fragmentation (less than totally random GUIDs, but still).

So all in all, if you're really concerned with your SQL Server performance, using GUIDs as a clustering key is a really bad idea - use INT IDENTITY() instead, possibly using a GUID as the primary (non-clustered) key if you really have to.

Marc

marc_s
A: 

Using GUIDs as a primary key will eventually lead to your database crashing because the drive becomes too fragmented. This is a condition known as thrashing.

Chris Love