views:

1681

answers:

1

For our next project, we're considering to switch to UUIDs as primary keys for database tables. The project will run on HSQLDB, Derby, DB2 and Oracle, maybe MySQL. Data will not have to be migrated between databases (often), so the "uniqueness" over many DBs is not so much an issue for us.

The main advantage for us currently is that each client can generate UUIDs and put them into the persisted objects which saves a round-trip during object generation. Also, having the client's MAC address and the time stamp in the UUID might help when hunting an error (since we could see where an object came from, i.e. in which log file to look).

Unfortunately, none of these databases supports a native UUID type. Oracle uses RAW(16), in DB2 and Derby, I can use CHAR(16) FOR BIT DATA, HSQLDB has BINARY(16).

Our current concerns are:

  1. Handling of the binary data when manually selecting rows in an SQL tool
  2. No more select * (without writing our own SQL tool)
  3. Tables will be twice the size

Can someone confirm our concerns? Has anyone ever had to implement UUIDs across a multitude of different databases? What were your experiences?

+1  A: 

There are plenty of issues to consider when your PK is a UUID or a IDENTITY so you'll have to look deeper into this but on the surface:

  1. if you use a UUID as the PK then you have to consider the "extents" as they apply to the physical layout of the records on the disk. As the UUID may or maynot appear to be random values; when the pages fill they will have to be rebalanced. This might be better implemented as a unique-key instead of primary key. (extents = wasted disk space too)
  2. if you decide to use an IDENTITY value then you you don't have to worry about extents, however, there are other issues with the balancing of the index trees. There should be a lot less wasted space here. [updated] there may be race conditions here and it is possible to develop holes in the numbering of there are rollbacks.

Anyway, there are a number of options for the actual layout of the tables. You'll have to read up to get the exact setup you want.

[update 2009.05.18] Building a PK based on something like date() can have even worse consequences. I just asked: "is there a strategy for creating proper database indexes?"

Richard
+1 for bringing up new issues :) Any suggestion where I can find more information about this?
Aaron Digulla
Most reference manuals discuss the topics in a loose "if ... then..." style, however, quantifying the impact has always been an exercise for the reader. Which I've experienced with both Oracle and SQL Server.
Richard