I wish to store UUIDs created using java.util.UUID in a HSQLDB database.

The obvious option is to simply store them as strings (in the code they will probably just be treated as such), i.e. varchar(36).

What other options should I consider, taking into account database size and query speed? Neither is a huge concern given the volume of data involved, but I would like to at least consider them.

A: 

I think the easiest thing to do would be to create your own domain, which gives you your own UUID "type" (not really a type, but almost).

You should also consider the answers to this question, especially if you plan to use the UUID instead of a "normal" primary key:

INT, BIGINT or UUID/GUID in HSQLDB?

HSQLDB: Domain Creation and Manipulation

jitter
+3  A: 
  1. I would recommend char(36) instead of varchar(36), since the string form of a UUID is always exactly 36 characters. Not sure about hsqldb, but in many DBMSs char is a little faster.

  2. For lookups, if the DBMS is smart, then you can use an integer value to "get closer" to your UUID.

For example, add an int column to your table alongside the char(36) column. When you insert a row, store uuid.hashCode() in the int column. Then your searches can look like this:

WHERE intCol = ? and uuid = ?

As I said, if hsqldb is smart like mysql or sql server, it will narrow the search by the intCol and then only compare at most a few values by the uuid. We use this trick to search through million+ record tables by string, and it is essentially as fast as an integer lookup.
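A minimal sketch of how this could look from Java over JDBC. The table name `items` and the column names `uuid_hash` and `uuid` are hypothetical, and the sketch assumes the int column is indexed, since otherwise it cannot narrow the search. It uses `java.util.UUID#hashCode()`; any deterministic hash would do as long as the same one is used for inserts and lookups.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.UUID;

public class UuidHashLookup {
    // Hypothetical table and column names, for illustration only.
    static final String INSERT_SQL =
        "INSERT INTO items (uuid_hash, uuid) VALUES (?, ?)";
    static final String SELECT_SQL =
        "SELECT uuid FROM items WHERE uuid_hash = ? AND uuid = ?";

    // The int value stored next to the char(36) value.
    static int hashKey(UUID uuid) {
        return uuid.hashCode();
    }

    static void insert(Connection conn, UUID uuid) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setInt(1, hashKey(uuid));
            ps.setString(2, uuid.toString());
            ps.executeUpdate();
        }
    }

    static boolean exists(Connection conn, UUID uuid) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(SELECT_SQL)) {
            // The int comparison narrows the candidates; the string
            // comparison resolves any hash collisions.
            ps.setInt(1, hashKey(uuid));
            ps.setString(2, uuid.toString());
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next();
            }
        }
    }
}
```

The full uuid comparison stays in the WHERE clause because different UUIDs can share the same 32-bit hash.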

karoberts
I like that idea. Don't think we'll need it here due to the volume of records involved, but I'll definitely remember it for the future! Thanks.
William
+2  A: 

You have a few options:

  • Store it as a VARCHAR(36), as you have already suggested. Assuming one byte per character, this takes 36 bytes (288 bits) of storage per UUID, not counting overhead.
  • Store each UUID in two BIGINT columns, one for the least-significant bits and one for the most-significant bits; use UUID#getLeastSignificantBits() and UUID#getMostSignificantBits() to grab each part and store it appropriately. This will take 128 bits of storage per UUID, not counting any overhead.
  • Store each UUID as an OBJECT; this stores it as the binary serialized version of the UUID class. I have no idea how much space this takes up; I'd have to run a test to see what the default serialized form of a Java UUID is.
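As a rough sketch of the first two options, the conversions below use only `java.util.UUID`. The class and method names are made up for this example; how the values are bound to columns is left to the surrounding JDBC code.

```java
import java.util.UUID;

public class UuidColumns {
    // VARCHAR(36) option: the canonical text form is always 36 characters.
    static String toText(UUID uuid) {
        return uuid.toString();
    }

    static UUID fromText(String text) {
        return UUID.fromString(text);
    }

    // Two-BIGINT option: index 0 holds the most-significant bits,
    // index 1 the least-significant bits.
    static long[] toLongs(UUID uuid) {
        return new long[] {
            uuid.getMostSignificantBits(),
            uuid.getLeastSignificantBits()
        };
    }

    static UUID fromLongs(long mostSigBits, long leastSigBits) {
        return new UUID(mostSigBits, leastSigBits);
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        long[] parts = toLongs(id);
        System.out.println(toText(id).length());
        System.out.println(fromLongs(parts[0], parts[1]).equals(id));
    }
}
```

With the two-column layout, a lookup has to match both columns, so an index covering both is the natural choice.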

The upsides and downsides of each approach depend on how you're passing the UUIDs around your app. If you're passing them around as their string equivalents, then the downside of the VARCHAR(36) approach, which needs more than double the storage (288 bits versus 128), is probably outweighed by not having to convert them each time you do a DB query or update. If you're passing them around as native UUIDs, then the BIGINT method is probably pretty low-overhead.

Oh, and it's nice that you're considering speed and storage space, but as many better than me have said, it's also good that you recognize these might not be critically important given the amount of data your app will store and maintain. As always, micro-optimization only matters if skipping it leads to unacceptable cost or performance. Otherwise, the storage space of the UUIDs and the time it takes to maintain and query them in the DB are reasonably low-importance, given the low cost of storage and the ability of DB indices to make your life much easier. :)

delfuego
Umm...in what universe does 36 * 8 = 256? 36 * 8 = 288 in this one :P
MetroidFan2002
Alas, I apparently live in my own universe. :/ I'll edit it.
delfuego
A: 

Using BINARY(16) is another possibility. At 16 bytes per UUID it takes less storage space than the character types. Use CREATE TYPE UUID .. or CREATE DOMAIN UUID .. as suggested above.
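For the BINARY(16) route, the UUID has to be turned into a 16-byte array and back. A minimal sketch, assuming big-endian order with the most-significant bits first (the class and method names are invented for this example):

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class UuidBytes {
    // Packs the UUID into 16 bytes, most-significant bits first.
    // ByteBuffer is big-endian by default.
    static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
            .putLong(uuid.getMostSignificantBits())
            .putLong(uuid.getLeastSignificantBits())
            .array();
    }

    // Rebuilds the UUID from a 16-byte array produced by toBytes.
    static UUID fromBytes(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes, got " + bytes.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long mostSigBits = buf.getLong();
        long leastSigBits = buf.getLong();
        return new UUID(mostSigBits, leastSigBits);
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        System.out.println(fromBytes(toBytes(id)).equals(id));
    }
}
```

The resulting array could then be bound with `PreparedStatement#setBytes` and read back with `ResultSet#getBytes`.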

fredt