I understand the differences between the two from the docs.

uuid1():
Generate a UUID from a host ID, sequence number, and the current time

uuid4():
Generate a random UUID.

So uuid1 uses machine/sequence/time info to generate a UUID. What are the pros and cons of using each?
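For reference, here is a minimal sketch of the two calls side by side, using only the standard-library uuid module. The variable names are just for illustration.

```python
import uuid

# Version 1: built from a host ID (normally the MAC address),
# a clock sequence, and the current time.
u1 = uuid.uuid1()

# Version 4: built from random bits.
u4 = uuid.uuid4()

# The version number is encoded in the UUID itself.
print(u1, "version", u1.version)
print(u4, "version", u4.version)
```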

I know uuid1() can raise privacy concerns, since it's based on machine information. I wonder if there are any subtler considerations when choosing one or the other. I just use uuid4() right now, since it's a completely random UUID. But I wonder if I should be using uuid1() to lessen the risk of collisions.

Basically, I'm looking for people's tips for best-practices on using one vs. the other. Thanks!

+6  A: 

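uuid1() will not produce collisions, provided the host ID (normally the MAC address) is unique and you don't create too many UUIDs at the same instant. I wouldn't use it if it's important that there's no connection between the UUID and the computer that generated it.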

uuid4() generates, as you said, a random UUID. The chance of a collision is small enough that you shouldn't worry about it. The catch is that a bad random-number generator makes collisions more likely.

This excellent answer by Bob Aman sums it up nicely. (I recommend reading the whole answer.)

Frankly, in a single application space without malicious actors, the extinction of all life on earth will occur long before you have a collision, even on a version 4 UUID, even if you're generating quite a few UUIDs per second.
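To put a rough number on "really small", here is a sketch of the usual birthday-bound approximation. It assumes all 122 random bits of a version 4 UUID come from an ideal random source, and the function name collision_probability is made up for this example.

```python
import math

# A version 4 UUID has 128 bits, 6 of which are fixed
# (version and variant), leaving 122 random bits.
RANDOM_BITS = 122
SPACE = 2 ** RANDOM_BITS

def collision_probability(n):
    """Approximate chance of at least one duplicate among n random UUIDs.

    Birthday approximation: p ~= 1 - exp(-n * (n - 1) / (2 * SPACE)).
    expm1 keeps precision when the result is extremely small.
    """
    return -math.expm1(-n * (n - 1) / (2 * SPACE))

# Example: a trillion UUIDs from a good random source.
print(collision_probability(10 ** 12))
```

The approximation only holds for a good random source. With a weak or badly seeded generator, the effective number of random bits is lower and the real risk is higher.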

Georg
Sorry, I commented without researching fully - there are bits reserved to keep a version 4 uuid from colliding with a version 1 uuid. I will remove my original comment. See http://tools.ietf.org/html/rfc4122
Mark Ransom
@gs: Yeah, makes sense with what I was reading. uuid1 is "more unique", while uuid4 is more anonymous. So basically use uuid1 unless you have a reason not to. @mark ransom: Awesome answer, didn't come up when I searched for uuid1/uuid4. Straight from the horse's mouth, it seems.
Rocketmonkeys
+2  A: 

One case where you might consider uuid1() rather than uuid4() is when UUIDs are produced on separate machines, for example when multiple online transactions are processed on several machines for scaling purposes.

In such a situation, duplicate IDs become more likely, both because of poor choices in how the pseudo-random number generators are initialized and because of the potentially larger number of UUIDs produced.

Another benefit of uuid1() in that case is that the machine where each UUID was originally produced is implicitly recorded (in the "node" part of the UUID). This, along with the time information, can help, if only with debugging.
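As a rough sketch of that debugging use, the node and timestamp can be read back from a version 1 UUID with the standard uuid module. The helper name describe_uuid1 is hypothetical. Note that if Python cannot determine a MAC address, the node is a random 48-bit number instead, so it does not always identify a real machine.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 timestamps count 100-nanosecond intervals
# since 1582-10-15 00:00:00 UTC (RFC 4122).
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)

def describe_uuid1(u):
    """Return (node as 12 hex digits, creation time in UTC) for a v1 UUID."""
    if u.version != 1:
        raise ValueError("not a version 1 UUID")
    # u.time is in 100 ns units; integer-divide by 10 to get microseconds.
    created = GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)
    node = "%012x" % u.node
    return node, created

node, created = describe_uuid1(uuid.uuid1())
print("node:", node)
print("created:", created.isoformat())
```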

mjv