views:

4555

answers:

5

Summary: I'm developing a persistent Java web application, and I need to make sure that all resources I persist have globally unique identifiers to prevent duplicates.

The Fine Print:

  1. I'm not using an RDBMS, so I don't have any fancy sequence generators (such as the one provided by Oracle)
  2. I'd like it to be fast, preferably all in memory - I'd rather not have to open up a file and increment some value
  3. It needs to be thread safe (I'm anticipating that only one JVM at a time will need to generate IDs)
  4. There needs to be consistency across instantiations of the JVM. If the server shuts down and starts up, the ID generator shouldn't re-generate the same IDs it generated in previous instantiations (or at least the chance has to be really, really slim - I anticipate many millions of presisted resources)
  5. I have seen the examples in the EJB unique ID pattern article. They won't work for me (I'd rather not rely solely on System.currentTimeMillis() because we'll be persisting multiple resources per millisecond).
  6. I have looked at the answers proposed in this question. My concern about them is, what is the chance that I will get a duplicate ID over time? I'm intrigued by the suggestion to use java.util.UUID, but again, the chances of a duplicate need to be infinitesimally small.
  7. I'm using JDK6
+12  A: 

Pretty sure UUIDs are "good enough". There are 340,282,366,920,938,463,463,374,607,431,770,000,000 UUIDs available.

http://www.wilybeagle.com/guid_store/guid_explain.htm

"To put these numbers into perspective, one's annual risk of being hit by a meteorite is estimated to be one chance in 17 billion, that means the probability is about 0.00000000006 (6 × 10−11), equivalent to the odds of creating a few tens of trillions of UUIDs in a year and having one duplicate. In other words, only after generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%. The probability of one duplicate would be about 50% if every person on earth owns 600 million UUIDs"

http://en.wikipedia.org/wiki/Universally_Unique_Identifier

Shawn Miller
Nice reference! So its safe to infer that if I use UUID.randomUUID() in my application, the chance of it generating the same UUID twice is infinitesimally small then...?
Julie
yes, infinitesimally
Shawn Miller
Well, just because there are so many possible values, doesn't necessarily mean that they wrote the algorithm well enough to get a good random distribution. Then again, the designers of the UUID class have probably put a lot more thought into it than I could in an afternoon!
Julie
yep, quite a bit of thought went into it... "Standardized by the Open Software Foundation (OSF) as part of the Distributed Computing Environment (DCE)"
Shawn Miller
While you're up at night worrying about that duplicate UUID, be sure to keep an eye open for that meteorite... ;)
Michael Burr
+2  A: 

If it needs to be unique per PC: you could probably use (currentTimeMillis() << 4) | (staticCounter++ & 15) or something like that.

That would allow you to generate 16 per ms. If you need more, shift by 5 and and it with 31...

if it needs to be unique across multiple PCs, you should also combine in your primary network card's MAC address.

edit: to clarify

private static int staticCounter=0;
private final int nBits=4;
public long getUnique() {
    return (currentTimeMillis() << nBits) | (staticCounter++ & 2^nBits-1);
}

and change nBits to the square root of the largest number you should need to generate per ms.

It will eventually roll over. Probably 20 years or something with nBits at 4.

Bill K
That's a clever way to go about it. I think I'll trust the UUID class, as @smiller has given me more confidence that it's "unique enough."
Julie
My company uses a system very similar to this for generating our "UUIDs". It works ok (I've never seen a duplicate). It seems really hacky though, and also it allows you to figure out where and when something was created.
rmeador
It is hacky in that I was limiting myself to a long. If you simply used two longs and appended a count to the currentTime in a synchronized method, it won't fail unless your clock changes. If you are worried about that, it would be trivial to fix too.
Bill K
+2  A: 

Java UUID Generator also provides a good quality UUID generator. It can generate UUID's using method 1 (MAC address + time) which is good if you need furthur assurance that UUID's are unique between multiple machines. It includes a JNI library to retrieve the MAC address or a mechanism to specify/generate a random MAC.

For others who may not be using Java 6, it is also backwards compatible with older versions of Java that don't include UUID.

James Schek
Thanks, James - I'll take a look at this. I believe we've also met in real life :-) (see my profile if you don't remember who I am). Hope everything is going well with you!
Julie
Yes, we have! I wonder how many other former Labbie's lurk SO? :-)
James Schek
A: 

From memory the RMI remote packages contain a UUID generator. I don't know whether thats worth looking into.

When I've had to generate them I typically use a MD5 hashsum of the current date time, the user name and the IP address of the computer. Basically the idea is to take everything that you can find out about the computer/person and then generate a MD5 hash of this information.

It works really well and is incredibly fast (once you've initialised the MessageDigest for the first time).

Aidos
A: 
public class UniqueID {
    private static long startTime = System.currentTimeMillis();
    private static long id;

    public static synchronized String getUniqueID() {
        return "id." + startTime + "." + id++;
    }
}
Dave Griffiths
Nice and simple solution, yes. The UUID bit of code will probably be the most called in my application, by many threads at once, so the synchronization overhead/bottleneck may be too much.
Julie