views:

1218

answers:

5

I know the standard GUID IDs. Can they be made shorter? What is the theory behind them?

+2  A: 

They are exactly 16 bytes.

Technically speaking, the effect of shortening them will vary with the algorithm used to generate them. Since the API you used (probably) doesn't guarantee a particular version or implementation, it's a bad idea to shorten them. Even if it did, it would still be a bad idea. If you need fewer than 16 bytes of entropy, you probably should not be using a GUID.
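As a small illustration, here is a sketch using Python's standard `uuid` module as a stand-in for whatever API the question used (an assumption; the question does not name one). It checks the 16-byte size and reads the version field, which identifies the generation algorithm behind a given value. The helper name `describe` is made up for this example.

```python
import uuid

def describe(u: uuid.UUID):
    """Return (size in bytes, version number) for a UUID."""
    return len(u.bytes), u.version

random_based = uuid.uuid4()  # version 4: built from random bits
time_based = uuid.uuid1()    # version 1: built from a timestamp and a node id

print(describe(random_based))  # (16, 4)
print(describe(time_based))    # (16, 1)
```

Both values are the same size, but the bytes mean very different things, which is why the effect of dropping some of them depends on the version.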

For more information: http://en.wikipedia.org/wiki/Globally_Unique_Identifier

Greg Dean
+6  A: 

Greg Dean's answer is correct, but to understand how a GUID is generated and why it ought not to be shortened, I highly suggest reading the article below.

The Old New Thing : GUIDs are globally unique, but substrings of GUIDs aren't:

A customer needed to generate an 8-byte unique value, and their initial idea was to generate a GUID and throw away the second half, keeping the first eight bytes. They wanted to know if this was a good idea.

No, it's not a good idea.

The GUID generation algorithm relies on the fact that it has all 16 bytes to use to establish uniqueness, and if you throw away half of it, you lose the uniqueness.
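To make the point concrete, here is a minimal sketch in Python. It assumes version 1 (timestamp plus node) GUIDs in the RFC 4122 field layout; the helper `make_v1` and all field values are hypothetical and chosen only for illustration. Two machines that generate an ID in the same clock tick get different GUIDs, yet identical first halves.

```python
import uuid

def make_v1(timestamp: int, clock_seq: int, node: int) -> uuid.UUID:
    """Build a version 1 UUID from explicit field values (RFC 4122 layout)."""
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | 0x1000   # set version 1
    clock_seq_low = clock_seq & 0xFF
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80   # RFC 4122 variant
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))

# Hypothetical values: two machines generating an ID in the same 100 ns tick.
same_tick = 0x1EC9414C232AB00
machine_a = make_v1(same_tick, clock_seq=0x1234, node=0x001122334455)
machine_b = make_v1(same_tick, clock_seq=0x1234, node=0x66778899AABB)

print(machine_a != machine_b)                      # True: full GUIDs differ
print(machine_a.bytes[:8] == machine_b.bytes[:8])  # True: first halves collide
```

In this layout the first eight bytes hold only the timestamp and version, so the part that distinguishes the two machines is exactly the part that was thrown away. For random (version 4) GUIDs the failure mode is different: truncation simply leaves fewer random bits.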

Andrew Hare
XORing the two halves might actually be OK (no worse than randomly generating the 8 bytes) if all one needed was, say, fewer than billions of such "statistically unique" values (by the sqrt-N rule of thumb that I mention in my answer).
Alex Martelli
Downvoters: Reason for the downvote?
Andrew Hare
+1 I'm not sure why this got downvotes; this is very true (I've actually experienced it before).
Zifre
Apparently all responses to this question are getting downvotes -- somebody must really dislike GUIDs, or something;-).
Alex Martelli
+2  A: 

The shorter "allegedly globally unique" IDs are, the higher the chance of a collision when many of them are more-or-less-randomly generated -- and, that chance's probably higher than you'd think, due to the "birthday paradox"... see http://betterexplained.com/articles/understanding-the-birthday-paradox/ . As a (very approximate but useful) rule of thumb, the chance is non-negligible if (among N possible UIDs) you assign sqrt(N) or so. A 128-bit ID is therefore pretty safe from accidental collision, even for many billions of IDs; but if you were to shorten it to, say, 32 bits, you'd have substantial risk of collisions even for just a few tens of thousands of IDs.
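The rule of thumb can be checked numerically. The sketch below is in Python and assumes IDs are drawn uniformly at random; it uses the usual birthday approximation p ≈ 1 - exp(-n(n-1)/(2N)). The function name `collision_probability` is made up for this example.

```python
import math

def collision_probability(id_bits: int, count: int) -> float:
    """Approximate chance that at least two of `count` uniformly random
    IDs of `id_bits` bits coincide (birthday approximation)."""
    space = 2 ** id_bits
    exponent = -count * (count - 1) / (2 * space)
    return -math.expm1(exponent)  # 1 - e^exponent, accurate for tiny values

for bits, count in [(32, 10_000), (32, 65_536), (128, 10_000_000_000)]:
    p = collision_probability(bits, count)
    print(f"{bits:3d} bits, {count:>14,} IDs: {p:.3g}")
```

With 32 bits the estimate is already about 1% at 10,000 IDs and about 39% at 65,536 IDs (which is sqrt(2^32)), while with 128 bits it stays on the order of 1e-19 even for ten billion IDs.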

Alex Martelli
A: 

Put very simply, GUIDs are guaranteed to be unique because they act like coordinates.

Traditionally*, one half was specific to the machine (by using the MAC address) and one half was derived from the time.

Because MACs are unique between machines and each machine can execute one instruction at a time (traditionally remember!) the GUID will definitely be unique.

This means however, that if you ditch any part of a GUID, you lose the guarantee of uniqueness. Mr. Martelli gives a good explanation of why this is more of a problem than you might assume.

*I say traditionally, but I've never read an article that indicates a major change. I don't think actual MACs are used nowadays (for security reasons I guess), but I think they're still MAC-derived or at least machine specific.

Tom Wright
Reason for downvote? I'm all ears to constructive criticism...
Tom Wright
Only V1 GUIDs are coordinates (MAC+Time), and even those aren't "guaranteed to be unique".
Greg Dean
Not that I downvoted but, also, "each machine can execute one instruction at a time" isn't a reason why GUIDs may or may not be unique.
ChrisW
http://en.wikipedia.org/wiki/Universally_Unique_Identifier#Version_4_.28random.29
Greg Dean
Thanks guys, really ought to have checked wikipedia I guess.
Tom Wright
+2  A: 

It really comes down to how big the "G" is in your application.

"GUID" stands for Globally Unique Identifier. Typical modern "generic" GUIDs are designed for any application, and their "G", their "Global", is literally that. Global. Worldwide. Across applications, nations, geography, everything. 16 bytes is a LOT of information.

Now, if IN YOUR APPLICATION your "G" isn't that big, if you have no expectation or intention of "G" being Global in a world sense, rather than simply "global" in the application-space sense, then you can readily reduce the size to fit the scope of your application.

Got four divisions in your company, never going to have more? 2 bits -- 0, 1, 2, 3 is a big enough "GUID" for this task. Clearly, this is a contrived application.
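The arithmetic behind that example, as a short Python sketch. It assumes the IDs are handed out sequentially by a single authority rather than generated at random (randomly generated IDs need far more headroom to avoid collisions); the name `bits_needed` is made up for this example.

```python
def bits_needed(distinct_values: int) -> int:
    """Smallest number of bits that can label `distinct_values` items
    when IDs are assigned sequentially (0, 1, 2, ...)."""
    if distinct_values < 1:
        raise ValueError("need at least one value")
    # The largest ID handed out is distinct_values - 1.
    return max(1, (distinct_values - 1).bit_length())

print(bits_needed(4))  # 2: the four divisions fit in IDs 0..3
print(bits_needed(5))  # 3: a fifth division would need another bit
```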

We learned the consequences of "limiting bits" when we slaved through the Y2K problems. So "bits are cheap" is a valid enough reason NOT to limit your GUID size, and to err on the side of "too many bits for now". But, truth be told, many applications simply ARE limited. Many applications generate a lot of data or are bandwidth constrained, to the point where there is no need for a 16-byte GUID and using one hurts performance and resources.

So, understand the concept of the GUID, and how it applies to your applications. Then you can make it any size necessary.

Will Hartung