A colleague of mine was wondering why he couldn't just strip the hyphens from the uuid/guid before storing it. We couldn't work out what the hyphens were for...
What is the reasoning behind them? Surely they'd make it less random?
That's just for convenience. A GUID consists of 16 bytes, which makes 32 characters in its hex text representation. Without hyphens, GUIDs are harder for humans to read and harder to recognize as GUIDs rather than as arbitrary 16-byte numbers.
Hyphens denote the byte structure of a Guid.
typedef struct _GUID
{
    DWORD Data1;
    WORD  Data2;
    WORD  Data3;
    BYTE  Data4[8];
} GUID;
For:
(XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX)
You can probably strip them before saving. At least in .NET the constructor of the Guid type will initialize a Guid variable from its string representation regardless of whether the hyphens are still there or removed.
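A minimal sketch of that behaviour, assuming a recent .NET with top-level statements. The GUID value is just the sample that appears elsewhere in this thread, and the variable names are made up for illustration:

```csharp
using System;

// The same GUID written with hyphens ("D" format) and without ("N" format).
var withHyphens = new Guid("63be6f7e-4e56-4f05-8022-9f958f492077");
var withoutHyphens = new Guid("63be6f7e4e564f0580229f958f492077");

// Both strings parse to the same 128-bit value.
Console.WriteLine(withHyphens == withoutHyphens); // True

// Guid.ParseExact is stricter: it accepts only the format you name.
var strict = Guid.ParseExact("63be6f7e4e564f0580229f958f492077", "N");
Console.WriteLine(strict == withHyphens); // True
```

If you control what gets stored, ParseExact with "N" may be the better fit, since it rejects input that is not in the stripped form you expect.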
If you want to store a guid somewhere, then store it as an array of 16 bytes, not as its textual representation. You will save a lot of space, and the question of hyphens will not arise.
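As a rough illustration of the byte-array approach in C# (a sketch only; how the bytes reach your database or file is up to you, and the sample value is borrowed from elsewhere in this thread):

```csharp
using System;

var guid = new Guid("63be6f7e-4e56-4f05-8022-9f958f492077");

// 16 raw bytes instead of 32 or 36 characters of text.
byte[] raw = guid.ToByteArray();
Console.WriteLine(raw.Length); // 16

// Round trip: the byte[] constructor restores the same value.
var restored = new Guid(raw);
Console.WriteLine(restored == guid); // True

// ToByteArray() writes the first three fields in little-endian order,
// so the bytes do not follow the order of the hex digits in the string.
Console.WriteLine(BitConverter.ToString(raw));
// 7E-6F-BE-63-56-4E-05-4F-80-22-9F-95-8F-49-20-77
```

One thing to watch: because of that byte order, another platform reading the same 16 bytes may display a different hyphenated string unless it follows the same layout.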
The hyphens are used to separate each number
E93416C5-9377-4A1D-8390-7E57D439C9E7
Hex digits   Description
8            Data1
4            Data2
4            Data3
4            Initial two bytes from Data4
12           Remaining six bytes from Data4
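The grouping in the table can be checked with a short C# sketch: take the 32 hex digits, cut them at the widths listed, and the hyphenated form comes back. The variable names here are made up for illustration:

```csharp
using System;

var guid = new Guid("E93416C5-9377-4A1D-8390-7E57D439C9E7");
string digits = guid.ToString("N"); // 32 hex digits, no hyphens

// Group widths in hex digits: Data1, Data2, Data3, then Data4 split as 2 + 6 bytes.
int[] widths = { 8, 4, 4, 4, 12 };
var groups = new string[widths.Length];
int offset = 0;
for (int i = 0; i < widths.Length; i++)
{
    groups[i] = digits.Substring(offset, widths[i]);
    offset += widths[i];
}

// Re-inserting the hyphens reproduces the default "D" format.
string rebuilt = string.Join("-", groups);
Console.WriteLine(rebuilt);                        // e93416c5-9377-4a1d-8390-7e57d439c9e7
Console.WriteLine(rebuilt == guid.ToString("D"));  // True
```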
The GUID is really just a number. The hyphens show you how the various components are broken down but aren't really part of the number. It's like an IP address - you can store a 32-bit number, or you can store a string with dots in it, they are equivalent.
The hyphens have absolutely no effect on the uniqueness or randomness of the value. They are merely a holdover from the definition of a GUID and visually separate the distinct parts of data that make up the GUID.
You can get your guid in various formats.
Assuming you're using C#:
Guid guid = Guid.NewGuid();
Console.WriteLine(guid.ToString("N")); // 63be6f7e4e564f0580229f958f492077
Console.WriteLine(guid.ToString("D")); // 63be6f7e-4e56-4f05-8022-9f958f492077
Console.WriteLine(guid.ToString("B")); // {63be6f7e-4e56-4f05-8022-9f958f492077}
Console.WriteLine(guid.ToString("P")); // (63be6f7e-4e56-4f05-8022-9f958f492077)
In the original incarnation of the UUID specification each of the data elements had a meaning:
time_low - time_mid - time_hi_and_version - clock_seq_hi_and_reserved + clock_seq_low - node (MAC address)
(the two clock_seq fields share the fourth hyphen-separated group)
These elements were originally meant to provide temporal and spatial uniqueness. In later versions of the UUID spec, for various reasons (security, privacy), these data elements no longer have any specific meaning, except for the version bits and the reserved bits.
Version 3 UUIDs are derived from an MD5 hash of a URI or other distinguished name, version 4 UUIDs are generated from random data, and version 5 UUIDs are derived from a SHA-1 hash.
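The version bits are easy to see in the text form: they are the first hex digit of the third group. A small C# sketch, where GetVersion is a hypothetical helper written for this example and not part of .NET:

```csharp
using System;

// Hypothetical helper: the version is the first hex digit of the third group,
// which sits at index 14 of the hyphenated "D" string.
int GetVersion(Guid g) => Convert.ToInt32(g.ToString("D").Substring(14, 1), 16);

Console.WriteLine(GetVersion(new Guid("3F2504E0-4F89-11D3-9A0C-0305E82C3301"))); // 1 (time-based)
Console.WriteLine(GetVersion(new Guid("63be6f7e-4e56-4f05-8022-9f958f492077"))); // 4 (random)
```

Working from the string rather than from ToByteArray() sidesteps the byte-order question, at the cost of an extra allocation.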
So these hyphens are part of the historical data format of the original UUID spec and are not needed to provide entropy in any of the versions. In fact, UUIDs are sometimes stored as a Base64 or Ascii85 encoded string to save space (though binary storage is the most space efficient):
Ascii: 3F2504E0-4F89-11D3-9A0C-0305E82C3301
Base64: 7QDBkvCA1+B9K/U0vrQx1A
Ascii85: 5:$Hj:Pf\4RLB9%kU\Lj
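For the Base64 variant, here is a sketch in C# under the assumption that you encode the output of Guid.ToByteArray(). Because .NET orders the first three fields little-endian, the exact characters it produces may not match what other tools give for the same UUID, so the sketch checks only the length and the round trip:

```csharp
using System;

var guid = new Guid("3F2504E0-4F89-11D3-9A0C-0305E82C3301");

// 16 bytes always encode to 24 Base64 characters ending in "==".
string base64 = Convert.ToBase64String(guid.ToByteArray());
string compact = base64.TrimEnd('=');

Console.WriteLine(compact.Length); // 22, versus 36 with hyphens or 32 without

// Decoding: restore the padding, then rebuild the Guid from the bytes.
var restored = new Guid(Convert.FromBase64String(compact + "=="));
Console.WriteLine(restored == guid); // True
```

Standard Base64 can contain '+' and '/', so a URL-safe alphabet may be preferable if the value ends up in a URL or file name.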