In our application we create XML files with an attribute that holds a Guid value. This value needs to stay consistent between file upgrades: even if everything else in the file changes, the Guid value for the attribute should remain the same.

One obvious solution is a static dictionary mapping each filename to the Guid to use for it. Whenever we generate a file, we look up its filename in the dictionary and use the corresponding Guid. But this is not feasible, because we might scale to hundreds of files and don't want to maintain a big list of Guids.

Another approach is to derive the Guid from the path of the file. Since our file paths and application directory structure are unique, the Guid should be unique for that path, and each time we run an upgrade the file gets the same Guid. I found one cool way to generate such 'Deterministic Guids' (thanks, Elton Stoneman). It basically does this:

    // Requires: using System; using System.Security.Cryptography; using System.Text;
    private Guid GetDeterministicGuid(string input)
    {
        // Use an MD5 hash to get a 16-byte hash of the string.
        using (MD5 md5 = MD5.Create())
        {
            // A fixed encoding (UTF-8) is used here because Encoding.Default
            // depends on the machine's code page and can change the result.
            byte[] inputBytes = Encoding.UTF8.GetBytes(input);
            byte[] hashBytes = md5.ComputeHash(inputBytes);

            // Generate a Guid from the 16 hash bytes.
            return new Guid(hashBytes);
        }
    }

So given a string, the Guid will always be the same.
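
As a quick cross-check of that claim, here is a minimal Python sketch of the same idea. Python is used only for illustration; `get_deterministic_guid` and the sample path are hypothetical names, not part of our application. It also shows one caveat: the result depends on how the string is turned into bytes, so the encoding has to be fixed.

```python
import hashlib
import uuid


def get_deterministic_guid(text: str, encoding: str = "utf-8") -> uuid.UUID:
    """Hash the text with MD5 and reinterpret the 16-byte digest as a GUID."""
    digest = hashlib.md5(text.encode(encoding)).digest()
    # bytes_le mirrors the byte layout used by .NET's Guid(byte[]) constructor.
    return uuid.UUID(bytes_le=digest)


path = "config/café/settings.xml"  # hypothetical path with a non-ASCII character

# Same string, same encoding: always the same GUID.
assert get_deterministic_guid(path) == get_deterministic_guid(path)

# Same string, different encoding: different bytes, so a different GUID.
assert get_deterministic_guid(path, "utf-8") != get_deterministic_guid(path, "latin-1")
```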

Are there any other approaches or recommended ways of doing this? What are the pros and cons of this method?

+6  A: 

MD5 is weak; I believe you can do the same thing with SHA-1 and get better results.

BTW, just a personal opinion: dressing an MD5 hash up as a GUID does not make it a good GUID. GUIDs by their very nature are non-deterministic, so this feels like a cheat. Why not call a spade a spade and say it's a string rendering of a hash of the input? You could do that by using this line rather than the new Guid line:

    string stringHash = BitConverter.ToString(hashBytes);
ryber
Thanks for your input, but this still gives me a string, and I am looking for a GUID...
desigeek
Ok, call your hash a "GUID", problem solved. Or is the real problem that you *need* a `Guid` object?
sixlettervariables
I wish it were that simple :) but yes, I need a 'GUID' object.
desigeek
A: 

I would mark @ryber's response as an answer, but that method does not give me a GUID, it gives me a string. I am looking to generate a GUID, even though it 'feels like a cheat' and the closest I have come to doing that is via the method I posted in my question.

desigeek
+2  A: 

You need to make a distinction between instances of the class Guid and identifiers that are globally unique. A "deterministic guid" is actually a hash (as evidenced by your call to provider.ComputeHash). Hashes have a much higher chance of collisions (two different strings happening to produce the same hash) than Guids created via Guid.NewGuid.

So the problem with your approach is that you will have to be OK with the possibility that two different paths will produce the same GUID. If you need an identifier that's unique for any given path string, then the easiest thing to do is just use the string. If you need the string to be obscured from your users, encrypt it - you can use ROT13 or something more powerful...
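
To put a rough number on that possibility, here is a minimal Python sketch using the standard birthday approximation. It assumes the 128-bit hash output behaves like a uniformly random value and that nobody is deliberately crafting colliding inputs (MD5 does not resist that); `collision_probability` is a hypothetical helper name.

```python
import math


def collision_probability(n_items: int, bits: int = 128) -> float:
    """Birthday approximation: p = 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = -n_items * (n_items - 1) / 2 ** (bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


# Estimated chance that any two of 1,000 distinct paths share a 128-bit hash.
print(collision_probability(1_000))
```

Whether that estimate is acceptable is a judgment call for the application; the approximation says nothing about deliberately constructed collisions.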

Attempting to shoehorn something that isn't a pure GUID into the GUID datatype could lead to maintenance problems in future...

Rob Fonseca-Ensor
+1  A: 

As Rob mentions, your method doesn't generate a UUID, it generates a hash that looks like a UUID.

RFC 4122 on UUIDs specifically allows for deterministic (name-based) UUIDs: versions 3 and 5 use MD5 and SHA-1 respectively. Most people are probably familiar with version 4, which is random. Wikipedia gives a good overview of the versions. (Note that the word 'version' here really describes a 'type' of UUID; version 5 doesn't supersede version 4.)

There seem to be a few libraries out there for generating version 3/5 UUIDs, including the Python uuid module, boost.uuid (C++) and OSSP UUID. (I haven't looked for any .NET ones.)
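
For illustration, a minimal sketch with the Python uuid module. The namespace URL, the `guid_for_path` helper and the path normalization are assumptions made for this example, not something from the question; any fixed namespace UUID works as long as it never changes.

```python
import uuid

# Hypothetical application namespace, derived once from a made-up URL.
APP_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/my-app")


def guid_for_path(relative_path: str) -> uuid.UUID:
    """Return a version 5 (SHA-1, name-based) UUID for a file path."""
    # Assumed normalization so that equivalent spellings of a path agree.
    normalized = relative_path.replace("\\", "/").lower()
    return uuid.uuid5(APP_NAMESPACE, normalized)


first = guid_for_path("Config\\Settings.xml")
second = guid_for_path("config/settings.xml")
assert first == second      # same path, same UUID on every run
assert first.version == 5   # the version is recorded in the UUID itself

# uuid.uuid3 has the same signature and uses MD5 instead of SHA-1.
legacy = uuid.uuid3(APP_NAMESPACE, "config/settings.xml")
assert legacy.version == 3
```

Unlike wrapping a raw hash in a GUID, the version and variant bits are set here, so other tools can tell that the value is a name-based UUID.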

bacar