views:

423

answers:

2

I have a list of n GUIDs and I need to hash them into a single value. This value could be the size of a Guid object or the size of an Int32, it doesn't really matter, but it does need to be statistically unique (say with a probably similar to MD5).

So one approach could be to sort them, concatenate the bytes and take an MD5 hash of all the bytes... but this isn't very quick.

Another idea: I notice that it is fairly standard practice in .NET to implement the GetHashCode method of a composing object as the XOR of the hash codes of the composed objects. Therefore could it be mathematically sensible to XOR my list of GUIDs?

Any ideas welcome!

+1  A: 

If you want the hash to be valid for the set (i.e. order doesn't matter) then XORing the hashcode of each GUID is a good choice.

If you've actually got a sequence of GUIDs and the order matters then I'd suggest using the same approach I wrote about in another answer - repeatedly add/multiply.

(Note that XORing the hashcodes probably won't get you the same answer as XORing the GUIDs themselves and then hashing the result. It may be, but that depends on the implementation of GUID.GetHashCode(). I'd hash each value and XOR the results together - aside from anything else, that's trivial to implement.)

Jon Skeet
Reflector says the implementation of Guid.GetHashCode() is: return ((this._a ^ ((this._b << 0x10) | ((ushort) this._c))) ^ ((this._f << 0x18) | this._k));Interesting that it doesn't use all of the available information. I guess therefore XORing the GUIDs would produce a more unique result.
James Thurley
A: 

Don't XOR the GUIDs and then hash the result. You gain nothing this way over simply XORing the GUIDs, unless you use a hash smaller than a GUID.

Since you seem to really care about performance for this, a little more information would be useful -- in particular, are you using different combinations of the GUIDs you have in memory (so you could hash them only once as they're created), or are you loading them in and processing them, and repeated GUIDs are unlikely?

Peter Crabtree