Currently I'm using the following class as my key for a Dictionary collection of objects that are unique by ColumnID and a nullable SubGroupID:

public class ColumnDataKey
{
    public int ColumnID { get; private set; }
    public int? SubGroupID { get; private set; }

    // ...

    public override int GetHashCode()
    {
        var hashKey = this.ColumnID + "_" + 
            (this.SubGroupID.HasValue ? this.SubGroupID.Value.ToString() : "NULL");
        return hashKey.GetHashCode();
    }
}

I was thinking of somehow combining these into a 64-bit integer, but I'm not sure how to deal with null SubGroupIDs. This is as far as I got, but it isn't valid because SubGroupID can be zero:

// Parentheses are required: + binds tighter than << in C#.
var hashKey = ((long)this.ColumnID << 32) + 
    (this.SubGroupID.HasValue ? this.SubGroupID.Value : 0);
return hashKey.GetHashCode();

Any ideas?

+1  A: 

Strictly speaking you won't be able to combine these perfectly because logically int? has 33 bits of information (32 bits for the integer and a further bit indicating whether the value is present or not). Your non-nullable int has a further 32 bits of information making 65 bits in total but a long only has 64 bits.

If you can safely restrict the value range of either of the ints to only 31 bits then you'd be able to pack them roughly as you're already doing. However, you won't get any advantage from doing it that way - you might as well just calculate the hash code directly like this (with thanks to ReSharper's boilerplate code generation):

public override int GetHashCode()
{
    unchecked
    {
        return (ColumnID*397) ^ (SubGroupID.HasValue ? SubGroupID.Value : 0);
    }
}
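
For reference, the 31-bit packing mentioned above could look roughly like the sketch below. It assumes SubGroupID is never negative, which frees its sign bit to act as the "null" flag. `KeyPacking` and `Pack` are hypothetical names, not part of the original class, and the packed value gives no benefit for GetHashCode itself because it still has to be reduced to 32 bits.

```csharp
using System;

public static class KeyPacking
{
    // Hypothetical helper: packs ColumnID and a nullable SubGroupID into one long.
    // Assumption: SubGroupID is never negative, so bit 31 of the low word is unused.
    public static long Pack(int columnId, int? subGroupId)
    {
        if (subGroupId.HasValue && subGroupId.Value < 0)
            throw new ArgumentOutOfRangeException(nameof(subGroupId));

        // Low 32 bits: the value itself, or the otherwise unused sign bit for null.
        uint low = subGroupId.HasValue ? (uint)subGroupId.Value : 0x80000000u;

        // High 32 bits: ColumnID, widened to long before the shift.
        return ((long)columnId << 32) | low;
    }
}
```

With this layout `Pack(1, 0)` and `Pack(1, null)` differ, because the null case sets bit 31 of the low word while a SubGroupID of zero leaves it clear.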
Daniel Renshaw
I see, good answer! I'm looking into the range of what `SubGroupID` could be - it looks initially like it can never be below zero so I hope this means I can strip it down a bit and use the final bit to mean `null`?
Codesleuth
This works like a charm! I changed the zero to -11,111,111 as Matt suggested, but I've used basically what you've answered. Thank you!
Codesleuth
+1  A: 

You seem to be thinking of GetHashCode as a unique key. It isn't. Hash codes are 32-bit integers, and are not meant to be unique, only well distributed across the 32-bit space to minimise the probability of collision; the dictionary uses Equals to tell apart keys whose hash codes collide. Try this for your GetHashCode method of ColumnDataKey:

(ColumnID * 397) ^ (SubGroupID.HasValue ? SubGroupID.Value : -11111111)

The magic numbers here are 397, a prime number, which for reasons of voodoo magic is a good number to multiply by to mix up your bits (and is the number the ReSharper team picked), and -11111111, a SubGroup ID which I assume is unlikely to arise in practice.
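
To make the point about collisions concrete, here is a minimal sketch of the whole key class, assuming a two-argument constructor and an Equals implementation (the question elides both behind `// ...`, so they are guesses). `ColumnDataKeyDemo` is a hypothetical name used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public sealed class ColumnDataKey : IEquatable<ColumnDataKey>
{
    public int ColumnID { get; private set; }
    public int? SubGroupID { get; private set; }

    // Assumed constructor; the original class does not show one.
    public ColumnDataKey(int columnID, int? subGroupID)
    {
        ColumnID = columnID;
        SubGroupID = subGroupID;
    }

    // Equality decides whether two keys are the same; null and 0 stay distinct.
    public bool Equals(ColumnDataKey other)
    {
        return other != null
            && ColumnID == other.ColumnID
            && SubGroupID == other.SubGroupID;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as ColumnDataKey);
    }

    // The hash only needs to spread keys out; collisions are allowed.
    public override int GetHashCode()
    {
        unchecked
        {
            return (ColumnID * 397) ^ (SubGroupID ?? -11111111);
        }
    }
}

public static class ColumnDataKeyDemo
{
    public static int CountDistinctKeys()
    {
        var map = new Dictionary<ColumnDataKey, string>();
        map[new ColumnDataKey(1, null)] = "no subgroup";
        map[new ColumnDataKey(1, 0)] = "subgroup 0";
        map[new ColumnDataKey(1, -11111111)] = "sentinel value";
        return map.Count; // 3: equal hash codes do not make keys equal
    }
}
```

The keys `(1, null)` and `(1, -11111111)` produce the same hash code here, yet the dictionary still stores them separately because Equals reports them as different.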

Matt Howells
It seems not even -1 will come up in practice. I will try combining your answer with Daniel's.
Codesleuth