views:

50

answers:

2

I was playing around with Dictionary and stumbled across the below scenario

public class MyObject
{
    public string I { get; set; }
    public string J { get; set; }
    public string K { get; set; }

    public override int GetHashCode()
    {
        int hashCode = (I+J+K).GetHashCode();
        Debugger.Log(9, "INFO", hashCode.ToString() + System.Environment.NewLine);
        return hashCode;
    }
}
class Program
{
    static void Main(string[] args)
    {
        MyObject obj1 = new MyObject() { I = "Hello", J = "World" };
        MyObject obj2 = new MyObject() { I = "Hello", J = "World" };

        Dictionary<MyObject, string> collection = new Dictionary<MyObject, string>();
        collection.Add(obj1, "1");
        var result = collection[obj2]; // KeyNotFound exception here.
    }
}

I have MyObject class which acts as a key to dictionary and I override the GetHashCode method to return hash code based on the values stored in the class.

So, when the above code is executed, both obj1 and obj2 returns same hash code, but still the dictionary throws KeyNotFound exception.

Any reason why such a behavior?

+2  A: 

You need to override Object.Equals.

Dictionary<TKey, TValue> and other hash-based collections view hash-equality as a necessary but insufficient condition for full-equality because of the possibility of hash-collisions. In your sample, the key-getter finds the right hash-bucket to search in and even considers obj1 as a candidate for full equality, but because the default implementation of Equals is based on reference-equality, it is rejected.

Ideally, implement IEquatable<T> on your class:

public class MyObject : IEquatable<MyObject>
{
    public string I { get; set; }
    public string J { get; set; }
    public string K { get; set; }

    public override int GetHashCode()
    {
        // you might want to consider a better hash-function here.
        return (I + J + K).GetHashCode();
    }

    public override bool Equals(object obj)
    {
        return base.Equals(obj as MyObject);
    }

    public bool Equals(MyObject other)
    {
        return other != null && other.I == I && other.J == J && other.K == K;
    }
}

Also bear in mind that the hash of a key-object must not change as long as it is present in the dictionary.

Ani
+4  A: 

In .NET, GetHashCode is used in concert with the Equals method to determine object equality with regard to storage in collections.

Note that a hash table is more complex than simply mapping a key to a single slot via a hash code. Due to the nature of hash codes, collisions can occur and do occur in practice (though with a good hash function this should not be very often). Thus most hash table implementations have to deal with the case of two different objects generating the same hash code and this is often achieved with a linked list at each "slot" in the hash table. The hash code is used to determine the slot and the Equals method is used to determine whereabouts the object is stored in the linked list (in most "standard" implementations of a hash table).

A word of warning, however: there are very few good reasons to override the built-in behaviour of GetHashCode. I found this interesting SO thread discussing GetHashCode and Equals that should be worth a read: http://stackoverflow.com/questions/371328/why-is-it-important-to-override-gethashcode-when-equals-method-is-overriden-in-c. It discusses the merit/demerits of changing the behaviour, the properties of good and bad hash functions, the required properties of these two methods and other goodies.

Richard Cook
+1 if the hashcodes are equal then the `Equals` method is called. The hashcodes really just prefilter because the comparison is much quicker than `Equals`.
Kirk Broadhurst
Makes sense. As the dictionary and hash table return one and only one value for a given key and throws an exception if a key already exist, it will not have linked list at each slot in the hash table.
Ramesh
@kirk - Thanks for that info. Very informative.
Ramesh