Given the following class

public class Foo
{
    public int FooId { get; set; }
    public string FooName { get; set; }

    public override bool Equals(object obj)
    {
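        Foo fooItem = obj as Foo;

        // "as" yields null when obj is null or not a Foo, so guard before dereferencing.
        return fooItem != null && fooItem.FooId == this.FooId;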
    }

    public override int GetHashCode()
    {
        // Which is preferred?

        return base.GetHashCode();

        //return this.FooId.GetHashCode();
    }
}

I have overridden the Equals method because Foo represents a row in the Foos table. Which is the preferred way to override GetHashCode?

Why is it important to override GetHashCode?

+70  A: 

Yes, it is important if your item will be used as a key in a dictionary, or HashSet<T>, etc - since this is used (in the absence of a custom IEqualityComparer<T>) to group items into buckets. If the hash-code for two items does not match, they may never be considered equal (Equals will simply never be called).

The GetHashCode() method should reflect the Equals logic; the rules are:

  • if two things are equal (Equals(...) == true) then they must return the same value for GetHashCode()
  • if two things return the same GetHashCode() value, they are not necessarily equal; this is a collision, and Equals will be called to see if it is a real equality or not.
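
To make the two rules concrete, here is a minimal sketch. GoodFoo, BadFoo and HashContractDemo are hypothetical names, not from the question, and BadFoo hands out a different hash code per instance purely as a deterministic stand-in for the identity-based default you get from base.GetHashCode().

```csharp
using System.Collections.Generic;

// Hash code derived from the same field that Equals compares.
public sealed class GoodFoo
{
    public int FooId { get; }
    public GoodFoo(int fooId) { FooId = fooId; }

    public override bool Equals(object obj)
    {
        return obj is GoodFoo other && other.FooId == FooId;
    }

    public override int GetHashCode()
    {
        return FooId;
    }
}

// Same Equals logic, but every instance reports a different hash code.
// The counter is an assumption for illustration: it imitates an
// identity-based hash while keeping the outcome deterministic.
public sealed class BadFoo
{
    private static int nextHash;
    private readonly int hash = nextHash++;

    public int FooId { get; }
    public BadFoo(int fooId) { FooId = fooId; }

    public override bool Equals(object obj)
    {
        return obj is BadFoo other && other.FooId == FooId;
    }

    public override int GetHashCode()
    {
        return hash;
    }
}

public static class HashContractDemo
{
    // Equal items share a hash code, so the set finds the item.
    public static bool GoodLookupWorks()
    {
        var set = new HashSet<GoodFoo> { new GoodFoo(1) };
        return set.Contains(new GoodFoo(1));
    }

    // Equal items have different hash codes, so Equals is never reached.
    public static bool BadLookupWorks()
    {
        var set = new HashSet<BadFoo> { new BadFoo(1) };
        return set.Contains(new BadFoo(1));
    }
}
```

GoodLookupWorks() returns true and BadLookupWorks() returns false, even though both classes use the same Equals logic.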

In this case, it looks like "return FooId;" is a suitable GetHashCode() implementation. If you are testing multiple properties, it is common to combine them using code like below, to reduce diagonal collisions (i.e. so that new Foo(3,5) has a different hash-code to new Foo(5,3)):

int hash = 13;
hash = (hash * 7) + field1.GetHashCode();
hash = (hash * 7) + field2.GetHashCode();
...
return hash;
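
As a complete sketch of the same pattern, assuming a hypothetical two-field class called Pair (not from the question), with the arithmetic wrapped in unchecked so that overflow wraps around rather than throwing in a checked build:

```csharp
public sealed class Pair
{
    public int First { get; }
    public int Second { get; }

    public Pair(int first, int second)
    {
        First = first;
        Second = second;
    }

    public override bool Equals(object obj)
    {
        return obj is Pair other && other.First == First && other.Second == Second;
    }

    public override int GetHashCode()
    {
        unchecked // overflow is fine here; we only need a well-spread int
        {
            int hash = 13;                            // seed
            hash = (hash * 7) + First.GetHashCode();  // order-sensitive mix
            hash = (hash * 7) + Second.GetHashCode();
            return hash;
        }
    }
}
```

With these numbers new Pair(3, 5) hashes to 663 and new Pair(5, 3) to 675, whereas simply adding the two fields would give 8 for both.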

Oh - for convenience, you might also consider providing == and != operators when overriding Equals and GetHashCode.
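
A minimal sketch of what those operators might look like, using a hypothetical class Bar; the null handling shown is one common approach, not the only one:

```csharp
public sealed class Bar
{
    public int BarId { get; }
    public Bar(int barId) { BarId = barId; }

    public override bool Equals(object obj)
    {
        return obj is Bar other && other.BarId == BarId;
    }

    public override int GetHashCode()
    {
        return BarId;
    }

    public static bool operator ==(Bar left, Bar right)
    {
        // ReferenceEquals avoids calling this operator recursively on null checks.
        if (ReferenceEquals(left, null)) return ReferenceEquals(right, null);
        return left.Equals(right);
    }

    public static bool operator !=(Bar left, Bar right)
    {
        return !(left == right);
    }
}
```

C# requires == and != to be defined as a pair, and delegating both to Equals keeps the three in agreement.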


A demonstration of what happens when you get this wrong is here.

Marc Gravell
Can I ask why you are multiplying by such factors?
Leandro López
Actually, I could probably lose one of them; the point is to try to minimise the number of collisions - so that an object {1,0,0} has a different hash to {0,1,0} and {0,0,1} (if you see what I mean).
Marc Gravell
I tweaked the numbers to make it clearer (and added a seed). Some code uses different numbers - for example the C# compiler (for anonymous types) uses a seed of 0x51ed270b and a factor of -1521134295.
Marc Gravell
Thanks, now it's clearer.
Leandro López
BTW, congrats for your recent clearing of 100k!
RCIX
@Leandro López: Usually the factors are chosen to be prime numbers because it makes the number of collisions smaller.
Andrei Rinea
+2  A: 

It is because the framework requires that two objects that are considered equal must have the same hash code. If you override the Equals method to do a special comparison of two objects and the two objects are considered the same by that method, then the hash code of the two objects must also be the same. (Dictionaries and Hashtables rely on this principle.)

Kevin
+3  A: 

By overriding Equals you're basically stating that you are the one who knows better how to compare two instances of a given type, so you're likely to be the best candidate to provide the best hash code.

This is an example of how ReSharper writes a GetHashCode() function for you:

    public override int GetHashCode()
    {
        unchecked
        {
            var result = 0;
            result = (result * 397) ^ m_someVar1;
            result = (result * 397) ^ m_someVar2;
            result = (result * 397) ^ m_someVar3;
            result = (result * 397) ^ m_someVar4;
            return result;
        }
    }

As you can see it just tries to guess a good hash code based on all the fields in the class, but since you know your object's domain or value ranges you could still provide a better one.
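
The snippet above only compiles as-is when the fields are ints. Here is a sketch of the same multiply-and-xor pattern with a reference-type field, assuming a hypothetical Person class; it is written in that style by hand and is not claimed to be ReSharper's exact output:

```csharp
public sealed class Person
{
    public string Name { get; }
    public int Age { get; }

    public Person(string name, int age)
    {
        Name = name;
        Age = age;
    }

    public override bool Equals(object obj)
    {
        return obj is Person other && other.Name == Name && other.Age == Age;
    }

    public override int GetHashCode()
    {
        unchecked
        {
            // A null reference contributes 0 instead of throwing.
            int result = Name != null ? Name.GetHashCode() : 0;
            result = (result * 397) ^ Age;
            return result;
        }
    }
}
```

Reference-type fields need the null guard because calling GetHashCode() on a null reference throws a NullReferenceException.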

Trap
+15  A: 

It's actually very hard to implement GetHashCode() correctly because, in addition to the rules Marc already mentioned, the hash code should not change during the lifetime of an object. Therefore the fields which are used to calculate the hash code must be immutable.

I finally found a solution to this problem when I was working with NHibernate. My approach is to calculate the hash code from the ID of the object. The ID can only be set through the constructor, so if you want to change the ID, which is very unlikely, you have to create a new object which has a new ID and therefore a new hash code. This approach works best with GUIDs because you can provide a parameterless constructor which randomly generates an ID.
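
A rough sketch of that approach, next to a counter-example with a mutable key. Entity, MutableKey and LifetimeDemo are hypothetical names used only for illustration; this is not NHibernate code.

```csharp
using System;
using System.Collections.Generic;

public sealed class Entity
{
    public Guid Id { get; }            // set only in the constructor
    public string Name { get; set; }   // mutable, but not part of equality

    public Entity() : this(Guid.NewGuid()) { }
    public Entity(Guid id) { Id = id; }

    public override bool Equals(object obj)
    {
        return obj is Entity other && other.Id == Id;
    }

    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

// Counter-example: the hash code depends on a field that can change.
public sealed class MutableKey
{
    public int Value { get; set; }

    public override bool Equals(object obj)
    {
        return obj is MutableKey other && other.Value == Value;
    }

    public override int GetHashCode()
    {
        return Value;
    }
}

public static class LifetimeDemo
{
    // Changing Name does not affect the hash, so the set still finds the entity.
    public static bool EntityFoundAfterChange()
    {
        var entity = new Entity { Name = "before" };
        var set = new HashSet<Entity> { entity };
        entity.Name = "after";
        return set.Contains(entity);
    }

    // Changing Value changes the hash, so the set can no longer find the key.
    public static bool MutableKeyFoundAfterChange()
    {
        var key = new MutableKey { Value = 1 };
        var set = new HashSet<MutableKey> { key };
        key.Value = 2;
        return set.Contains(key);
    }
}
```

EntityFoundAfterChange() returns true, while MutableKeyFoundAfterChange() returns false: the object is still in the set, but it was filed under its old hash code.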

Albic
I don't think it's very hard to implement a hash code. Given these rules, which are explained better in the Effective C# book, I think overriding GetHashCode is rather easy.