views:

4456

answers:

7

I have a class that contains the following two properties:

    public int Id      { get; private set; }
    public T[] Values  { get; private set; }

I have made it IEquatable<T> and overriden the object.Equals like this:

    public override bool Equals(object obj)
    {
        return Equals(obj as SimpleTableRow<T>);
    }
    public bool Equals(SimpleTableRow<T> other)
    {
        // Check for null
        if(ReferenceEquals(other, null))
            return false;

        // Check for same reference
        if(ReferenceEquals(this, other))
            return true;

        // Check for same Id and same Values
        return Id == other.Id && Values.SequenceEqual(other.Values);
    }

When having override object.Equals I must also override GetHashCode of course. But what do I put in it? How do I create a hashcode out of a generic array? And how do i combine it with the Id integer?

    public override int GetHashCode()
    {
        return // What?
    }
+1  A: 
public override int GetHashCode()    {
        return Id.GetHashCode() ^ Values.GetHashCode();  
}


There are several good points in the comments and other answers. The OP should consider whether the Values would be used as part of the "key" if the object were used as a key in a dictionary. If so, then they should be part of the hash code, otherwise, not.

On the other hand, I'm not sure why the GetHashCode method should mirror SequenceEqual. It's meant to compute an index into a hash table, not to be the complete determinant of equality. If there are many hash table collisions using the algorithm above, and if they differ in the sequence of the Values, then an algorithm should be chosen that takes sequence into account. If sequence doesn't really matter, save the time and don't take it into account.

John Saunders
I also don't think arrays have GetHashCode implemented taking into account all elements
Grzenio
That will do a reference comparison on Values, and will not be compatible with SequenceEqual (i.e. for different arrays with the same contents)
Marc Gravell
Guys, I've already said it before, but *be careful* using all of the elements of an exposed array. The result of GetHashCode() should be same for the lifetime of the object, otherwise it won't work as a hashtable key. There's no guarantee that this array won't change, so don't use it in GetHashCode!
Dustin Campbell
@Dustin: Good clarification. That's what I meant when I said "if the object is to be used as a key". Such objects may not change in a way that would change their hash code or equality, while they are acting as a key.
John Saunders
@John - such points are very valid, and well raised: however, posting a GetHashCode() implementation that is incompatible with the posted Equals() is *very wrong*, and could lead to a lot of problems - lost data, etc.
Marc Gravell
@Marc: can you post a URL that says the two implementations need to be equivalent (and that defines equivalent)? Although the purposes are similar, they are not identical. Surely Equals compares non-"key" fields. As long as two equal objects have equal hash code? Where's the problem?
John Saunders
http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx"If two objects compare as equal, the GetHashCode method for each object must return the same value." - where "compare as equal" means "Equals()"
Marc Gravell
http://stackoverflow.com/questions/371328/why-is-it-important-to-override-gethashcode-when-equals-method-is-overriden-in-c/371348#371348
Marc Gravell
Note that SequenceEqual (in the posted Equals) will treat two different arrays with the same *contents* as equal; but they will have different hash codes, hence your code will not generate valid hash codes.
Marc Gravell
Or for a demonstration: http://stackoverflow.com/questions/638761/c-gethashcode-override-of-object-containing-generic-array/639098#639098
Marc Gravell
A: 

I would do it this way:

long result = Id.GetHashCode();
foreach(T val in Values)
    result ^= val.GetHashCode();
return result;
Grzenio
pretty reasonable - note that xor can lead to a lot of collisions; *generally* a multiplier/addition is preferred
Marc Gravell
interesting, many people told me to use xor instead. I should read more about it then.
Grzenio
In response to that; what would the hash of {3,3,3,3} be? and {4,4,4,4}? or {4,0,0,4}? or {1,0,1,0}? You see the issue...
Marc Gravell
A: 

Provided that Id and Values will never change, and Values is not null...

public override int GetHashCode() { return Id ^ Values.GetHashCode(); }

Note that your class is not immutable, since anyone can modify the contents of Values because it is an array. Given that, I wouldn't try to generate a hashcode using its contents.

Dustin Campbell
That will do a reference comparison on Values, and will not be compatible with SequenceEqual (i.e. for different arrays with the same contents)
Marc Gravell
Right, but because the array is exposed and any external code can change it, it is frankly dangerous to compare the contents.
Dustin Campbell
So I should really just use the HashCode of the Id?
Svish
Which means that... IF the result of Equals changes, the result of GetHashCode wouldn't necessarily have to change, but if GetHashCode changes, then Equals would also change?
Svish
Not necessarily. The reference to Values shouldn't change (unless you change it in your code) -- so it should be OK to use that. John Saunders has the best answer here.
Dustin Campbell
@Dustin: "John Saunders has the best answer here" - no, posting a reply where the GetHashCode() is incompatible with Equals() is **not** good. It is actively a bad thing, and could lead to a *lot* of problems.
Marc Gravell
A: 

How about something like:

    public override int GetHashCode()
    {
        int hash = Id;
        if (Values != null)
        {
            hash = (hash * 17) + Values.Length;
            foreach (T t in Values)
            {
                hash *= 17;
                if (t != null) hash = hash + t.GetHashCode();
            }
        }
        return hash;
    }

This should be compatible with SequenceEqual, rather than doing a reference comparison on the array.

Marc Gravell
It is dangerous to compare the contents of Values because they aren't guaranteed to be the same for the lifetime of the object. Because the array is exposed, any external class can change it, which affects the hashcode!
Dustin Campbell
The point, though, is that it is compatible with the Equals method as posted.
Marc Gravell
It affects the equality as well. And you can't use the reference to the arary to compute hashcode, because you end up with two equal objects with different hash codes.
Grzenio
@Grzenio - is that aimed at me or Dustin? I don't use the reference for exactly that reason...
Marc Gravell
Sorry for the confusion, it was a reply to Dustin's comment here and his code at the same time.
Grzenio
+1  A: 

FWIW, it's very dangerous to use the contents of the Values in your hash code. You should only do this if you can guarantee that it will never change. However, since it is exposed, I don't that guarantee is possible. The hashcode of an object should never change. Otherwise, it loses it's value as a key in a Hashtable or Dictionary. Consider the hard-to-find bug of using an object as a key in a Hashtable, it's hashcode changes because of an outside influence and you can no longer find it in the Hashtable!

Dustin Campbell
A: 

Since the hashCode is kinda a key for storing the object (lllike in a hashtable), i would use just Id.GetHashCode()

Jhonny D. Cano -Leftware-
+10  A: 

Because of the problems raised in this thread, I'm posting another reply showing what happens if you get it wrong... mainly, that you can't use the array's GetHashCode(); the correct behaviour is that no warnings are printed when you run it... switch the comments to fix it:

using System;
using System.Collections.Generic;
using System.Linq;
static class Program
{
    static void Main()
    {
        // first and second are logically equivalent
        SimpleTableRow<int> first = new SimpleTableRow<int>(1, 2, 3, 4, 5, 6),
            second = new SimpleTableRow<int>(1, 2, 3, 4, 5, 6);

        if (first.Equals(second) && first.GetHashCode() != second.GetHashCode())
        { // proven Equals, but GetHashCode() disagrees
            Console.WriteLine("We have a problem");
        }
        HashSet<SimpleTableRow<int>> set = new HashSet<SimpleTableRow<int>>();
        set.Add(first);
        set.Add(second);
        // which confuses anything that uses hash algorithms
        if (set.Count != 1) Console.WriteLine("Yup, very bad indeed");
    }
}
class SimpleTableRow<T> : IEquatable<SimpleTableRow<T>>
{

    public SimpleTableRow(int id, params T[] values) {
        this.Id = id;
        this.Values = values;
    }
    public int Id { get; private set; }
    public T[] Values { get; private set; }

    public override int GetHashCode() // wrong
    {
        return Id.GetHashCode() ^ Values.GetHashCode();
    }
    /*
    public override int GetHashCode() // right
    {
        int hash = Id;
        if (Values != null)
        {
            hash = (hash * 17) + Values.Length;
            foreach (T t in Values)
            {
                hash *= 17;
                if (t != null) hash = hash + t.GetHashCode();
            }
        }
        return hash;
    }
    */
    public override bool Equals(object obj)
    {
        return Equals(obj as SimpleTableRow<T>);
    }
    public bool Equals(SimpleTableRow<T> other)
    {
        // Check for null
        if (ReferenceEquals(other, null))
            return false;

        // Check for same reference
        if (ReferenceEquals(this, other))
            return true;

        // Check for same Id and same Values
        return Id == other.Id && Values.SequenceEqual(other.Values);
    }
}
Marc Gravell
Could you explain the reasoning behind the right version of GetHashCode()?
Vinko Vrsalovic
@Vinko: Can you clarify? Do you mean "why does the hash-code matter?" - or "why that approach?". Given your rep and answer count, I'm assuming the latter; this is simply a way of getting a hash that takes all the values into account the "multiply by a prime and add the next hash" is a very common approach for hashing that avoids collisions (contrast xor; in which case a collection of "all 8s" could easily give a predictable hashcode of 0). Did I miss anything?
Marc Gravell
See also: http://stackoverflow.com/questions/263400#263416... different prime number, but same effect.
Marc Gravell
Yes, that was the question. Thanks.
Vinko Vrsalovic
Can you help me out with this related question please: http://stackoverflow.com/questions/2626547/problem-with-custom-equality-in-entity-framework
Shimmy
I started it anew, sorry. http://stackoverflow.com/questions/2626839/problem-with-custom-equality-and-gethashcode-in-a-mutable-object, anyway, my aim is the equality, I wish I could skip the GetHashCode implementation part. And yes, the initial value is 0. Anyway I use EF so all the objects are initialized with ID as 0 and then the properties are individually set one by one, not by the initializer, that's the reason that if it's hashed when the ID isn't loaded yet it goes nuts, maybe you'd know how to solve it and enjoy both proper hashing along with equality on this mutable object.
Shimmy