tags:

views:

181

answers:

5

I'm creating a key for a dictionary which is a structure of two strings. When I test this method in a console app, it works, but I'm not sure if the only reason it works is because the strings are being interned and therefore have the same references.

Foo foo1 = new Foo();
Foo foo2 = new Foo();
foo1.Key1 = "abc";
foo2.Key1 = "abc";
foo1.Key2 = "def";
foo2.Key2 = "def";

Dictionary<Foo, string> bar = new Dictionary<Foo, string>();
bar.Add(foo1, "found");

if(bar.ContainsKey(foo2))
    System.Console.WriteLine("This works.");
else
    System.Console.WriteLine("Does not work");

The struct is simply:

public struct Foo
{
    public string Key1;
    public string Key2;
}

Are there any cases which would cause this to fail or am I good to rely on this as a unique key?

+1  A: 

According to the MSDN,

The default implementation of the GetHashCode method does not guarantee unique return values for different objects. […]

The GetHashCode method can be overridden by a derived type. Value types must override this method to provide a hash function that is appropriate for that type and to provide a useful distribution in a hash table.

(emphasis mine.)

So your code is not guaranteed to work, and if it works, it is not guaranteed to continue working in the next version of the .NET framework.

For what it’s worth, Mono’s current implementation of ValueType.GetHashCode calls the type’s members’ GetHashCode methods so the code will actually work correctly independently of string interning.

[excuse the previous answer, it was just plain wrong.]

Konrad Rudolph
+3  A: 

http://msdn.microsoft.com/en-us/library/dd183755.aspx

Any struct that you define already has a default implementation of value equality that it inherits from the System..::.ValueType override of the Object..::.Equals(Object) method. This implementation uses reflection to examine all the public and non-public fields and properties in the type. Although this implementation produces correct results, it is relatively slow compared to a custom implementation that you write specifically for the type.

So if the properties are equal, they are equal. You might want to override Equals though.

statichippo
But since `GetHashCode` gives no such guarantee I’m not sure that we can conclude that it will work. In fact, the MSDN explicitly states that it is *not* guaranteed to work.
Konrad Rudolph
we can because as ruibm commented on his own answer, we're dealing with strings here. however, you're right that he should probably override Equals and GetHashCode because the default implementation uses reflection.
statichippo
Wow, didn't know that. Thanks ^^
SealedSun
+4  A: 

According to Microsoft's documentation you should always override GetHashCode() if you intend to use your own data structures as keys in HashTables otherwise you might not be safe.

"Objects used as a key in a Hashtable object must also override the GetHashCode method because those objects must generate their own hash code."

http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx

ruibm
Another quote from the GetHashCode documentation: The default implementation of GetHashCode does not guarantee uniqueness or consistency; therefore, it must not be used as a unique object identifier for hashing purposes.
Alex B
Yeah, true for most types. For strings it says it's correctly implemented so you can rely on that.
ruibm
It seems that the strings do work as-is but because there's no explicit guarantee for it to *always* be the case I'm overriding GetHashCode and Equals. Thanks all for the tips!
Nick Gotch
And don't forget: mutable structs are considered evil in this day and age. Make those fields readonly and initialize them via the struct's constructor!
Jesse C. Slicer
+4  A: 

As per my comment to rubim's answer above, here's an reference to how I think the struct itself should be implemented. Note the immutable fields that can only be initialized via the constructor.

public struct Foo
{
    private readonly string key1;

    private readonly string key2;

    public string Key1
    {
        get
        {
            return this.key1;
        }
    }

    public string Key2
    {
        get
        {
            return this.key2;
        }
    }

    public Foo(string key1, string key2)
    {
        this.key1 = key1;
        this.key2 = key2;
    }

    public static bool operator ==(Foo foo1, Foo foo2)
    {
        return foo1.Equals(foo2);
    }

    public static bool operator !=(Foo foo1, Foo foo2)
    {
        return !(foo1 == foo2);
    }

    public override bool Equals(object obj)
    {
        if (!(obj is Foo))
        {
            return false;
        }

        Foo foo = (Foo)obj;
        bool key1Equal = ((this.key1 == null) && (foo.Key1 == null))
            || ((this.key1 != null) && this.key1.Equals(foo.Key1));
        bool key2Equal = ((this.key2 == null) && (foo.Key2 == null))
            || ((this.key2 != null) && this.key2.Equals(foo.Key2));

        return key1Equal && key2Equal;
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;

            hash = (23 * hash)
                + (this.key1 == null ? 0 : this.key1.GetHashCode());
            return (31 * hash)
                + (this.key2 == null ? 0 : this.key2.GetHashCode());
        }
    }

    public override string ToString()
    {
        return (this.key1 == null ? string.Empty : this.key1.ToString() + ",")
            + (this.key2 == null ? string.Empty : this.key2.ToString());
    }
}

Then, the way to use these would be as such:

    Foo foo1 = new Foo("abc", "def");
    Foo foo2 = new Foo("abc", "def");

    Dictionary<Foo, string> bar = new Dictionary<Foo, string>();
    bar.Add(foo1, "found");

    if (bar.ContainsKey(foo2))
    {
        Console.WriteLine("This works.");
    }
    else
    {
        Console.WriteLine("Does not work");
    }
Jesse C. Slicer
This is correct and very complete.
Steven Sudit
@Steven - Alas, it wasn't as complete as it should have been. The important addition was nullity checks in the overridden `Equals` method.
Jesse C. Slicer
My fault for not catching that, and credit to you for doing so.
Steven Sudit
A: 

Your code should work without any problem in .NET, and I would dare to say that it will remain so even if .NET version changes. The thing is that dictionary uses two things to access the value by a specific key: Equals() and GetHashCode() in the following manner

  • get all keys that match requested key value by GetHashCode() value (GetHashCode value must be equal for equal objects, and should be different for different objects)
  • filter out the list that was created in the previous step using Equals().

Standard Equals() for the value types uses reflection to access fields and compares them (as previously mentioned). Taking this into account default GetHashCode() must be consistent with how the default Equals() implementation works - if a == b, a.GetHashCode() == b.GetHashCode() - either way there would be really no point in providing default implementation if it wouldn't even meet the required minimum. What MSDN says is that GetHashCode() doesn't provide unique values and that is fully understandable. Bear in mind that such simple implementation:

public int GetHashCode()
{
   return 0;
}

is still correct (dictionaries, hashtables etc would work) even though values are not unique. Performance of course is a different matter, using such implementation would result in full scan for each collection and testing each element for equality.