Take a look at Essential C#.
It contains a detailed description of how to override GetHashCode() correctly.
Extract from the book
The purpose of the hash code is to efficiently balance a hash table by generating a number that corresponds to the value of an object.
- Required: Equal objects must have equal hash codes (if a.Equals(b), then a.GetHashCode() == b.GetHashCode())
- Required: GetHashCode()'s returns over the life of a particular object should be constant (the same value), even if the object's data changes. In many cases, you should cache the method return to enforce this.
- Required: GetHashCode() should not throw any exceptions; GetHashCode() must always successfully return a value.
- Performance: Hash codes should be unique whenever possible. However, since hash codes return only an int, there has to be an overlap in hash codes for objects that have potentially more values than an int can hold -- virtually all types. (An obvious example is long, since there are more possible long values than an int could uniquely identify.)
- Performance: The possible hash code values should be distributed evenly over the range of an int. For example, creating a hash that doesn't consider the fact that distribution of a string in Latin-based languages primarily centers on the initial 128 ASCII characters would result in a very uneven distribution of string values and would not be a strong GetHashCode() algorithm.
- Performance: GetHashCode() should be optimized for performance. GetHashCode() is generally used in Equals() implementations to short-circuit a full equals comparison if the hash codes are different. As a result, it is frequently called when the type is used as a key type in dictionary collections.
- Performance: Small differences between two objects should result in large differences between hash code values -- ideally, a 1-bit difference in the object results in around 16 bits of the hash code changing, on average. This helps ensure that the hash table remains balanced no matter how it is "bucketing" the hash values.
- Security: It should be difficult for an attacker to craft an object that has a particular hash code. The attack is to flood a hash table with large amounts of data that all hash to the same value. The hash table implementation then becomes O(n) instead of O(1), resulting in a possible denial-of-service attack.
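The three required rules above can be sketched roughly as follows. This is my own minimal illustration, not code from the book: the ProductKey type and its fields are hypothetical, and the 17/31 multiply-and-add combination is just one common way to mix field hashes. The fields are read-only, so the hash cannot change during the object's lifetime and no caching is needed.

```csharp
using System;

// Hypothetical key type used only for illustration (not from the book).
public sealed class ProductKey : IEquatable<ProductKey>
{
    // Read-only fields: the data that feeds the hash never changes,
    // so GetHashCode() returns the same value for the object's whole life.
    private readonly string _category;
    private readonly int _number;

    public ProductKey(string category, int number)
    {
        _category = category;
        _number = number;
    }

    public bool Equals(ProductKey other)
    {
        if (ReferenceEquals(other, null)) return false;
        // The static string.Equals handles null categories without throwing.
        return string.Equals(_category, other._category)
            && _number == other._number;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as ProductKey);
    }

    public override int GetHashCode()
    {
        // unchecked: integer overflow wraps around instead of throwing.
        unchecked
        {
            int hash = 17;
            // Uses exactly the fields that Equals() compares, so equal
            // objects always produce equal hash codes. A null category
            // contributes 0 rather than causing a NullReferenceException.
            hash = hash * 31 + (_category == null ? 0 : _category.GetHashCode());
            hash = hash * 31 + _number;
            return hash;
        }
    }
}
```

With this in place, two separately constructed keys such as new ProductKey("books", 42) compare equal and land in the same bucket, so either one can be used to look up a value stored under the other in a Dictionary<ProductKey, TValue>. On newer .NET versions, System.HashCode.Combine could replace the manual arithmetic.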
As already mentioned here, you also have to consider some points about overriding Equals(), and there are some code examples showing how to implement these two methods.
This information should give you a starting point, but I recommend buying the book and reading the complete chapter 9 (at least the first twelve pages) to get all the points on how to correctly implement these two crucial methods.