Eclipse 3.5 has a very nice feature to generate Java hashCode() functions. It would generate for example (slightly shortened:)
class HashTest {
int i;
int j;
public int hashCode() {
final int prime = 31;
int result = prime + i;
result = prime * result + j;
return result;
}
}
(If you have more attributes in the class, result = prime * result + attribute.hashCode();
is repeated for each additional attribute. For ints .hashCode() can be omitted.)
This seems fine but for the choice 31 for the prime. It is probably taken from the hashCode implementation of Java String, which was used for performance reasons that are long gone after the introduction of hardware multipliers. Here you have many hashcode collisions for small values of i and j: for example (0,0) and (-1,31) have the same value. I think that is a Bad Thing(TM), since small values occur often. For String.hashCode you'll also find many short strings with the same hashcode, for instance "Ca" and "DB". If you take a large prime, this problem disappears if you choose the prime right.
So my question: what is a good prime to choose? What criteria do you apply to find it?
This is meant as a general question - so I do not want to give a range for i and j. But I suppose in most applications relatively small values occur more often than large values. (If you have large values the choice of the prime is probably unimportant.) It might not make much of a difference, but a better choice is an easy and obvious way to improve this - so why not do it? Commons lang HashCodeBuilder also suggests curiously small values.