For what it's worth, Effective Java 2nd Edition hand-waives around the mathematics issue and just say that the reason to choose 31 is:
- Because it's an odd prime, and it's "traditional" to use primes
- It's also one less than a power of two, which permits for bitwise optimization
Here's the full quote, from Item 9: Always override hashCode
when you override equals
:
The value 31 was chosen because it's an odd prime. If it were even and multiplication overflowed, information would be lost, as multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional.
A nice property of 31 is that the multiplication can be replaced by a shift (§15.19) and subtraction for better performance:
31 * i == (i << 5) - i
Modern VMs do this sort of optimization automatically.
While the recipe in this item yields reasonably good hash functions, it does not yield state-of-the-art hash functions, nor do Java platform libraries provide such hash functions as of release 1.6. Writing such hash functions is a research topic, best left to mathematicians and theoretical computer scientists.
Perhaps a later release of the platform will provide state-of-the-art hash functions for its classes and utility methods to allow average programmers to construct such hash functions. In the meantime, the techniques described in this item should be adequate for most applications.
Rather simplistically, it can be said that using a multiplier with numerous divisors will result in more hash collisions. Since for effective hashing we want to minimize the number of collisions, we try to use a multiplier that has fewer divisors. A prime number by definition has exactly two distinct, positive divisors.
Related questions