Why is it necessary for a hash table's (the data structure) size to be a prime?
From what I understand, it assures a more even distribution but is there any other reason?
Why is it necessary for a hash table's (the data structure) size to be a prime?
From what I understand, it assures a more even distribution but is there any other reason?
from http://srinvis.blogspot.com/2006/07/hash-table-lengths-and-prime-numbers.html
If suppose your hashCode function results in the following hashCodes among others {x , 2x, 3x, 4x, 5x, 6x...}, then all these are going to be clustered in just m number of buckets, where m = table_length/GreatestCommonFactor(table_length, x). (It is trivial to verify/derive this). Now you can do one of the following to avoid clustering
Make sure that you don't generate too many hashCodes that are multiples of another hashCode like in {x, 2x, 3x, 4x, 5x, 6x...}.But this may be kind of difficult if your hashTable is supposed to have millions of entries.
Or simply make m equal to the table_length by making GreatestCommonFactor(table_length, x) equal to 1, i.e by making table_length coprime with x. And if x can be just about any number then make sure that table_length is a prime number.
Whatever hashfunction you use you get an integer. In order to map that to the hashtable usually you'd mod
the integer with the size of the hashtable to make that value smaller than the size of the table in order to map it.
return hashVal % tableSize
I'm a bit lost from this point onwards but IIRC if tableSize is even, all entries will be even. Half your hashtable will never be populated.