Hi,

I've used an array as a hash table for a hashing algorithm, with these values:

int[] arr={4 , 5 , 64 ,432 };

and consecutive integer keys in another array:

int keys[]={ 1 , 2 , 3 ,4};

Could anyone please tell me what would be a good approach for mapping those integer keys to locations in that array? Is the following a short and reasonable approach that gives few or no collisions (including for larger key values)?

 keys[i] % arr.length  // where i is the index of the key being mapped

Thanks in advance.

+1  A: 

Any reason not to use the built-in HashMap? You will have to use Integer though, not int.

 java.util.Map<Integer, Integer> myMap = new java.util.HashMap<Integer, Integer>();
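As a rough sketch of what that looks like with the arrays from the question (the class name BuiltInMapDemo and the helper build are made up for illustration, they are not part of any library):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class: pairs each key with the value at the same position.
class BuiltInMapDemo {
    static Map<Integer, Integer> build(int[] keys, int[] values) {
        Map<Integer, Integer> map = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], values[i]); // int is autoboxed to Integer here
        }
        return map;
    }

    public static void main(String[] args) {
        int[] keys = {1, 2, 3, 4};
        int[] arr = {4, 5, 64, 432};
        Map<Integer, Integer> myMap = build(keys, arr);
        System.out.println(myMap.get(3)); // prints 64
    }
}
```

The autoboxing means you can still write plain int literals when calling put and get.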

Since you want to implement your own, first brush up on hash tables by reading the Wikipedia article. After that, you could study the HashMap source code.

This StackOverflow question contains interesting links for implementing fast hashmaps (for C++ though), as does this one (for Java).

JRL
I know about that built-in class, but I'm trying to build my own class for learning and for practice.
Stardust
+2  A: 

I assume you're trying to implement some kind of hash table as an exercise. Otherwise, you should just use a java.util.HashMap or java.util.TreeMap or similar.

For a small set of values, as you have given above, your solution is fine. The real question will come when your data grows much bigger.

You have identified that collisions are undesirable - that is true. Sometimes, some knowledge of the likely keys can help you design a good hash function. Sometimes, you can assume that the key class will have a good hashCode() method. Since hashCode() is a method defined by Object, every class either inherits or overrides it. It would be neatest for you to be able to utilise the hashCode() method of your key, rather than have to build a new algorithm specially for your map.
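A minimal sketch of that idea, assuming your table is a plain array with capacity slots (BucketIndex and indexFor are names I made up; Math.floorMod needs Java 8 or later):

```java
// Hypothetical helper: turns any key's hashCode() into an array index.
class BucketIndex {
    static int indexFor(Object key, int capacity) {
        // hashCode() may be negative; floorMod keeps the result in [0, capacity)
        return Math.floorMod(key.hashCode(), capacity);
    }

    public static void main(String[] args) {
        int[] keys = {1, 2, 3, 4};
        for (int key : keys) {
            // An Integer's hashCode() is its own value, so this is key % 4
            System.out.println(key + " -> " + indexFor(key, 4));
        }
    }
}
```

For Integer keys this gives the same result as the modulo in the question, but the same method also works for String or any other key type.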

If all integer keys are equally likely, then a mod function will spread them out evenly amongst the different buckets, minimising collisions. However, if you know that the keys are going to be numbered consecutively, it might be better to use a List than a HashMap - this will guarantee no collisions.
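For the consecutive-key case, something along these lines would do, assuming the keys really are contiguous and you know the first one (DirectIndexTable is a made-up name, and a plain array stands in for the List):

```java
// Hypothetical table for consecutive integer keys: no hashing, no collisions.
class DirectIndexTable {
    private final int firstKey;  // smallest key, e.g. 1 in the question
    private final int[] values;  // values[k - firstKey] belongs to key k

    DirectIndexTable(int firstKey, int[] values) {
        this.firstKey = firstKey;
        this.values = values.clone();
    }

    int get(int key) {
        // Throws ArrayIndexOutOfBoundsException for keys outside the range
        return values[key - firstKey];
    }

    public static void main(String[] args) {
        DirectIndexTable table = new DirectIndexTable(1, new int[] {4, 5, 64, 432});
        System.out.println(table.get(3)); // prints 64
    }
}
```

Each key owns exactly one slot, so lookups are a single array access.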

John
+1  A: 

Get yourself a book about algorithms and data structures and read the chapter about hash tables (the Wikipedia article would also be a good entry point). It's a complex topic and far beyond the scope of a Q&A site like this.

For starters, using the array-size modulo is in general a horrible hash function, because it results in massive collisions when the keys are multiples of the array size or of one of its divisors. How bad that is depends on the array size: the more divisors it has, the more likely collisions are; when it's a prime number, it's not too bad (but not really good either).
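To see the effect, here is a small experiment you could run (ModuloCollisions and countCollisions are illustrative names, and the key set is just one I picked: multiples of 4):

```java
// Hypothetical experiment: count keys that land in an already occupied slot.
class ModuloCollisions {
    static int countCollisions(int[] keys, int tableSize) {
        boolean[] used = new boolean[tableSize];
        int collisions = 0;
        for (int key : keys) {
            int slot = Math.floorMod(key, tableSize);
            if (used[slot]) {
                collisions++;      // slot already taken by an earlier key
            } else {
                used[slot] = true; // first key to reach this slot
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        int[] keys = {4, 8, 12, 16, 20, 24, 28}; // all multiples of 4
        System.out.println(countCollisions(keys, 8)); // prints 5: only slots 0 and 4 get used
        System.out.println(countCollisions(keys, 7)); // prints 0: every key gets its own slot
    }
}
```

With size 8 (which has 4 as a divisor) the seven keys share two slots, while the prime size 7 happens to spread this particular set perfectly. Other key patterns can still collide with a prime size.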

Michael Borgwardt