views:

407

answers:

3

How to go about creating a Hashmap in C from scratch ? What would be the parameters taken into consideration and what how would you test the hashmap as to how good it is ? As in what would be benchmark test cases which you require to run before you say that your hash map is complete.

+1  A: 

The best approach depends on the expected key distribution and number of collisions. If relatively few collisions are expected, it really doesn't matter which method is used. If lots of collisions are expected, then which to use depends on the cost of rehashing or probing vs. manipulating the extensible bucket data structure.

But here is source code example of An Hashmap Implementation in C

TStamper
As the later post says that we need to handle collision also. Moreover the hash implementation has a table_size which is like fixed. If we want to dynamically increase the size of the hashmap, without the programmer knowing how its done. Could you suggest something ?
Thunderboltz
Resizing the key space means changing the hash function or at leastthe function's parameters and rehashing all the entries. Each differently sized map requires a different set of hash functions to maintain the desired key distribution.
TStamper
+5  A: 

Well if you know the basics behind them, it shouldn't be too hard.

Generally you create an array called "buckets" that contain the key and value, with an optional pointer to create a linked list.

When you access the hash table with a key, you process the key with a custom hash function which will return an integer. You then take the modulus of the result and that is the location of your array index or "bucket". Then you check the unhashed key with the stored key, and if it matches, then you found the right place.

Otherwise, you've had a "collision" and must crawl through the linked list and compare keys until you match. (note some implementations use a binary tree instead of linked list for collisions).

Check out this fast hash table implementation:

http://attractivechaos.awardspace.com/khash.h.html

Unknown
The link provided is amazing .. thnx
Thunderboltz
+1  A: 

There are other mechanisms to handle overflow than the simple minded linked list of overflow entries which e.g. wastes a lot of memory.

Which mechanism to use depends among other things on if you can choose the hash function and possible pick more than one (to implement e.g. double hashing to handle collisions); if you expect to often add items or if the map is static once filled; if you intend to remove items or not; ...

The best way to implement this is to first think about all these parameters and then not code it yourself but to pick a mature existing implementation. Google has a few good implementations -- e.g. http://code.google.com/p/google-sparsehash/

HD