views:

191

answers:

2

How mouch Scala standard library can be reused to create variant of HashMap that does not handle collisions at all?

In HashMap implementation in Scala I can see that traits HashEntry, DefaultEntry and LinkedEntry are related, but I'm not sure whether I have any control over them.

+1  A: 

You could do this by extending HashMap (read the source code of HashMap to see what needs to be modified); basically you'd override put and += to not call findEntry, and you'd override addEntry (from HashTable) to simply compute the hash code and drop the entry in place. Then it wouldn't handle collsions at all.

But this isn't a wise thing to do since the HashEntry structure is specifically designed to handle collisions--the next pointer becomes entirely superfluous at that point. So if you are doing this for performance reasons, it's a bad choice; you have overhead because you wrap everything in an Entry. If you want no collision checking, you're better off just storing the (key,value) tuples in a flat array, or using separate key and value arrays.

Keep in mind that you will now suffer from collisions in hash value, not just in key. And, normally, HashMap starts small and then expands, so you will initially destructively collide things which would have survived had it not started small. You could override initialSize also if you knew how much you would add so that you'd never need to resize.

But, basically, if you want to write a special-purpose high-speed unsafe hash map, you're better off writing it from scratch or using some other library. If you modify the generic library version, you'll get all the unsafety without all of the speed. If it's worth fiddling with, it's worth redoing entirely. (For example, you should implement filters and such that map f: (Key,Value) => Boolean instead of mapping the (K,V) tuple--that way you don't have to wrap and unwrap tuples.)

Rex Kerr
Very good answers, thank you. I want to reimplement from scratch. But I don't want to implement dozens of methods. I want to reuse some methods implementations (like foldLeft, exists, find, etc) that base on some primitive methods. What is the minimal set of primitive methods I have to implement and what traits do I have to use?
Łukasz Lew
@Łukasz: I think your best bet is to look at the source code for `HashMap` and `HashTable` and play with it until you understand how it works. Extending the 2.8 collections in a nice way takes a reasonable amount of work. As far as I know, the code itself is the only documentation of the tree of dependencies; `Iterable` requires very little, but subclasses override various parts of the tree in order make them more efficient and/or add functionality.
Rex Kerr
A: 

I guess it depends what you mean by "does not handle collisions at all". Would a thin layer over a MultiMap be sufficient for your needs?

pdbartlett