I am working on a problem where I am attempting to build a massive hashmap of data. The key is a custom low-memory implementation of CharSequence that implements hashCode() and equals(...), and the value is an Integer object.
There may be millions of entries in this map. I managed to drastically reduce the memory used by the values by making each Integer a pointer into a file that holds the data I wish to look up. The problem is that each key may be tens of bytes (25 bytes on average), and the default implementation of HashMap needs all the keys to be held in memory.
I need a hashmap that has a low memory overhead and that can possibly page the keys to disk or alternatively store a hashed representation of the keys. If the keys are themselves hashed then I would be concerned about hash collisions.
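To make the hashed-key option concrete, here is a minimal sketch of the kind of structure I have in mind. It is not a finished implementation. The class name HashedKeyIndex, the choice of 64-bit FNV-1a as the hash, and the fixed power-of-two capacity are all assumptions made for illustration. It stores only a 64-bit hash of each key plus the int file offset in two primitive arrays, so each slot costs 12 bytes regardless of key length.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical sketch: open-addressing table that keeps a 64-bit hash of each key, not the key itself. */
final class HashedKeyIndex {
    static final int EMPTY = -1;   // assumption: real file offsets are never negative

    private final long[] hashes;   // 64-bit hash of the key in each occupied slot
    private final int[] offsets;   // file offset for that key, or EMPTY
    private final int mask;
    private int size;

    HashedKeyIndex(int capacityPowerOfTwo) {
        if (capacityPowerOfTwo < 2 || Integer.bitCount(capacityPowerOfTwo) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two >= 2");
        }
        hashes = new long[capacityPowerOfTwo];
        offsets = new int[capacityPowerOfTwo];
        Arrays.fill(offsets, EMPTY);
        mask = capacityPowerOfTwo - 1;
    }

    /** 64-bit FNV-1a over the UTF-8 bytes of the key (an assumed choice of hash). */
    static long hash64(CharSequence key) {
        long h = 0xcbf29ce484222325L;
        for (byte b : key.toString().getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        return h;
    }

    /** Linear probing: returns the slot holding this hash, or the first empty slot. */
    private int findSlot(long h) {
        int i = ((int) (h ^ (h >>> 32))) & mask;
        while (offsets[i] != EMPTY && hashes[i] != h) {
            i = (i + 1) & mask;
        }
        return i;
    }

    void put(CharSequence key, int fileOffset) {
        if (fileOffset < 0) throw new IllegalArgumentException("offset must be non-negative");
        long h = hash64(key);
        int i = findSlot(h);
        if (offsets[i] == EMPTY) {
            // Always keep one slot empty so that probing terminates.
            if (size == hashes.length - 1) {
                throw new IllegalStateException("table full; a real version would resize");
            }
            size++;
        }
        hashes[i] = h;
        offsets[i] = fileOffset;
    }

    /** Returns the stored offset for a key with this hash, or EMPTY if there is none. */
    int get(CharSequence key) {
        return offsets[findSlot(hash64(key))];
    }
}
```

Two different keys with the same 64-bit hash would be treated as the same key here, so a hit is only a candidate. Assuming the record in the file also contains the original key, the caller could read it back at the returned offset and compare it before trusting the result. With 2^21 slots (about half full at a million entries) the two arrays would take roughly 25 MB.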
Ideally, I would like to be able to store a million entries in the map per 50MB of heap space (each entry being one 25-byte array for the key and an Integer object for the value).
Does anyone have any experience with low-memory filesystem-backed Maps that are optimised for reducing the footprint of the keys?
Thanks,
Chris