I need to store a big hash set, able to contain up to approximately 200 million 40-bit values. Storing them as 200 million 64-bit values would be acceptable (despite wasting 200 million * 24 bits).
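To put rough numbers on that, here is a back-of-the-envelope sketch of the raw payload size for both encodings. The class and method names are made up for illustration, and the figures ignore any overhead a real hash table adds (empty slots, headers, and so on).

```java
// Hypothetical helper: raw payload size for a given count and bit width.
final class FootprintEstimate {

    // Number of bytes needed to pack `count` values of `bitsPerValue` bits each.
    static long bytesNeeded(long count, int bitsPerValue) {
        return (count * bitsPerValue + 7) / 8; // round up to whole bytes
    }

    public static void main(String[] args) {
        long n = 200_000_000L;
        System.out.println(bytesNeeded(n, 40)); // prints 1000000000 (about 1 GB)
        System.out.println(bytesNeeded(n, 64)); // prints 1600000000 (about 1.6 GB)
    }
}
```

So the 64-bit encoding costs roughly 600 MB more than a packed 40-bit one, before any table overhead.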
The requirements are:
tiny memory footprint (disk space ain't an issue, memory is)
fast contains(long l) and add(long l) methods (much faster than SQL)
embedded
free and without nasty licensing (no Berkeley DB). LGPL fine.
no false positives and no false negatives, so things like disk-based Bloom filters are not what I'm after
SQL is not what I'm after here.
Because I really think I'm more after something fast like this (notice how the solution is much faster than a SQL solution):
http://stackoverflow.com/questions/495161/fast-disk-based-hashtables
Does Google have such a Java API?
Would a fast disk-based key/value pair implementation where I'd only use the 'key' work?
Or something else?
I'd rather not reinvent the wheel.
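For illustration only, this is roughly the shape of structure I have in mind: a minimal sketch of a fixed-capacity, open-addressing set kept in a memory-mapped file, so the OS page cache decides what stays in RAM. Everything here is an assumption rather than an existing library: the class name `MappedLongSet`, the linear probing scheme, and the trick of storing `value + 1` so that 0 can mark an empty slot. A single Java mapping is limited to about 2 GB, so a real 200-million-entry table would need several mappings, and the sketch does no resizing or deletion.

```java
import java.io.IOException;
import java.nio.LongBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: exact (no false positives/negatives) set of 40-bit values on disk. */
final class MappedLongSet implements AutoCloseable {
    private static final long MAX_VALUE = (1L << 40) - 1;

    private final FileChannel channel;
    private final MappedByteBuffer mapping;
    private final LongBuffer slots; // one 8-byte slot per entry; 0 means "empty"
    private final int capacity;

    MappedLongSet(Path file, int capacity) throws IOException {
        if (capacity <= 0 || capacity > Integer.MAX_VALUE / 8) {
            throw new IllegalArgumentException("capacity must fit in a single mapping");
        }
        // Assumption: an existing non-empty file was created with the same capacity.
        boolean fresh = Files.notExists(file) || Files.size(file) == 0;
        this.capacity = capacity;
        this.channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        this.mapping = channel.map(FileChannel.MapMode.READ_WRITE, 0, 8L * capacity);
        this.slots = mapping.asLongBuffer();
        if (fresh) {
            // Zero-fill explicitly instead of relying on how the OS extends the file.
            for (int i = 0; i < capacity; i++) slots.put(i, 0L);
        }
    }

    /** Returns true if the value was newly added, false if it was already present. */
    boolean add(long value) {
        long stored = encode(value);
        int i = startIndex(value);
        for (int probes = 0; probes < capacity; probes++) {
            long current = slots.get(i);
            if (current == stored) return false;
            if (current == 0L) {
                slots.put(i, stored);
                return true;
            }
            i = (i + 1 == capacity) ? 0 : i + 1; // linear probing with wrap-around
        }
        throw new IllegalStateException("table is full");
    }

    boolean contains(long value) {
        long stored = encode(value);
        int i = startIndex(value);
        for (int probes = 0; probes < capacity; probes++) {
            long current = slots.get(i);
            if (current == stored) return true;
            if (current == 0L) return false; // hit an empty slot: value is absent
            i = (i + 1 == capacity) ? 0 : i + 1;
        }
        return false;
    }

    // Shift by one so that the value 0 is distinguishable from an empty slot.
    private static long encode(long value) {
        if (value < 0 || value > MAX_VALUE) {
            throw new IllegalArgumentException("value must fit in 40 bits");
        }
        return value + 1;
    }

    // Multiplicative mixing; the shift keeps 31 bits so the result is non-negative.
    private int startIndex(long value) {
        long mixed = value * 0x9E3779B97F4A7C15L;
        return (int) ((mixed >>> 33) % capacity);
    }

    @Override
    public void close() throws IOException {
        mapping.force(); // flush dirty pages to the file
        channel.close();
    }
}
```

Usage would be along the lines of `new MappedLongSet(Paths.get("set.bin"), 1 << 20)` followed by `add(...)` and `contains(...)` calls. What I'm hoping for is an existing, well-tested library that does this properly (resizing, multiple mappings, crash safety) rather than something hand-rolled like the above.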