+2  Q: 

Disk based HashMap

Does Java have a disk-based HashMap, or is there a library available that provides one? It doesn't need to be atomic or anything, but it will be accessed from multiple threads and shouldn't crash if two of them access the same element at the same time.

Anyone know of anything?

+6  A: 

Sounds like you need something close to a lightweight db. Have you looked at/considered Java DB? A light db with a single, indexed table would basically be a disk-based, thread-safe hash map.

Paul Sasik
I looked into exactly this at one point, going from the mindset of a hash map, to a disk-based hash map, to a DB, which is essentially a disk-based hash map. There were a few really good ones where you can just include the jar and use it, and that was 5+ years ago. There is a good discussion of databases here: http://www.linkedin.com/answers/technology/software-development/TCH_SFT/1207-3692603
Bill K
I'm actually already using SQLite for this problem, but really all I need is a thread-safe String key/value store.
synic
A: 

Isn't NoSQL supposed to be just some kind of HashMap? Probably that's not what you want, because there is still the overhead of a database rather than just calls to a map. I haven't tried any of them with Java, but some of the APIs might look quite simple and feel almost like a map.

Peter Kofler
+4  A: 

Either properties files or Berkeley DB might be what you're looking for. java.util.Properties itself implements java.util.Map and provides methods to load from and store to a file. Berkeley DB is often recommended as a lightweight key-value datastore.
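
As a rough illustration of the Properties route, here is a minimal sketch that stores a key/value pair to a file and loads it back. The class name `PropertiesStoreSketch`, its method names, and the temp-file location are made up for this example; only `Properties.store` and `Properties.load` come from the JDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper class, not part of any library.
class PropertiesStoreSketch {

    // Write every entry of props to the given file.
    static void save(Properties props, Path file) {
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "key/value store");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read all entries from the given file into a new Properties object.
    static Properties load(Path file) {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Store one pair in a temp file, reload it, and return the reloaded value.
    static String roundTrip(String key, String value) {
        try {
            Path file = Files.createTempFile("kv-sketch", ".properties");
            Properties props = new Properties();
            props.setProperty(key, value);
            save(props, file);
            return load(file).getProperty(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that this rewrites the whole file on every save, so it probably only suits small maps.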

BalusC
I think Properties will be perfect. I don't know why I thought I had to complicate this, hah.
synic
Properties is not thread-safe, which you said you needed. Just imagine two threads writing the file to disk at the same time. You could program around that, or take a look at http://ehcache.org/, which would also provide much better performance if needed.
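
One possible way to "program around that" is sketched below, with the caveat that it is only an illustration and not a tested library. In-memory access goes through a ConcurrentHashMap, and file writes are serialized with a synchronized `flush()` that writes to a temporary file and then moves it over the real one. The class name `SynchronizedFileMap` and all of its methods are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical wrapper: a thread-safe String map persisted as a properties file.
class SynchronizedFileMap {
    private final Path file;
    private final Map<String, String> map = new ConcurrentHashMap<>();

    SynchronizedFileMap(Path file) {
        this.file = file.toAbsolutePath();
        if (Files.exists(this.file)) {
            Properties props = new Properties();
            try (InputStream in = Files.newInputStream(this.file)) {
                props.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            for (String name : props.stringPropertyNames()) {
                map.put(name, props.getProperty(name));
            }
        }
    }

    String get(String key) { return map.get(key); }

    void put(String key, String value) { map.put(key, value); }

    int size() { return map.size(); }

    // Only one thread at a time may write the file; readers of the map are not blocked.
    synchronized void flush() {
        Properties snapshot = new Properties();
        snapshot.putAll(map);
        try {
            // Write to a temp file in the same directory, then replace the real file.
            Path tmp = Files.createTempFile(file.getParent(), "kv", ".tmp");
            try (OutputStream out = Files.newOutputStream(tmp)) {
                snapshot.store(out, null);
            }
            Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo: four threads write 25 keys each and flush; returns the size after reloading.
    static int demo() {
        try {
            Path file = Files.createTempFile("kv-demo", ".properties");
            SynchronizedFileMap store = new SynchronizedFileMap(file);
            Thread[] workers = new Thread[4];
            for (int t = 0; t < workers.length; t++) {
                final int id = t;
                workers[t] = new Thread(() -> {
                    for (int i = 0; i < 25; i++) {
                        store.put("t" + id + "-k" + i, "v" + i);
                    }
                    store.flush();
                });
                workers[t].start();
            }
            for (Thread w : workers) {
                w.join();
            }
            store.flush();
            return new SynchronizedFileMap(file).size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

This still rewrites the entire file on each flush and only coordinates threads inside one JVM, so it is a stopgap rather than a replacement for a real key/value store.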
zockman
A: 

Project Voldemort is also a really fast, scalable, replicated "HashMap". It is used at LinkedIn, and performance is also pretty good:

A quote from their site:

Here is the throughput we see from a single multithreaded client talking to a single server where the "hot" data set is in memory under artificially heavy load in our performance lab:

Reads: 19,384 req/sec
Writes: 16,559 req/sec

Alfred