I want to create a LinkedHashMap which will limit its size based on available memory (i.e. when freeMemory + (maxMemory - allocatedMemory) drops below a certain threshold). This will be used as a form of cache, probably using "least recently used" as the eviction strategy.
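For concreteness, the check I have in mind looks roughly like this (just a sketch; the threshold constant is an arbitrary placeholder):

    class MemoryCheck {
        // Arbitrary placeholder: evict once less than ~16 MB could still be allocated.
        static final long THRESHOLD_BYTES = 16L * 1024 * 1024;

        static boolean belowThreshold() {
            Runtime rt = Runtime.getRuntime();
            long maxMemory = rt.maxMemory();          // the -Xmx ceiling
            long allocatedMemory = rt.totalMemory();  // heap the JVM has reserved so far
            long freeMemory = rt.freeMemory();        // unused space within the reserved heap
            long available = freeMemory + (maxMemory - allocatedMemory);
            return available < THRESHOLD_BYTES;
        }
    }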

My concern, though, is that allocatedMemory presumably also includes garbage that hasn't been collected yet, and will therefore overestimate the amount of memory actually in use. I'm worried about the unintended consequences this might have.

For example, the LinkedHashMap may keep deleting items because it thinks there isn't enough free memory, but the free memory doesn't increase because these deleted items aren't being garbage collected immediately.

Does anyone have any experience with this type of thing? Is my concern warranted? If so, can anyone suggest a good approach?

I should add that I also want to be able to "lock" the cache, basically saying "ok, from now on don't delete anything because of memory usage issues".

A: 

Caches tend to be problematic. IIRC, there's a SoftCache in Sun's JRE that has had many problems.

Anyway, the simplest thing is to use SoftReferences in the map. This should work fine so long as the overhead of SoftReference plus Map.Entry is significantly lower than the cached data.
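Roughly something like this (a sketch only; a real cache would also prune entries whose references have been cleared):

    import java.lang.ref.SoftReference;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Access-ordered LinkedHashMap holding SoftReferences, so the GC may clear
    // values under memory pressure while the map itself stays small.
    class SoftValueCache<K, V> {
        private final Map<K, SoftReference<V>> map =
                new LinkedHashMap<K, SoftReference<V>>(16, 0.75f, true); // true = access order (LRU)

        V get(K key) {
            SoftReference<V> ref = map.get(key);
            return ref == null ? null : ref.get(); // null if absent or cleared by the GC
        }

        void put(K key, V value) {
            map.put(key, new SoftReference<V>(value));
        }
    }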

Alternatively you can, like WeakHashMap, use a ReferenceQueue and either poll it or have a thread blocking on it (one thread per instance, unfortunately). Be careful with synchronisation issues.
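A polling variant might look something like this (again only a sketch; the KeyedSoftReference helper is made up for the example, and synchronisation is kept to what ConcurrentHashMap provides):

    import java.lang.ref.ReferenceQueue;
    import java.lang.ref.SoftReference;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    // SoftReference that remembers its key, so cleared entries can be removed
    // from the map when they turn up on the queue.
    class KeyedSoftReference<K, V> extends SoftReference<V> {
        final K key;
        KeyedSoftReference(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    class SoftCache<K, V> {
        private final ConcurrentMap<K, KeyedSoftReference<K, V>> map =
                new ConcurrentHashMap<K, KeyedSoftReference<K, V>>();
        private final ReferenceQueue<V> queue = new ReferenceQueue<V>();

        public void put(K key, V value) {
            drainQueue();
            map.put(key, new KeyedSoftReference<K, V>(key, value, queue));
        }

        public V get(K key) {
            drainQueue();
            KeyedSoftReference<K, V> ref = map.get(key);
            return ref == null ? null : ref.get();
        }

        // Drop map entries whose values the GC has already cleared.
        @SuppressWarnings("unchecked")
        private void drainQueue() {
            KeyedSoftReference<K, V> cleared;
            while ((cleared = (KeyedSoftReference<K, V>) queue.poll()) != null) {
                map.remove(cleared.key, cleared);
            }
        }
    }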

"Locking" the map, you probably want to avoid if necessary. You'd need keep strong references to all the data (and evict if not null). That is going to be ugly.

Tom Hawtin - tackline
For locking, I was thinking I could simply copy the whole thing into a normal HashMap
sanity
Well, you can copy the contents of the map, but then you have to dereference the soft references as well. That said, copying instead of locking is the better way to go.
Tom Hawtin - tackline
A: 

I would strongly suggest using something like Ehcache instead of re-inventing a caching system. It's super simple to use, very configurable, and works great.
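For example (a sketch against the classic net.sf.ehcache API; the cache name and limits are arbitrary):

    import net.sf.ehcache.Cache;
    import net.sf.ehcache.CacheManager;
    import net.sf.ehcache.Element;

    // In-memory cache capped at 10,000 entries, evicted LRU (Ehcache's default policy).
    public class EhcacheExample {
        public static void main(String[] args) {
            CacheManager manager = CacheManager.create();
            manager.addCache(new Cache("myCache", 10000, false, false, 3600, 1800));
            Cache cache = manager.getCache("myCache");

            cache.put(new Element("someKey", "someValue"));
            Element hit = cache.get("someKey");
            System.out.println(hit == null ? null : hit.getObjectValue());

            manager.shutdown();
        }
    }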

matt b
A: 

As matt b said, something like Ehcache or JbossCache is a good first step.

If you want something lightweight and in-process, look at Google Collections. For example, you can use MapMaker (http://google-collections.googlecode.com/svn/trunk/javadoc/index.html?com/google/common/collect/BiMap.html) to make a map with soft/weak keys and values, so it would cache only those items it has room for (though you wouldn't get LRU).

nojo
Oops - better link: http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/MapMaker.html
nojo
+1  A: 

I know I'm biased, but I really have to strongly recommend our MapMaker for this. Use the softKeys() or softValues() feature, depending on whether it's GC collection of the key or of the value that more aptly describes when an entry can be cleaned up.
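For instance (a sketch; the key and value types here are arbitrary):

    import java.util.concurrent.ConcurrentMap;
    import com.google.common.collect.MapMaker;

    // Concurrent map whose values are held via soft references, so the GC can
    // discard entries when memory gets tight. Key/value types are arbitrary.
    public class MapMakerExample {
        public static void main(String[] args) {
            ConcurrentMap<String, byte[]> cache = new MapMaker()
                    .softValues()   // or .softKeys(), depending on which side defines liveness
                    .makeMap();

            cache.put("someKey", new byte[1024]);
            byte[] data = cache.get("someKey"); // may be null once the GC clears the entry
            System.out.println(data == null ? "evicted" : "cached");
        }
    }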

Kevin Bourrillion
I would like to use MapMaker (I make extensive use of Google Collections in my code), but is there any way to make it behave like an LRU cache which scales according to available memory? Additionally, is there any way to "lock" it so that it will no longer delete entries in response to garbage collection?
sanity
"behave like an LRU cache..." -- that pretty much describes exactly what soft references will give you, yes."lock" it -- the only sensible answer to this involves copying the map: Map<Key, Val> strongCache = Maps.newHashMap(cache);
Kevin Bourrillion
sorry about the < garbage; there's no preview so I had to guess whether I would have to escape or not. :(
Kevin Bourrillion
The danger of using soft/weak cache entries is that your cache behaviour suddenly becomes very dependent on (unpredictable) GC. Ideally what one would want is "hard" LRU logic for put/get/delete, with soft/weak logic kicking in only when cache memory goes over the designated limit. Sadly, I can't imagine how this could be done. The closest solution I can think of is a separate thread which does instrumentation and entry eviction, but that's just one massive hack.
mindas
A: 

I had the same need in the past and this is how I implemented my cache:

  • there is a cache memory manager, which has a minimum and a maximum memory limit (the maximum limit is what matters anyway)
  • every registered cache has the following (important) parameters: maximum capacity (most of the time you have an upper limit; you don't want to hold more than X items) and percent memory usage
  • I use a LinkedHashMap and a ReentrantReadWriteLock to guard the cache.
  • every X puts I calculate the mean memory consumption per entry and trigger eviction (asynchronously) if the estimated memory usage exceeds the allowed memory limit (sketched below)
  • of course the memory computation doesn't reflect the real memory consumption exactly, but comparing the computed figure with the real value (using a profiler), I found it close enough.

I was also planning to put an additional guard on the cache, to evict in case puts are happening faster than memory-based evictions, but so far I haven't found the need for it.
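A stripped-down sketch of the shape of it (names and sizing numbers are made up; the real implementation recalculates the per-entry estimate every X puts and runs eviction asynchronously):

    import java.util.Iterator;
    import java.util.LinkedHashMap;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    // Access-ordered LinkedHashMap guarded by a ReentrantReadWriteLock; eviction is
    // driven by an estimated per-entry memory cost rather than by the GC.
    class MemoryBoundedCache<K, V> {
        private final LinkedHashMap<K, V> map =
                new LinkedHashMap<K, V>(16, 0.75f, true); // true = access order (LRU)
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private final long maxBytes;       // the designated memory budget
        private final long bytesPerEntry;  // mean cost; recomputed periodically in the real thing

        MemoryBoundedCache(long maxBytes, long estimatedBytesPerEntry) {
            this.maxBytes = maxBytes;
            this.bytesPerEntry = estimatedBytesPerEntry;
        }

        V get(K key) {
            lock.writeLock().lock(); // write lock: access order mutates the map
            try {
                return map.get(key);
            } finally {
                lock.writeLock().unlock();
            }
        }

        void put(K key, V value) {
            lock.writeLock().lock();
            try {
                map.put(key, value);
                evictIfOverBudget(); // the real implementation triggers this asynchronously
            } finally {
                lock.writeLock().unlock();
            }
        }

        // Evict least-recently-used entries while the estimated footprint exceeds the budget.
        private void evictIfOverBudget() {
            Iterator<K> eldest = map.keySet().iterator();
            while (map.size() * bytesPerEntry > maxBytes && eldest.hasNext()) {
                eldest.next();
                eldest.remove();
            }
        }
    }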

adrian.tarau