views: 611
answers: 6
I am looking for a data structure that works like a hash table but has a size limit. When the number of items in the table reaches that limit, a culling function should be called to evict the least-retrieved key/value pairs.

Here's some pseudocode of what I'm working on:

class MyClass {
  private Map<Integer, Integer> cache = new HashMap<Integer, Integer>();
  public int myFunc(int n) {
    if(cache.containsKey(n))
      return cache.get(n);
    int next = . . . ; //some complicated math.  guaranteed next != n.
    int ret = 1 + myFunc(next);
    cache.put(n, ret);
    return ret;
  }
}

What happens is that there are some values of n for which myFunc() will be called lots of times, but many other values of n which will only be computed once. So the cache could fill up with millions of values that are never needed again. I'd like to have a way for the cache to automatically remove elements that are not frequently retrieved.

This feels like a problem that must be solved already, but I'm not sure what the data structure is that I would use to do it efficiently. Can anyone point me in the right direction?


Update: I knew this had to be an already-solved problem. It's called an LRU cache, and it's easy to build by extending LinkedHashMap. Here is the code that incorporates the solution:

class MyClass {
  private static final int SIZE_LIMIT = 1000;
  // accessOrder = true orders entries from least- to most-recently used,
  // so the "eldest" entry is the one retrieved least recently.
  private final Map<Integer, Integer> cache =
    new LinkedHashMap<Integer, Integer>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest)
      {
        return size() > SIZE_LIMIT;
      }
  };
  public int myFunc(int n) {
    if (cache.containsKey(n))
      return cache.get(n);
    int next = . . . ; // some complicated math; guaranteed next != n
    int ret = 1 + myFunc(next);
    cache.put(n, ret);
    return ret;
  }
}
A: 

Take a look at WeakHashMap

Dev er dev
That's not exactly what I want. From what I understand, a weak hash map just means that a key's presence in the map does not count as a reference to the key object, so the garbage collector is free to remove the entry. Since I'm using Integers as keys, there's no way to be sure this does what I want.
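To illustrate the problem with Integer keys: boxed values from -128 to 127 come from a shared cache that the JVM holds strongly forever, so a WeakHashMap can never evict those entries, while larger boxed keys are only weakly reachable through the map and may disappear at any garbage collection. A minimal sketch (class name is mine):

```java
import java.util.Map;
import java.util.WeakHashMap;

class WeakCacheDemo {
    public static void main(String[] args) {
        Map<Integer, String> cache = new WeakHashMap<>();
        cache.put(42, "small");        // 42 comes from the shared Integer cache
        cache.put(1_000_000, "large"); // a fresh Integer, held only weakly
        System.gc();                   // just a hint to the collector
        System.out.println(cache.containsKey(42));        // always true
        System.out.println(cache.containsKey(1_000_000)); // JVM-dependent
    }
}
```

So with small keys the map never shrinks, and with large keys it shrinks unpredictably; neither behavior tracks how often a value is retrieved.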
Kip
+1  A: 

WeakHashMap will probably not do what you expect it to... read the documentation carefully and make sure you know exactly what you are getting from weak and strong references.

I would recommend you have a look at java.util.LinkedHashMap and override its removeEldestEntry method to maintain your cache. If your math is very resource intensive, you might also want to move entries to the front whenever they are used, to ensure that only unused entries fall to the end of the map.
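LinkedHashMap can do that bookkeeping itself: constructed with accessOrder = true, every get() moves the entry to the tail, so iteration (and the "eldest" entry) runs from least- to most-recently used. A minimal sketch (class name is mine):

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AccessOrderDemo {
    public static void main(String[] args) {
        // accessOrder = true: iteration order is least- to most-recently accessed
        Map<Integer, String> m = new LinkedHashMap<>(16, 0.75f, true);
        m.put(1, "one");
        m.put(2, "two");
        m.put(3, "three");
        m.get(1);                       // touching 1 moves it to the tail
        System.out.println(m.keySet()); // prints [2, 3, 1]
    }
}
```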

kasperjj
+3  A: 

Googling "LRU map" and "I'm feeling lucky" gives you this:

http://commons.apache.org/collections/apidocs/org/apache/commons/collections/map/LRUMap.html

A Map implementation with a fixed maximum size which removes the least recently used entry if an entry is added when full.

Sounds pretty much spot on :)

Paul
Thanks, it's one of those cases where the answer is really easy to find once you already know it's called an "LRU map"; but if you already knew that, you wouldn't need to search. :)
Kip
Yep, sorry, wasn't trying to be snarky. I hit "I feel lucky" by accident, too :)
Paul
+14  A: 

You are looking for an LRUList/Map. Check out LinkedHashMap:

The removeEldestEntry(Map.Entry) method may be overridden to impose a policy for removing stale mappings automatically when new mappings are added to the map.

ReneS
You beat me to it...
Michael Myers
Thanks, this was precisely what I wanted!
Kip
You are welcome and it is a JDK class, so no external dependencies.
ReneS
A: 

You probably want a Least Recently Used (LRU) eviction policy for your map. There's a simple way to build one on top of LinkedHashMap:

http://www.roseindia.net/java/example/java/util/LRUCacheExample.shtml

albertb
+1  A: 

The Adaptive Replacement Cache (ARC) policy is designed to keep one-time requests from polluting your cache. It may be fancier than you need, but it directly addresses your concern about the cache filling up with values that are never needed again.
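ARC itself is fairly involved, but the core idea can be sketched with a much simpler two-segment scheme (a simplification in the spirit of ARC, not the real algorithm; all names here are mine): new keys land in a small probationary LRU, and only keys that are retrieved again get promoted to a protected segment, so one-shot values are evicted without displacing the hot entries.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SegmentedCache<K, V> {
    private final int probationSize;
    private final int protectedSize;
    private final Map<K, V> probation; // new entries start here (plain LRU)
    private final Map<K, V> hot;       // entries retrieved again live here

    SegmentedCache(int probationSize, int protectedSize) {
        this.probationSize = probationSize;
        this.protectedSize = protectedSize;
        this.probation = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > SegmentedCache.this.probationSize;
            }
        };
        this.hot = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > SegmentedCache.this.protectedSize;
            }
        };
    }

    V get(K key) {
        V v = hot.get(key);
        if (v != null) return v;
        v = probation.remove(key);
        if (v != null) hot.put(key, v); // retrieved again: promote
        return v;
    }

    void put(K key, V value) {
        if (hot.containsKey(key)) hot.put(key, value);
        else probation.put(key, value);
    }

    boolean contains(K key) {
        return hot.containsKey(key) || probation.containsKey(key);
    }

    public static void main(String[] args) {
        SegmentedCache<String, Integer> c = new SegmentedCache<>(2, 2);
        c.put("a", 1);
        c.get("a");              // retrieving "a" promotes it to the hot segment
        c.put("b", 2);
        c.put("c", 3);
        c.put("d", 4);           // probation holds 2 entries, so "b" is evicted
        System.out.println(c.contains("a")); // true
        System.out.println(c.contains("b")); // false
    }
}
```

A flood of one-time keys can only ever churn the small probation segment; the frequently retrieved values in the hot segment are untouched.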

Darius Bacon