What's the most efficient way to build a cache with arbitrary Ruby objects as keys that are expired based on a least recently used algorithm. It should use Ruby's normal hashing semantics (not equal?)
This pushes the boundaries of my understanding of how Ruby uses memory, but I suspect that the most efficient implementation would be a doubly-linked list where every access moves the key to the front of the list, and every insert drops an item if the maximum size has been reached.
However, assuming Ruby's Hash
class is already very efficient, I'd bet that the somewhat naive solution of simply adding age data to a Hash
would be fairly good. Here's a quick toy example that does this:
class Cache
attr_accessor :max_size
def initialize(max_size = 4)
@data = {}
@max_size = max_size
end
def store(key, value)
@data.store key, [0, value]
age_keys
prune
end
def read(key)
if value = @data[key]
renew(key)
age_keys
end
value
end
private # -------------------------------
def renew(key)
@data[key][0] = 0
end
def delete_oldest
m = @data.values.map{ |v| v[0] }.max
@data.reject!{ |k,v| v[0] == m }
end
def age_keys
@data.each{ |k,v| @data[k][0] += 1 }
end
def prune
delete_oldest if @data.size > @max_size
end
end
There's probably a faster way of finding the oldest item, and this is not thoroughly tested, but I'd be curious to know how anyone thinks this compares to a more sophisticated design, linked list or otherwise.
Remaze has a reasonably well tested LRU Cache: See http://github.com/manveru/ramaze/blob/master/lib/ramaze/snippets/ramaze/lru%5Fhash.rb
And there is also this: http://github.com/rubyworks/lrucache/blob/master/lib/lrucache.rb which should be more efficient than the remaze one for large caches.