I often need relatively small caches (<10,000 entries, <1 kB) to speed up calculations. My usual code looks like this:

cache = {}
def calculate_caches(parms):
    if parms not in cache:
        cache[parms] = calculate(parms)
    return cache[parms]

This works fine, but for longer-running processes I'm afraid the cache will grow without bound. So I often implement brute-force memory clamping like this:

if len(cache) > 1000:
    cache.clear()  # clear in place; reassigning cache inside calculate_caches would need a global declaration

This works reasonably well in most cases and is still clean, simple code. But if I want a real LRU strategy, I need to store a timestamp together with each cache entry. The problem with using a dict for this is that expiring the cache then means traversing the whole cache, which is neither elegant nor efficient.

import time

cache = {}

def calculate_caches(parms):
    if parms not in cache:
        # Store the insertion timestamp alongside the computed value.
        cache[parms] = (time.time(), calculate(parms))
    expire()
    return cache[parms][1]

def expire():
    if len(cache) > 1000:
        # Finding the oldest entry requires traversing the whole cache.
        mintime = time.time()
        time2key = {}
        for key, (timestamp, val) in cache.items():
            mintime = min(mintime, timestamp)
            time2key[timestamp] = key
        if mintime in time2key:
            del cache[time2key[mintime]]

Are there preferable approaches or data structures for implementing ad-hoc caching?

My problem is quite similar to this question, but I don't need the list to be sorted by time and I don't want duplicates.

+1  A: 

A very simple way to do this without timestamps is to keep an ordered dictionary with the MRU entry at the end: when a request for the same object comes in a second time, delete it and add it back at the end of the dict. When you need to expire entries, just remove a slice of size X from the beginning of the ordered dict whenever its size exceeds the limit.

Efficiency would now depend on how that ordered dict is implemented.
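
In Python, collections.OrderedDict can serve as that ordered dict. Here is a minimal sketch of the idea, reusing calculate() and the 1000-entry limit from the question; everything else is illustrative:

from collections import OrderedDict

MAXSIZE = 1000  # assumed limit, mirroring the question's clamp
cache = OrderedDict()

def calculate_caches(parms):
    if parms in cache:
        # Re-insert on a hit so the least recently used entries stay at the front.
        cache[parms] = cache.pop(parms)
    else:
        cache[parms] = calculate(parms)
        if len(cache) > MAXSIZE:
            # Evict the oldest (front) entry; last=False pops from the front.
            cache.popitem(last=False)
    return cache[parms]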

Vinko Vrsalovic
A: 

I doubt there's a silver bullet for this; the optimal strategy depends strongly on the cost of cache misses and on the temporal distribution of the parameters to your calculation.

Garbage collection methods might give you some inspiration. If you think of your cache as the heap and of cache hits as references, then you have the problem of efficiently collecting cached results with a low (but non-zero) number of hits. The problem is far more forgiving than GC, because anything you evict can simply be recalculated.

A refinement of your method in this vein would be to introduce an additional cache for frequently hit parameters. Add a counter to each cached value that is incremented on every cache hit; when it passes some threshold, the value is promoted to the additional cache. Both generations of cache can be size-clamped, so you would still have a hard limit on memory use. Whether the (possible) reduction in cache misses justifies the overhead (lookups in two caches, hit counters, copying, etc.) is an empirical question...
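
A rough sketch of that two-generation idea, reusing calculate() from the question; the thresholds, names, and promotion rule below are illustrative assumptions rather than anything prescribed here:

PROMOTE_AFTER = 5     # assumed hit-count threshold for promotion
YOUNG_LIMIT = 1000    # assumed size clamp for the first generation
HOT_LIMIT = 200       # assumed size clamp for the promoted generation

young = {}  # parms -> [hit_count, value]
hot = {}    # parms -> value (frequently hit entries)

def calculate_cached(parms):
    if parms in hot:
        return hot[parms]
    if parms in young:
        entry = young[parms]
        entry[0] += 1
        if entry[0] >= PROMOTE_AFTER and len(hot) < HOT_LIMIT:
            # Promote a frequently hit value to the second generation.
            hot[parms] = entry[1]
            del young[parms]
            return hot[parms]
        return entry[1]
    # Cache miss: compute, then store in the first generation,
    # clamping its size with the question's brute-force approach.
    value = calculate(parms)
    if len(young) >= YOUNG_LIMIT:
        young.clear()
    young[parms] = [0, value]
    return value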

othercriteria