views:

110

answers:

3

I'm implementing something like a cache, which works like this:

  1. If a new value for the given key arrives from some external process, store that value, and remember the time when this value arrived.
  2. If we are idle, find the oldest entry in the cache, fetch the new value for the key from external source and update the cache.
  3. Return the value for the given key when asked.

I need a data structure to store key-value pairs which would allow to perform the following operations as fast as possible (in the order of speed priority):

  1. Find the key with the lowest (unknown) value.
  2. Update a value for the given key or add a new key-value pair if the key does not exist.
  3. Other regular hash-table operations, like delete a key, check if a key exists, etc.

Are there any data-structures which allow this? The problem here is that to perform the first query quickly I need something value-ordered and to update the values for the given key quickly I need something key-ordered. The best solution I have so far is something like this:

Store values an a regular hashtable, and pairs of (value, key) as a value-ordered heap. Finding the key for the lowest value goes like this:

  1. Find the key for the lowest value on the heap.
  2. Find the value for that key from the hashtable.
  3. If the values don't match pop the value from the heap and repeat from step 1.

Updating the values goes like this:

  1. Store the value in the hashtable.
  2. Push the new (value, key) pair to the heap.

Deleting a key is more tricky and requires searching for the value in the heap. This gives something like O(log n) performance, but this solution seems to be cumbersome to me.

Are there any data structures which combine the properties of a hashtable for keys and a heap for the associated values? I'm programming in Python, so if there are existing implementations in Python, it is a big plus.

A: 

You're looking for a Map, or an associative array. To get more specific, we'd need to know what language you're trying to use.

Brian S
How would a map give an answer to the first question really fast?
abbot
+1  A: 

Most heap implementations will get you the lowest key in your collection in O(1) time, but there's no guarantees regarding the speed of random lookups or removal. I'd recommend pairing up two data structures: any simple heap implementation and any out-of-the-box hashtable.

Of course, any balanced binary tree can be used as a heap, since the smallest and largest values are on the left-most and right-most leaves respectively. Red-black tree or AVL tree should give you O(lg n) heap and dictionary operations.

Juliet
I don't understand how a paired data structure would help. Consider I'm updating a value for some known key. This operation is fast for the hashtable, but the heap is value-ordered, not key-ordered, so the lookup will be something like O(n). I can't understand how a tree would help either. The problem here is that I need something value-ordered for answers to question 1 and something key-ordered at the same time for updates.
abbot
A: 

I'd try:

import heapq

myheap = []
mydict = {}

...

def push(key, val):
    heapq.heappush(myheap, (val, key))
    mydict[key] = val

def pop():
    ...

More info here

pillmuncher