views: 137
answers: 2

I'm moving a Google App Engine web application out of "the cloud" to a standard web framework (webpy), and I would like to know how to implement a memcache-style feature like the one available on GAE.

In my app I just use this cache to store a bunch of data retrieved from a remote API every X hours; in other words, I don't stress this cache too much.

I've naively implemented something like this:

from datetime import datetime, timedelta

class TinyCache:
    class _Container:
        def __init__(self, value, seconds):
            self.value = value
            self.cache_age = datetime.now()
            self.cache_time = timedelta(seconds=seconds)

        def is_stale(self):
            return self.cache_age + self.cache_time < datetime.now()

    def __init__(self):
        self.dict_cache = {}

    def add(self, key, value, seconds=7200):
        self.dict_cache[key] = self._Container(value, seconds)

    def get(self, key):
        container = self.dict_cache.get(key)
        if container is None:
            return None
        if container.is_stale():
            del self.dict_cache[key]
            return None
        return container.value

A typical usage would be:

data = tinycache.get("remote_api_data")
if data is not None:
    return data
else:
    data = self.api_call()
    tinycache.add("remote_api_data", data, 7200)
    return data

How could I improve it?
Do I need to make it Thread-Safe?

A: 

In my app I just use this cache to store a bunch of data retrieved from a remote api every X hours; in other words I don't stress this cache too much.

How could I improve it?

If your code works for you, why bother?

However, as you explicitly asked for comments, I'll try to add my ideas anyway. To me it sounds like you could use traditional storage such as files or a database, since the data is only refreshed periodically. In many cases the expensive part is some (potentially costly) preprocessing, so you might be able to do the work once and store the result in a form that makes access/delivery to the client fast.

Advantages:

  • simple
  • no issues with multiple processes (e.g. FastCGI)
  • reduced memory footprint
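A minimal sketch of the file-based approach, using the file's modification time as the cache age (the path, TTL, and function names here are illustrative, not from the original):

```python
import os
import pickle
import time

CACHE_FILE = "/tmp/api_cache.pkl"  # hypothetical location
MAX_AGE = 7200                     # seconds before the cache is considered stale

def get_cached_data(fetch):
    """Return cached data if fresh enough, otherwise call fetch() and store the result."""
    if os.path.exists(CACHE_FILE):
        age = time.time() - os.path.getmtime(CACHE_FILE)
        if age < MAX_AGE:
            with open(CACHE_FILE, "rb") as f:
                return pickle.load(f)
    data = fetch()
    with open(CACHE_FILE, "wb") as f:
        pickle.dump(data, f)
    return data
```

Because the file's mtime carries the timestamp, nothing extra needs to be serialized, and the cache survives process restarts, which also sidesteps the multiple-process issue mentioned above.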

Do I need to make it Thread-Safe?

That really depends on your usage pattern. However, judging from your API, it's probably not necessary: the worst case is that a value gets computed twice.
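If the cache does end up shared across threads, a single lock is usually enough. A hedged sketch of the idiom (the names are illustrative); note the lock is released while the value is computed, so at worst two threads compute it concurrently, which matches the "computed twice" worst case:

```python
import threading

_lock = threading.Lock()
_cache = {}

def cached_call(key, compute):
    """Look up key under the lock; on a miss, compute outside the lock and store."""
    with _lock:
        if key in _cache:
            return _cache[key]
    value = compute()  # potentially slow; done without holding the lock
    with _lock:
        # setdefault keeps the first stored value if another thread won the race
        _cache.setdefault(key, value)
        return _cache[key]
```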

Felix Schwarz
+2  A: 

It seems to me that your cache can grow inefficiently, since it keeps entries that are rarely used: an entry is only removed when a get is requested for that specific key and the entry turns out to be stale.

If you want to improve your cache I'd add the following two simple features:

  1. When an item is requested, reset its seconds back to the initial value, so that the elements your system uses often stay cached.
  2. Run a mechanism in a separate thread that traverses the cache and deletes entries that are too old.
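Both ideas could be sketched on top of the question's structure like this (class and method names are illustrative; the lock is there because purge is meant to run from another thread):

```python
import threading
from datetime import datetime, timedelta

class RefreshingCache:
    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (value, ttl_seconds, expiry datetime)

    def add(self, key, value, seconds=7200):
        with self._lock:
            self._data[key] = (value, seconds, datetime.now() + timedelta(seconds=seconds))

    def get(self, key):
        with self._lock:
            entry = self._data.get(key)
            if entry is None:
                return None
            value, ttl, expires = entry
            if expires < datetime.now():
                del self._data[key]
                return None
            # idea 1: a hit pushes the expiry back to the full TTL
            self._data[key] = (value, ttl, datetime.now() + timedelta(seconds=ttl))
            return value

    def purge(self):
        """Idea 2: drop every stale entry; schedule periodically from another thread,
        e.g. threading.Timer(60, cache.purge).start()"""
        with self._lock:
            now = datetime.now()
            for key in [k for k, (_, _, exp) in self._data.items() if exp < now]:
                del self._data[key]
```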

You can also get some ideas from this Fixed size cache recipe.

Edited

I just found this recipe; it's super-cool. Basically, you can wrap the logic you want to cache with function decorators. Something like:

@lru_cache(maxsize=20)
def my_expensive_function(x, y):
    # my expensive logic here
    return result

These LRU and LFU cache decorators implement the caching logic for you: Least Recently Used (LRU) or Least Frequently Used (LFU) (see Cache_algorithms for a reference on these).
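Since the asker needs time-based expiry rather than LRU eviction, here is a hedged sketch of a TTL decorator in the same spirit (this is illustrative, not the recipe's own code, and only supports hashable positional arguments):

```python
import functools
import time

def ttl_cache(seconds=7200):
    """Decorator: cache a function's results and recompute them after `seconds`."""
    def decorator(func):
        store = {}  # args tuple -> (result, timestamp)

        @functools.wraps(func)
        def wrapper(*args):
            now = time.time()
            hit = store.get(args)
            if hit is not None and now - hit[1] < seconds:
                return hit[0]
            result = func(*args)
            store[args] = (result, now)
            return result
        return wrapper
    return decorator
```

Usage mirrors the lru_cache example: decorate the expensive function with `@ttl_cache(seconds=7200)` and call it normally.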

msalvadores
@msalvadores nice points, thanks
systempuntoout
@systempuntoout just edited with another solution based on function decorators.
msalvadores
@msalvadores Actually, I already have an LRU cache in my project. What I really need is a cache whose items expire with time.
systempuntoout