I've got a simple function to get some additional data based on request.user:

def getIsland(request):
    try:
        island = Island.objects.get(user=request.user) # Retrieve
    except Island.DoesNotExist:
        island = Island(user=request.user) # Doesn't exist, create default one
        island.save()
    island.update() # Run scheduled tasks
    return island # Return

The problem is that the function gets called in many different places (middleware, templates, views, etc.) and so it executes the query many times per request. Is there any way to avoid that? For example:

def getIsland(request):
    if HasBeenEvaluatedAlreadyOnThisRequest: return cached
    else:
        [...]
+1  A: 

Quick and dirty:

def getIsland(request):
    if hasattr(request, "_cached_island"):
        return request._cached_island
    try:
        island = Island.objects.get(user=request.user) # Retrieve
    except Island.DoesNotExist:
        island = Island(user=request.user) # Doesn't exist, create default one
        island.save()
    island.update() # Run scheduled tasks
    request._cached_island = island
    return island # Return
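
If several functions need the same treatment, the attribute check can be pulled out into a decorator. This is only a sketch: `cache_on_request` is a made-up helper name, the `SimpleNamespace` objects stand in for Django's request, and the `calls` list stands in for the database query.

```python
import functools
from types import SimpleNamespace

def cache_on_request(func):
    """Memoize func(request) on the request object itself (hypothetical helper)."""
    attr = "_cached_" + func.__name__

    @functools.wraps(func)
    def wrapper(request):
        # Only run the wrapped function the first time this request asks.
        if not hasattr(request, attr):
            setattr(request, attr, func(request))
        return getattr(request, attr)
    return wrapper

calls = []  # records every simulated database hit

@cache_on_request
def get_island(request):
    calls.append(request.user)  # stands in for the Island query
    return {"owner": request.user}

request = SimpleNamespace(user="alice")  # stand-in for an HttpRequest
first = get_island(request)
second = get_island(request)  # served from the attribute, no second "query"

other = SimpleNamespace(user="alice")  # a new request starts with no cache
get_island(other)
```

The cached value lives and dies with the request object, so nothing carries over to the next request.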
Ned Batchelder
This is bad, bad, bad: no shared state between processes, not even between requests.
Swizec Teller
I'm not sure what you are getting at. This doesn't create any shared state between requests. And besides, caching is inherently shared state; the whole point is to share costly results. Perhaps you could elaborate?
Ned Batchelder
Well, you seem to be adding a property to the request object. But the request object is new whenever somebody makes a new request. The request object also isn't shared between different servers in a distributed environment. Therefore a much better way of doing this is to use Django's cache backend with memcached or something similar.
Swizec Teller
A: 

If you have multiple processes running, or multiple computers hitting the same database, then an in-process cache cannot share results between them, so each one will still run its own queries.

One thing you can try is using thread-local storage to hold a global "cache" keyed by user. As an example:

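import threading

class UserStorage(threading.local):
    def __init__(self):
        # threading.local runs __init__ once per thread, so every thread
        # gets its own dict (a class-level dict would be shared by all threads)
        self.store = {}

    def getIsland(self, request):
        user_id = request.user.pk
        island = self.store.get(user_id)
        if island is None:
            island, created = Island.objects.get_or_create(user=request.user)
            self.store[user_id] = island
        island.update()
        return island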

However, you may notice that the cached Island object WILL NEVER BE REFRESHED from the database: once it is in the store, changes made by other threads or processes are not seen. Therefore, you must proceed with extreme caution. You may need a global timeout for these objects, but then you are implementing your own cache solution, so why not use Django's cache system with memcached or its local-memory cache backend?
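
To make the timeout idea concrete, here is a rough sketch of a thread-local store whose entries expire. Everything in it is hypothetical (`ExpiringLocalStore`, `load_island`, the `now` parameter used to fake the clock), and it uses plain strings instead of Island objects so that it runs without Django.

```python
import threading
import time

class ExpiringLocalStore(threading.local):
    """Per-thread store whose entries expire after max_age seconds (hypothetical)."""

    def __init__(self, max_age=60.0):
        # threading.local runs __init__ once in every thread that touches the instance
        self.max_age = max_age
        self.entries = {}

    def get(self, key, load, now=None):
        now = time.monotonic() if now is None else now
        hit = self.entries.get(key)
        if hit is not None and now - hit[0] < self.max_age:
            return hit[1]  # still fresh, skip the load
        value = load()  # missing or expired: reload and remember when
        self.entries[key] = (now, value)
        return value

loads = []  # one entry per simulated database query

def load_island():
    loads.append(1)
    return "island-%d" % len(loads)

store = ExpiringLocalStore(max_age=60.0)
a = store.get(1, load_island, now=0.0)   # miss: loads
b = store.get(1, load_island, now=30.0)  # hit: reuses the first value
c = store.get(1, load_island, now=61.0)  # expired: loads again

# Another thread starts with an empty store, so it has to load for itself.
seen = []
t = threading.Thread(target=lambda: seen.append(store.get(1, load_island, now=30.0)))
t.start()
t.join()
```

Even in this small form it is already a second cache implementation to maintain, and it still shares nothing between threads or processes.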

Mike Axiak
+1  A: 

Have you tried using the cache?

Django has a wonderful cache system: http://docs.djangoproject.com/en/dev/topics/cache/

This would make your function look something like so:

from django.core.cache import cache

def getIsland(request):
    key = "island_%s" % request.user.pk # Cache keys must be strings
    island = cache.get(key)
    if island is None:
        try:
            island = Island.objects.get(user=request.user) # Retrieve
        except Island.DoesNotExist:
            island = Island(user=request.user) # Doesn't exist, create default one
            island.save()
        island.update() # Run scheduled tasks
        cache.set(key, island, 60) # Keep for 60 seconds
    return island # Return

Cached values get serialised and deserialised (Django's cache framework pickles what you store), so what you read back is a copy, but that's the general gist of it. The benefit is that the result of your query is now stored in RAM for a fixed number of seconds (60 in this example) and, with a shared backend such as memcached, it doesn't matter which specific process accesses it. It's always there. Available for everyone.
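
To illustrate the serialisation point without a running cache server, here is a toy stand-in. `PickleCache` and `island_key` are invented for this sketch and are not Django's API; the dict stands in for an Island instance, and `now` fakes the clock.

```python
import pickle
import time

class PickleCache:
    """Toy stand-in for a shared cache: pickles on set, unpickles on get."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, timeout, now=None):
        now = time.time() if now is None else now
        self._data[key] = (now + timeout, pickle.dumps(value))

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._data.get(key)
        if entry is None or entry[0] <= now:
            return None  # missing or expired
        return pickle.loads(entry[1])

def island_key(user_id):
    # Build the key from the user's primary key, not from the user object
    return "island_%s" % user_id

cache = PickleCache()
original = {"user_id": 7, "gold": 10}  # stand-in for an Island instance
cache.set(island_key(7), original, timeout=60, now=0)

copy = cache.get(island_key(7), now=10)
original["gold"] = 99  # a later change does not reach the cached copy
fresh = cache.get(island_key(7), now=10)
expired = cache.get(island_key(7), now=61)
```

The practical consequence is that changes made to an object after it was cached only show up once the entry expires or is set again.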

Swizec Teller