Something I'm curious about: what is the most efficient way to cache the generation of, say, an RSS feed, or an API response (like the response to /api/films/info/a12345)?

For example, should I cache the entire feed and try to return that? As pseudo-code:

id = GET_PARAMS['id']
cached = memcache.get("feed_%s" % id)
if cached is not None:
    return cached
else:
    feed = generate_feed(id)
    memcache.put("feed_%s" % id, feed)
    return feed

Or cache the queries result, and generate the document each time?

id = sanitise(GET_PARAMS['id'])
query = query("SELECT title, body FROM posts WHERE id=%%", id)

cached_query_result = memcache.get(query.hash())
if cached_query_result:
    feed = generate_feed(cached_query_result)
    return feed
else:
    query_result = query.execute()
    memcache.put("feed_%s" % id, query_result)
    feed = generate_feed(query_result)
    return feed

(Or, some other way I'm missing?)

+1  A: 

Depends on the usage pattern, but all things being equal I'd vote for the first way, because you only do the work of generating the feed once instead of on every request.
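A minimal sketch of the first way, assuming a plain dict stands in for memcache and generate_feed is a hypothetical stand-in for the expensive rendering step (neither is a real memcache client call):

```python
# Sketch only: `cache` is a dict standing in for memcache, and
# `generate_feed` is a made-up placeholder for the real rendering code.
cache = {}
render_count = 0  # counts how often the expensive step actually runs


def generate_feed(feed_id):
    global render_count
    render_count += 1
    return "<rss><title>feed %s</title></rss>" % feed_id


def get_feed(feed_id):
    key = "feed_%s" % feed_id
    cached = cache.get(key)
    if cached is not None:
        return cached          # cache hit: no rendering work at all
    feed = generate_feed(feed_id)
    cache[key] = feed          # store the finished document
    return feed


first = get_feed("a12345")
second = get_feed("a12345")    # served from the cache, no second render
```

The second call returns the stored document without touching generate_feed, which is the whole appeal of caching the finished output.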

Kyle Boon
+1  A: 

It really depends on what your app does. The only way to answer this is to get some performance numbers from your existing app. Then you can find the code that takes the most time and work on improving that.
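One rough way to get such numbers is to time the query step and the generation step separately. This is only a sketch: run_query and generate_feed are hypothetical stand-ins, and the sleep just simulates a slow database.

```python
import time


def run_query(post_id):
    # Hypothetical stand-in for the database call; the sleep fakes latency.
    time.sleep(0.01)
    return [("Title %s" % post_id, "Body")]


def generate_feed(rows):
    # Hypothetical stand-in for rendering the document from the rows.
    return "".join("<item>%s</item>" % title for title, _ in rows)


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


rows, query_seconds = timed(run_query, "a12345")
feed, render_seconds = timed(generate_feed, rows)
print("query: %.4fs, render: %.4fs" % (query_seconds, render_seconds))
```

Whichever step dominates in the real app is the one worth caching first.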

David
+2  A: 

In my experience, you should use multiple levels of cache. Implement both of your solutions, provided that this isn't the only code that uses "SELECT title, body FROM posts WHERE id=%%". If it is, use only the first one.
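A rough sketch of the two levels together, assuming a dict in place of memcache and made-up run_query and render_feed helpers. The inner level caches the query result (which other code could share), the outer level caches the rendered feed:

```python
# Sketch only: `cache` stands in for memcache; `run_query` and
# `render_feed` are hypothetical placeholders for the real work.
cache = {}
calls = {"query": 0, "render": 0}


def run_query(post_id):
    calls["query"] += 1
    return [("Title %s" % post_id, "Body text")]


def render_feed(rows):
    calls["render"] += 1
    item = "<item><title>%s</title><description>%s</description></item>"
    return "".join(item % row for row in rows)


def get_rows(post_id):
    # Level 1: the query result, reusable by any code needing these rows.
    key = "posts-title-body-%s" % post_id
    rows = cache.get(key)
    if rows is None:
        rows = run_query(post_id)
        cache[key] = rows
    return rows


def get_feed(post_id):
    # Level 2: the finished document.
    key = "feed_%s" % post_id
    feed = cache.get(key)
    if feed is None:
        feed = render_feed(get_rows(post_id))
        cache[key] = feed
    return feed


get_feed("42")
get_feed("42")            # both levels hit, nothing recomputed
del cache["feed_42"]      # simulate the rendered feed expiring
get_feed("42")            # re-renders, but the query result is still cached
```

When only the rendered feed expires, the feed is rebuilt from the cached rows without going back to the database.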

In the second version of the code, you call memcache.get(query.hash()) but then memcache.put("feed_%s" % id, query_result). The read key and the write key differ, so this probably won't work the way you want it to (unless you have an unusual version of hash() ;) ).

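I would avoid query.hash(). It's better to use a readable key, something like posts-title-body-%id, so that you can rebuild the key and delete the entry when the data changes. Try deleting a video when it's stored in the cache under query.hash(): it can hang around there for months as a zombie video.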
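To illustrate why a predictable key helps, here is a small sketch, assuming a dict as the cache and a dict named posts as a pretend database (both hypothetical). Because the key can be rebuilt from the id alone, the delete path can evict the entry:

```python
# Sketch only: `cache` stands in for memcache and `posts` for the database.
cache = {}
posts = {"7": ("Some film", "Description")}


def post_key(post_id):
    # Readable key that any code path can rebuild from the id.
    return "posts-title-body-%s" % post_id


def get_post(post_id):
    key = post_key(post_id)
    if key in cache:
        return cache[key]
    row = posts.get(post_id)
    if row is not None:
        cache[key] = row
    return row


def delete_post(post_id):
    posts.pop(post_id, None)
    # Evict the cached copy as well, so no zombie entry survives.
    cache.pop(post_key(post_id), None)


get_post("7")                 # fills the cache
delete_post("7")              # removes the row and its cache entry
after_delete = get_post("7")  # nothing left to serve
```

With an opaque hash of the query as the key, delete_post would have no obvious way to work out which entry to evict.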

By the way:

id = GET_PARAMS['id']
query = query("SELECT title, body FROM posts WHERE id=%%", id)

You take something from GET and put it straight into the SQL query? That's bad (it opens you up to SQL injection attacks).
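For comparison, a sketch of passing the value as a bound parameter, using Python's built-in sqlite3 module with an in-memory database. The table layout and sample row are invented for the example:

```python
import sqlite3

# In-memory database with a made-up posts table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id TEXT PRIMARY KEY, title TEXT, body TEXT)")
conn.execute(
    "INSERT INTO posts VALUES (?, ?, ?)",
    ("a12345", "A film", "Plot summary"),
)


def fetch_post(post_id):
    # The ? placeholder sends post_id as data, never as part of the SQL text.
    cur = conn.execute(
        "SELECT title, body FROM posts WHERE id = ?", (post_id,)
    )
    return cur.fetchall()


rows = fetch_post("a12345")
# A hostile value is compared literally against id and matches nothing.
hostile = fetch_post("a12345' OR '1'='1")
```

Other database drivers use different placeholder styles, but the idea of keeping the value out of the query string is the same.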

Reef
"Implement both" is a good point. The code was Python-looking pseudo-code that I made up for the question. If the code were real, obviously id would be validated and the query() function would properly escape the parameter.
dbr
Still, posting this might confuse newbies. Always better to add a sanitize(GET_PARAMS['id']) or something. Good question, btw.
Reef
+1  A: 

As others have suggested here, I'd profile your code and work out which part of the operation is the slowest or most expensive.
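A possible starting point with the standard library profiler, assuming hypothetical run_query, generate_feed and handle_request functions in place of the real request handler:

```python
import cProfile
import io
import pstats
import time


def run_query(post_id):
    # Hypothetical stand-in for the database call; the sleep fakes latency.
    time.sleep(0.01)
    return [("Title %s" % post_id, "Body")]


def generate_feed(rows):
    # Hypothetical stand-in for rendering the feed.
    return "".join("<item>%s</item>" % title for title, _ in rows)


def handle_request(post_id):
    return generate_feed(run_query(post_id))


profiler = cProfile.Profile()
profiler.enable()
handle_request("a12345")
profiler.disable()

# Collect the five entries with the highest cumulative time into a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The functions at the top of the cumulative listing are the candidates worth caching.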

James C