This question concerns two implementations of counters which are intended to scale without sharding (with a tradeoff that they might under-count in some situations):
- http://appengine-cookbook.appspot.com/recipe/high-concurrency-counters-without-sharding/ (the code in the comments)
- http://blog.notdot.net/2010/04/High-concurrency-counters-without-sharding
My questions:
- With respect to #1: Running
memcache.decr()
in a deferred, transactional task seems like overkill. Ifmemcache.decr()
is done outside the transaction, I think the worst-case is the transaction fails and we miss counting whatever we decremented. Am I overlooking some other problem that could occur by doing this? - What are the significiant tradeoffs between the two implementations?
Here are the tradeoffs I see:
- #2 does not require datastore transactions.
- To get the counter's value, #2 requires a datastore fetch while with #1 typically only needs to do a
memcache.get()
andmemcache.add()
. - When incrementing a counter, both call
memcache.incr()
. Periodically, #2 adds a task to the task queue while #1 transactionally performs a datastore get and put. #1 also always performsmemcache.add()
(to test whether it is time to persist the counter to the datastore).
Conclusions
(without actually running any performance tests):
- #1 should typically be faster at retrieving a counter (#1 memcache vs #2 datastore). Though #1 has to perform an extra
memcache.add()
too. - However, #2 should be faster when updating counters (#1 datastore get+put vs #2 enqueue a task).
- On the other hand, with #1 you have to be a bit more careful with the update interval since the task queue quota is almost 100x smaller than either the datastore or memcahce APIs.