I'm working on a multiplayer game on App Engine and it needs a message queue (i.e., messages go in, messages come out, with no duplicates and no lost messages, assuming there are no unexpected cache evictions). Here are the memcache-based queues I'm aware of:

I learned the concept of the memcache queue from this blog post:

All messages are saved with an integer as the key. One key holds the next key to write, and another holds the key of the oldest message in the queue. These are accessed with the increment/decrement method, since it's atomic, and two additional keys act as locks: a process increments the lock key, and if the return value is 1 it has the lock; otherwise it keeps incrementing. Once the process is finished it sets the value back to 0. Simple but effective. One caveat is that the integer will overflow, so there is some logic in place that resets the used keys to 1 once we get close to that limit. As the increment operation is atomic, the lock is only needed if two or more memcaches are used (for redundancy), to keep those in sync.
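Here's a minimal sketch of that scheme, assuming the Python App Engine memcache API; the key names and the exact busy-wait loop are my illustration, not from the original post:

    from google.appengine.api import memcache

    LOCK_KEY = 'queue_lock'   # hypothetical key names
    HEAD_KEY = 'queue_head'   # key of the oldest message
    TAIL_KEY = 'queue_tail'   # next key to be written

    def acquire_lock():
        # incr is atomic; after the holder resets the key to 0,
        # exactly one waiter's next increment returns 1.
        while memcache.incr(LOCK_KEY, initial_value=0) != 1:
            pass  # busy-wait

    def release_lock():
        memcache.set(LOCK_KEY, 0)

    def enqueue(message):
        acquire_lock()
        try:
            tail = memcache.incr(TAIL_KEY, initial_value=0)
            memcache.set('msg:%d' % tail, message)
        finally:
            release_lock()

    def dequeue():
        acquire_lock()
        try:
            head = memcache.get(HEAD_KEY) or 0
            tail = memcache.get(TAIL_KEY) or 0
            if head >= tail:
                return None  # queue is empty
            head = memcache.incr(HEAD_KEY, initial_value=0)
            return memcache.get('msg:%d' % head)
        finally:
            release_lock()

The overflow caveat would need extra logic on top of this to reset the counters before they hit the integer limit.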

My question is, is there a memcache-based message queue service that can run on App Engine?

+2  A: 

These might be suitable:

Unfortunately Peafowl looks like it uses constructs that aren't going to work on App Engine.

Here's an algorithm for a lockless memcache-based queue: http://dev.ionous.net/2009/01/memcached-lockless-queue.html Unfortunately I don't think he got it right; it looks like a message that arrives between the atomic increment and the corresponding message write can be lost.
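A rough illustration of that race, with hypothetical key names (this is my reading of the problem, not code from the linked post):

    # Writer:
    slot = memcache.incr('tail')        # (1) atomically reserve a slot
    # <-- a reader polling 'tail' now believes slot `slot` is populated,
    #     fetches 'msg:<slot>' and finds nothing there
    memcache.set('msg:%d' % slot, msg)  # (2) the write lands too late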

So I think the best way to do this will be to put a lock on the queue's message counter and have the threads writing to the queue busy-wait while the lock is held. In my case I'll only have one thread pulling from the queue, so no lock is needed on the read side, although your mileage may vary.

Still, there ought to be a better way to do this, especially if we don't care about the order in which values come out of the queue. Sharded queues, maybe, to avoid write contention - something like the sketch below.
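A rough sketch of the sharding idea, assuming unordered delivery is acceptable; NUM_SHARDS and the key layout are made up for illustration, and each shard would still need the lock (or a fix for the race above):

    import random

    NUM_SHARDS = 8  # illustrative; tune to the write rate

    def enqueue_sharded(message):
        # Spread writes across shards so no single counter is a hotspot.
        shard = random.randrange(NUM_SHARDS)
        tail = memcache.incr('tail:%d' % shard, initial_value=0)
        memcache.set('msg:%d:%d' % (shard, tail), message)

A reader would poll each shard's counters in turn; ordering across shards is not preserved.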

Just because it's somewhat related, here's a datastore-based queue: http://github.com/mbuckbee/gae-naive-queue/tree/master

Brandon Thomson
But if you have a high-traffic app, GAE might start further instances, and then you can't prevent further instances of the "writer thread", can you?
ZeissS
+9  A: 

I would be very careful about using the Google App Engine memcache in this way. You are right to worry about "unexpected cache evictions".

Google expects you to use the memcache for caching data, not for storing it. They don't guarantee to keep data in the cache. From the GAE documentation:

By default, items never expire, though items may be evicted due to memory pressure.

Edit: There's always Amazon's Simple Queue Service (SQS). However, this may not meet your price/performance requirements either, as:

  1. There would be the latency of calling from Google's servers to Amazon's.
  2. You'd end up paying twice for all the data traffic - paying for it to leave Google and then paying again for it to go into Amazon.
Dave Webb
I would expect that expiry of items in the cache is based on inactivity, so if the queue lifetimes are as you say, you'd probably be fine. There are a lot of qualifiers in that statement, though, so you'd definitely need to handle lost items in the queue.
Dave Webb
A: 

Until Google implements a proper job queue, why not use the datastore? As others have said, memcache is just a cache and could lose queue items (which would be... bad).

The datastore should be more than fast enough for what you need - you would just have a simple Job model, which would be more flexible than memcache, as you're not limited to key/value pairs.
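A minimal sketch of that idea, assuming the Python GAE db API; the Job model and its properties are illustrative:

    from google.appengine.ext import db

    class Job(db.Model):
        payload = db.TextProperty()
        created = db.DateTimeProperty(auto_now_add=True)

    def enqueue(payload):
        Job(payload=payload).put()

    def dequeue():
        # Oldest job first; delete it once claimed.
        job = Job.all().order('created').get()
        if job is None:
            return None
        payload = job.payload
        job.delete()  # note: fetch-then-delete is racy without a transaction
        return payload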

dbr
Especially with the recent performance problems, it's not going to be cost-effective. Under heavy usage I'm expecting to put more than 1 million messages per day through this queue. It would be cheaper to use Amazon's queue service than the datastore.
Brandon Thomson
Ah, fair enough!
dbr
+1  A: 

If you're happy with the possibility of losing data, by all means go ahead. Bear in mind, though, that although memcache generally has lower latency than the datastore, like anything else, it will suffer if you have a high rate of atomic operations you want to execute on a single element. This isn't a datastore problem - it's simply a problem of having to serialize access.

Failing that, Amazon's SQS seems like a viable option.

Nick Johnson
+3  A: 

I have started a Simple Python Memcached Queue; it might be useful: http://bitbucket.org/epoz/python-memcache-queue/

Etienne Posthumus
I seem to remember seeing this technique before, but +1 for packaging like this.
Joel