I'm working on an application that will run on Google AppEngine.

I plan to have the web interface of that application wait, among many other things, for notifications coming from the AppEngine server.

Ideally I would have liked to use an XMLHttpRequest() to make a request to the server that would be waiting until the next notification comes from the application.

However, there does not appear to be anything in AppEngine to support this type of logic (correct me if I'm wrong). This means I appear to be limited to polling at periodic intervals.

So the question is:

  • Does anyone have a good suggestion of how to best design this polling mechanism in order to avoid running into CPU usage quotas of AppEngine? Scalability as the number of "active" clients takes off needs to be considered.

I am specifically interested in suggestions for good management of the polling intervals from the client side and tips for efficient handling of the requests in the AppEngine application as the number of "active" clients grows.

PS: the type of information polled from the server will typically be JSON-encoded information about recently updated/added bits of information (read recently as: in the past few seconds or minutes).

Status Update

Here is a summary of my thoughts so far around this question:

  • To minimize the CPU load required to answer each individual request generated by the polling approach: use the memcache to minimize the time it takes to collect the reply information. Need to find pointers to a good example of that.
  • To minimize the number of requests generated to the server by the "active" clients I have several leads:
    • Make the wait between successive polling requests to the server progressively longer if the user is not actively interacting (i.e. not clicking on anything) within the client web page.
    • Piggyback on other types of requests to the server; that is, include the results of the polling requests in other request results to reduce the total number of requests.
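
The idea of a progressively longer wait for idle users can be sketched as a small function. This is only a rough illustration in Python of the logic the client would run; the name next_poll_interval and all the default values (5 second base, doubling every 30 idle seconds, 5 minute cap) are assumptions picked for the example, not tuned recommendations.

```python
def next_poll_interval(idle_seconds, base=5.0, factor=2.0, step=30.0, cap=300.0):
    """Return how long to wait before the next poll, in seconds.

    The wait starts at `base`, is multiplied by `factor` once for every
    full `step` seconds the user has been idle, and never exceeds `cap`.
    All defaults are illustrative assumptions.
    """
    if idle_seconds <= 0:
        return base
    interval = base
    # One multiplication per full idle step; stop as soon as the cap is hit.
    for _ in range(int(idle_seconds // step)):
        interval *= factor
        if interval >= cap:
            return cap
    return interval
```

Resetting the idle time to zero on any click or keypress brings the client straight back to the base interval, so an active user still sees updates quickly.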

Comments and pointers to code examples welcome!

+3  A: 

You are correct; long-running connections are prohibited on App Engine. The model is request->response->connection closed, and as quickly as possible.

There are certain kinds of applications that aren't feasible on App Engine due to this architecture. If you absolutely need a notification from the server within 5 seconds of an event, for example, your only real choice is to poll the server every 5 seconds. This is probably not practical for large numbers of users.

Requests themselves don't necessarily generate a lot of CPU load. A handler that fetches a memcache key and returns it to the user can easily stay under 50 ms of CPU time, for example. So part of your mission is to reduce the number of requests per minute from your clients, but part of it is to ensure your Python scripts execute and return as quickly as possible. For this to happen you need to make sure your imports are structured intelligently and do whatever you can to avoid accessing the datastore during user requests.

As far as sample code, can you be a little bit more specific about what you're looking for? For a simple memcache key request, a response to a query can be as simple as:

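from django.utils import simplejson
from google.appengine.api import memcache
from google.appengine.ext import webapp

class PollHandler(webapp.RequestHandler):  # handler name is illustrative
    def get(self):
        # Serve the cached value as JSON without touching the datastore.
        jsonResponse = {}
        jsonResponse['theVal'] = memcache.get(key="testkey")
        self.response.out.write(simplejson.dumps(jsonResponse))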

Naturally if the memcache data is backed by the datastore, you'll want to take some actions if the key is not found in memcache. Depending on your application, datastore backing may or may not be appropriate.
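
A minimal sketch of that fallback, assuming a read-through pattern: try the cache first, and only on a miss call a loader and store the result for next time. To keep it runnable outside App Engine, FakeCache is a hypothetical in-process stand-in that mimics only the get and set calls of memcache, and load_from_datastore is a placeholder for whatever query your application would actually run.

```python
class FakeCache(object):
    """Hypothetical in-process stand-in for memcache (get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Like memcache.get, returns None when the key is missing.
        return self._data.get(key)

    def set(self, key, value, time=0):
        # The expiry `time` is accepted but ignored in this stand-in.
        self._data[key] = value
        return True


def get_with_fallback(cache, key, load_from_datastore, ttl=60):
    """Return the cached value, loading and caching it on a miss."""
    value = cache.get(key)
    if value is None:
        # Cache miss: pay for the slow lookup once, then cache the result.
        value = load_from_datastore(key)
        if value is not None:
            cache.set(key, value, time=ttl)
    return value


if __name__ == "__main__":
    calls = []

    def slow_lookup(key):
        # Placeholder for a real datastore query.
        calls.append(key)
        return {"updated": [1, 2, 3]}

    cache = FakeCache()
    get_with_fallback(cache, "testkey", slow_lookup)
    get_with_fallback(cache, "testkey", slow_lookup)
    print(len(calls))  # the loader ran only for the first request
```

With a short expiry, most polling requests are answered from the cache and the datastore is read at most once per expiry window for each key.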

Brandon Thomson
Thanks for the constructive answer. I will get back to you with a more specific request about the sample code I would like. A question about structuring imports intelligently: does the way imports are structured really impact CPU usage?
Laurent
Yes. For example, see http://groups.google.com/group/google-appengine/browse_thread/thread/400c37cc773b9f46 . Most people notice the CPU usage only when using zipimport for large packages like Django or Jinja2, but every little bit matters. Don't import anything you don't need.
Brandon Thomson
To further minimize request size and time, you can respond with 304 Not Modified and skip the message body. When appropriate, of course.
Richard Levasseur
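
One way the 304 suggestion above could look, sketched as a plain function so it runs without App Engine: the reply carries an ETag derived from the JSON body, and a client that sends the same tag back in If-None-Match gets an empty 304. The function name conditional_response and the choice of an MD5 hash for the tag are assumptions made for this example.

```python
import hashlib
import json


def conditional_response(payload, if_none_match=None):
    """Return (status, headers, body) for a polling reply.

    `if_none_match` is the ETag the client sent back, if any.
    """
    body = json.dumps(payload, sort_keys=True)
    # Quote the hash so it is a valid strong ETag value.
    etag = '"%s"' % hashlib.md5(body.encode("utf-8")).hexdigest()
    headers = {"ETag": etag}
    if if_none_match == etag:
        # Nothing changed since the client's last poll: skip the body.
        return 304, headers, ""
    headers["Content-Type"] = "application/json"
    return 200, headers, body
```

Note that this version still builds the body in order to compute the tag, so it saves bandwidth rather than CPU; caching the tag next to the data would avoid that work as well.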
+1  A: 

OR...

You might be interested in a pubsub implementation, such as the venerable PubSubHubbub, made by guys from Google and Jaiku.

zgoda
Hmm, thanks a lot for the hint. I'm looking into it and will take the time to digest it...
Laurent
Again, thank you for that pointer. I don't think it will be applicable in my case, but the Web Hook mechanism that is part of it gave me quite a few ideas.
Laurent