When starting with any new major library or system, I go to StackOverflow for the "What should I know?" questions. The answers might be subjective, but the advice usually saves me many hours of trouble. So far, I have burned a number of hours on Google App Engine tripping over the same issues that more experienced developers here already know.

I eventually found these common issues:

  • appcfg.py uploads do not appear on the Google Dashboard until you select them from the Versions tab.
  • Using cron jobs for keeping an application from being unserved is necessary if you want consistent response time. This gets to be a "tragedy of the commons" issue (thanks, Nick).
  • PyDev in Eclipse works well with Google App Engine.
  • Getting a local version of Python 2.5 for Ubuntu 10.04 is hard. Or you can "sudo add-apt-repository ppa:fkrull/deadsnakes" to get it.
  • Use virtualenvwrapper to isolate your Python 2.5 for GAE from the other versions of Python used for everything else.
  • Applications on appspot only switch to newly updated versions when inconvenient, regardless of what the control panel says. You should keep a version number in the title or footer to avoid wild goose chases.
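
For the last point, a minimal sketch of a version footer, assuming the Python runtime exposes the deployed version in the CURRENT_VERSION_ID environment variable. The helper name version_footer and the "dev" fallback are my own, purely for illustration:

```python
import os


def version_footer(default="dev"):
    """Return a small HTML snippet naming the version that is serving."""
    # Assumption: App Engine sets CURRENT_VERSION_ID (e.g. "3.1415").
    # Outside App Engine the variable is absent, so fall back to `default`.
    version = os.environ.get("CURRENT_VERSION_ID", default)
    return "<small>version %s</small>" % version
```

Dropping this into the page footer should make it obvious which version actually answered a request.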

So, what else should I know?

+3  A: 

There's an awful lot to cover. If you have a specific area you're interested in, perhaps we can offer something more specific. In general terms, I'll use this opportunity to plug my blog, which has a lot of App Engine material.

I would take issue with one of your points, however:

Using cron jobs for keeping an application from being unserved is necessary if you want consistent response time.

First of all, this is a 'tragedy of the commons' issue. Apps are unscheduled when they're idle to make way for apps that are serving traffic; a bunch of people running 'keepalive' cronjobs forces all apps to be unloaded faster, leaving everyone worse off.

Second, you're always going to get occasional loading requests, even with a keepalive cron job. Additional instances of your app are scheduled whenever required, so whenever you get a surge of traffic this will happen, and someone will inevitably get a loading request.

Finally, loading requests don't need to be a big issue. Particularly with the Python precompilation support we recently added, loading requests don't have to take a huge amount of time, and the work you spend on optimization will benefit all your users.

Nick Johnson
It does seem to be a tragedy of the commons. Still, who wants an app that takes ten seconds to load between pages?
Charles Merriam
Your app won't take 10 seconds unless you're importing a truly spectacular amount of code. Reducing this is a good optimization strategy, and will improve all requests, not just loading requests.
Nick Johnson
+1  A: 

If you're using a "for/in" query, such as "Find events created today by someone in this list of users", you will find that such a query does not scale, as the datastore converts it into n queries, where n is the size of "users".

To get around this issue, I assign computable key names. In the example above, the key name for an event would be:

event_<dd/mm/yy>_<user_key>

This way, you can compute all the possible key names for entities given today's date. Once you have this list, you can use:

Event.get_by_key_name(key_names)  # get_by_key_name is a class method on the model, here assumed to be called Event

... which fetches entities in parallel, and is much faster than using an "IN" query!
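
A rough sketch of the key-name computation, in plain Python so it runs without the SDK. The function name event_key_names and the sample user keys are made up for illustration, and the scheme assumes at most one event per user per day, since key names must be unique:

```python
import datetime


def event_key_names(user_keys, day=None):
    """Build one 'event_<dd/mm/yy>_<user_key>' name per user for a given day."""
    if day is None:
        day = datetime.date.today()
    stamp = day.strftime("%d/%m/%y")  # same dd/mm/yy layout as the pattern above
    return ["event_%s_%s" % (stamp, user_key) for user_key in user_keys]


key_names = event_key_names(["alice", "bob"], datetime.date(2010, 6, 1))
# key_names == ["event_01/06/10_alice", "event_01/06/10_bob"]
# On App Engine this list would then go to the batch get-by-key-name call
# on your model class; names with no stored entity come back as None.
```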

mahmoud