I am trying to figure out the best way to deploy a single Google App Engine application across multiple regions.

The same code is to be used, but the stored data is specific to each region. Motivating examples are hyperlocal review sites, like yelp.com or urbanspoon, where restaurants and other businesses to review are specific to a region (e.g. boston.app.com, seattle.app.com).

A couple of options:

  1. Create multiple GAE applications, and duplicate the code across them.

  2. Create a single GAE application, and store all data for all regions in the same datastore, with a region identifier field on each model marking the region a record belongs to.

Some of the tradeoffs:

  1. Option 2 seems like it will become increasingly inefficient (space: replicating a region identifier for each record of every model; time: filtering/indexing on the identifier for every query).

  2. Option 1 requires an app id for every region, while GAE only allows 10 apps per account. Moreover, deploying the code to every region, as well as migrating each datastore, seems like it could be a pain to manage.

In an ideal world, I would have a single application instance. From that instance, I could route between subdomains (like here), as well as have a separate datastore for each subdomain. But I believe GAE only allows a single datastore per application.
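To make the subdomain routing idea concrete, here is a minimal sketch of mapping a request's Host header to a region. It is plain Python and does not use any App Engine API; the function name `region_from_host`, the `KNOWN_REGIONS` set, and the `app.com` base domain are assumptions for illustration, taken from the example hostnames above.

```python
from typing import Optional

# Hypothetical set of supported regions, based on the example hostnames above.
KNOWN_REGIONS = {"boston", "seattle"}


def region_from_host(host: str, base_domain: str = "app.com") -> Optional[str]:
    """Return the region named by the leftmost subdomain, or None if there is none."""
    # Drop an optional port, normalise case, and strip a trailing dot.
    hostname = host.split(":", 1)[0].lower().rstrip(".")
    suffix = "." + base_domain
    if not hostname.endswith(suffix):
        return None
    subdomain = hostname[: -len(suffix)]
    # Only accept exact matches, so "www" or nested subdomains are rejected.
    return subdomain if subdomain in KNOWN_REGIONS else None
```

A request handler could call this once per request and pass the result to whatever code reads or writes region-specific data.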

Does anyone have ideas on the best way to solve this problem? Or options that I am not considering?

Thanks for your time!

A:

I would recommend your approach #2. Storage space is cheap (and region codes are short), and datastore performance does not degrade with size, unlike most databases. Using a single app also makes for easier management and upgrades, and avoids any issues with the TOS (which prohibit sharding your app to avoid billing charges).
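As a rough illustration of approach #2, the sketch below tags every record with a short region code and filters on it at query time. It uses an in-memory list as a stand-in for the datastore, so it shows the shape of the idea rather than real App Engine calls; `Restaurant`, `STORE`, and `restaurants_in` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Restaurant:
    region: str  # short region code stored on every record
    name: str


# In-memory stand-in for the single shared datastore (sample data is made up).
STORE: List[Restaurant] = [
    Restaurant("boston", "Example Diner"),
    Restaurant("seattle", "Example Cafe"),
    Restaurant("boston", "Example Grill"),
]


def restaurants_in(region: str) -> List[Restaurant]:
    """Return only the records that belong to the given region."""
    return [r for r in STORE if r.region == region]
```

In a real datastore the list comprehension would become an equality filter on the region property, which is the extra per-query cost the question is asking about.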

Nick Johnson
Thanks Nick. I hadn't even been considering the sharding issue. I went with approach #2 and it's working great thus far.
Travis Kriplean
A: 

If you use source code revision control, then it is not too bad to push identical code into multiple apps. You could set a policy whereby only full-fledged tags are allowed to be pushed up to GAE. Another option is to make your application version the same as the revision number.

With App Engine, I (and I believe most others) always migrate data from within my model code. You can't easily do bulk migrations in GAE and the usual solution is to migrate data as you come across it in code. In this way, you can keep your models pretty much identical across applications.
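A minimal sketch of what "migrate data as you come across it" can look like, using plain dictionaries in place of datastore entities. The `schema_version` field, the `name` to `display_name` rename, and the function names are all assumptions made up for this example.

```python
from typing import Any, Dict

CURRENT_SCHEMA_VERSION = 2  # hypothetical current version of the record layout


def migrate(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the record upgraded to the current schema."""
    upgraded = dict(record)
    version = upgraded.get("schema_version", 1)  # unversioned records count as v1
    if version < 2:
        # Hypothetical v1 -> v2 change: "name" was renamed to "display_name".
        upgraded["display_name"] = upgraded.pop("name", "")
        version = 2
    upgraded["schema_version"] = version
    return upgraded


def load(store: Dict[str, Dict[str, Any]], key: str) -> Dict[str, Any]:
    """Read a record, migrating it and writing it back if it was out of date."""
    record = migrate(store[key])
    store[key] = record  # write back so the upgrade only happens once
    return record
```

Because the upgrade runs on read, the same model code can be deployed to several apps and each datastore converges on the new layout as records are touched.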

Having said that, I would probably still go with a unified application. It's more future-proof. What if users want to join their L.A. identity and their New York identity? Or what if an advertiser offers you a sweet deal for you to run some marketing reports on your own data?

Finally, a few bytes of data don't matter much on App Engine. As your site grows, you will quickly find yourself bumping into ceilings. GAE limits are extremely small compared to a traditional web server, so you will have to work within them anyway. For example, you can only fetch 1,000 records at a time, so your architecture will already need a piecemeal paging solution. Don't worry too much about an extra field or two in your records.
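For what a piecemeal paging solution might look like, here is a small sketch that walks a result set in batches no larger than the 1,000-record ceiling mentioned above. It pages over an in-memory sequence with an offset, purely as a stand-in for a datastore query; `fetch_page` and `iter_all` are hypothetical helpers, not App Engine functions.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")
FETCH_LIMIT = 1000  # per-fetch ceiling described in the answer


def fetch_page(records: Sequence[T], offset: int, limit: int = FETCH_LIMIT) -> List[T]:
    """Stand-in for a single fetch; never returns more than FETCH_LIMIT rows."""
    limit = min(limit, FETCH_LIMIT)
    return list(records[offset:offset + limit])


def iter_all(records: Sequence[T]) -> Iterator[T]:
    """Yield every record by requesting one bounded page after another."""
    offset = 0
    while True:
        page = fetch_page(records, offset)
        if not page:
            return
        yield from page
        offset += len(page)
```

Code written this way does not care how many regions share the datastore, since it never assumes a whole result set fits in one fetch.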

jhs
I agree -- after having thought about it more and after reading Nick's answer above, it seems that #2 is a superior solution. Your point about identities across regions is an important factor as well.
Travis Kriplean