I have been writing a Google Chrome extension for Stack Exchange (http://goo.gl/iWo0). It's a simple extension that allows you to keep track of your reputation and get notified of comments on Stack Exchange sites.

I've now run into some issues that I can't solve on my own. My extension uses Google App Engine as its back end to make external requests to the Stack Exchange API. A single client request from the extension for new comments on a single site can trigger many requests to the API endpoint to prepare the response, even for a non-skeetish user. The average user has accounts on at least 3 sites in the Stack Exchange network, and some have more than 10!

The Stack Exchange API has request limits:
A single IP address can only make a certain number of API requests per day (10,000).
The API will cut my requests off if I make more than 30 requests in 5 seconds from a single IP address.

Clearly all requests have to be throttled to 30 per 5 seconds, and I've currently implemented the throttle logic on top of a distributed lock in memcached. I'm using memcached as a simple lock manager to coordinate the activity of GAE instances and throttle UrlFetch requests.
But I think it's a big waste to limit such a powerful infrastructure to no more than 30 requests per 5 seconds. Such an API request rate doesn't let me keep developing new, interesting and useful features, and one day the extension will stop working properly altogether.
My app now has 90 users and is growing, and I need to come up with a way to maximize the request rate.
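
To make the 30-per-5-seconds rule concrete, here is a minimal sketch of a sliding-window throttle. It is not my actual memcached-based code: it keeps timestamps in process memory only, so on App Engine the state would still have to be shared between instances somehow. The class and method names are made up for illustration.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Allow at most `limit` requests in any `window` seconds.

    In-process sketch only: state is not shared between instances.
    """

    def __init__(self, limit=30, window=5.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the logic can be tested
        self._stamps = deque()  # timestamps of granted requests, oldest first

    def _expire(self, now):
        # Drop timestamps that have fallen out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()

    def try_acquire(self):
        """Record the request and return True if it fits in the window."""
        now = self.clock()
        self._expire(now)
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False

    def wait_time(self):
        """Seconds until the next request would be allowed (0.0 if allowed now)."""
        now = self.clock()
        self._expire(now)
        if len(self._stamps) < self.limit:
            return 0.0
        return self.window - (now - self._stamps[0])
```

A caller would check `try_acquire()` before each UrlFetch call and, when it returns False, sleep for `wait_time()` or re-queue the request.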

As is well known, App Engine makes external UrlFetch requests through a shared pool of IP addresses. My goal is to write request throttling that ensures compliance with the API terms of usage while making use of GAE's distributed capabilities.

So my question is: how do I get the maximum practical API throughput while complying with the API terms of usage and making use of GAE's distributed capabilities?

Advice to use another platform/host/proxy is just useless, in my mind.

+2  A: 

First off: I'm using your extension and it rocks!

Have you considered using memcached and caching the results?
Instead of taking the results from the API directly, first try to find them in the cache. If they are there, use them; if not, retrieve them, cache them, and let them expire after X minutes.
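
A rough sketch of that cache-first lookup, assuming a plain in-memory dict stands in for memcached and that `fetch` is whatever function actually calls the API (both are placeholders, not real App Engine or Stack Exchange calls):

```python
import time


class TTLCache:
    """Cache-first lookup with expiry; a dict stands in for memcached."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested
        self._store = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for `key`, calling `fetch(key)` on a miss."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no API request needed
        value = fetch(key)  # miss or expired: hit the API once
        self._store[key] = (now + self.ttl, value)
        return value
```

With a TTL of a few minutes, repeated polls from the same user within that window cost zero API requests.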

Second, try to batch up users' requests: instead of asking for the reputation of a single user, ask for the reputation of several users together.
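
For the batching, something along these lines might work. It assumes the API accepts several ids joined with semicolons in one call and that 100 ids per call is allowed; check both against the API docs. The function name is just for illustration.

```python
def batch_ids(user_ids, batch_size=100):
    """Group unique ids into semicolon-joined strings, one string per API call.

    The ';' separator and the default batch size are assumptions to verify
    against the API documentation.
    """
    unique = sorted(set(user_ids))  # duplicates would waste quota
    return [
        ";".join(str(uid) for uid in unique[i:i + batch_size])
        for i in range(0, len(unique), batch_size)
    ]
```

Each returned string replaces what would otherwise be one request per user, so 250 pending users would need 3 requests instead of 250 under these assumptions.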

Shay Erlichmen
@Shay, thanks for using it. Yes, I'm using memcached heavily to cache API responses, but even that doesn't greatly reduce the number of API requests. The idea of batching requests seems helpful to me, thanks.
Vladislav Tserman
@Shay @Vladislav that's kind of funny: Shay, you and I are competing for the same quota, since all of us have a web application running on GAE. I discovered it just now! http://stackapps.com/questions/1708/stackprinter-this-ip-has-exceeded-the-request-per-day-limit
systempuntoout
@Shay @Vladislav http://stackapps.com/questions/1713/google-app-engine-apps-why-do-you-throttle-just-checking-the-ip
systempuntoout
+3  A: 

If you are searching for a way to programmatically manage Google App Engine's shared pool of IPs, I firmly believe that you are out of luck.

Anyway, quoting this advice from the FAQ, I think you have more than a chance to keep your awesome app running:

What should I do if I need more requests per day?

Certain types of applications - services and websites to name two - can legitimately have much higher per-day request requirements than typical applications. If you can demonstrate a need for a higher request quota, contact us.

EDIT:
I was wrong, actually you don't have any chance.
Google App Engine [app]s are doomed.

systempuntoout
Thanks @systempuntoout, I'm aware of that possibility. I can somehow work around the 10,000-per-IP daily request limit, but what I mostly need is a higher request "speed", say 30 per second or more. Anyway, they say "please only request an increased quota when your application is live and has a non-trivial number of users". I'm not sure that 100 or even 1000 is a non-trivial number of users.
Vladislav Tserman
Concerning programmatically managing the App Engine pool of IPs, I've also started to think that it's impossible at the moment. I will probably file an issue in the GAE issue tracker.
Vladislav Tserman
@Vladislav As you said, you should move your business logic to the browser (where the IP is not a problem), adopting a mature JS library such as SOAPI.js.
systempuntoout
Yes, by all appearances that is currently the only solution :(
Vladislav Tserman