After thinking for quite a while about how to build a fast and scalable web application, I have almost decided to go with a combination of Google App Engine, Python+Django, and app-engine-patch. But I came across a comment in the app-engine-patch FAQ that made me think the combination may not be quite as mature as I thought: it may take seconds (1-4, according to the FAQ) to boot an instance of Django. That may not be a problem if there is some persistence from request to request, but it seems that when there is no sustained traffic the Django instance is shut down within a few seconds. If the system is not called every other second or so, any incoming request will take seconds(!) to be served. This is unacceptable. As a quick fix (ugly, I know), I was thinking about having an external machine make a dummy request to the framework every second just to keep it alive.

Do you agree with this? Do you have any other approach?

Another doubt I have is what happens when there is enough traffic to go from n servers to n+1: will that request take seconds to be served because a new Django instance has to be started? Or does Google's infrastructure not work this way? I confess my ignorance on this issue.

Help!

+1  A: 

I respect what you are trying to do, but this sounds a little like premature optimization to me. The py+django patch you are discussing is recommended by Google until they upgrade to "real" django, so I can't imagine it's all that bad. It's also not that hard to test the performance of what you are talking about, so I suggest you do that and collect a few metrics before making your final decision. That way you'll have some numbers to back it up when someone else starts complaining ;)
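As a minimal sketch of that kind of measurement, the snippet below times repeated calls and reports the first (possibly cold) request separately from the rest. The names `time_requests` and `summarize` are hypothetical, and `fetch` is any callable you supply, for example one that requests a page of your own deployed app; nothing here is specific to App Engine or app-engine-patch.

```python
import statistics
import time


def time_requests(fetch, count=5, pause=0.0):
    """Call fetch() `count` times and return the elapsed seconds of each call.

    `fetch` is a caller-supplied callable (an assumption of this sketch),
    e.g. a function that downloads one page of the app under test.
    """
    timings = []
    for _ in range(count):
        start = time.perf_counter()
        fetch()
        timings.append(time.perf_counter() - start)
        # Optional idle gap, to see whether the slow first request returns.
        time.sleep(pause)
    return timings


def summarize(timings):
    """Report the first request separately from the median of the others."""
    first, rest = timings[0], timings[1:]
    return {
        "first": first,
        "median_rest": statistics.median(rest) if rest else None,
    }
```

For a real test, `fetch` could be something like `lambda: urllib.request.urlopen(url).read()` with your own URL. Repeating the run with a longer `pause` would show how quickly an idle app goes back to slow first requests.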

slf
"until they upgrade to "real" django"? The current solution isn't "unreal" Django - the patches are only necessary due to Django's limitations.
Nick Johnson
+3  A: 

Yes, long startup times are a caveat of using a framework with a lot of code. There's currently no way around them, other than using a lighter-weight framework (such as the built-in webapp framework).

Polling your app isn't recommended: It'll use up quota, and doesn't actually guarantee that the real user requests hit the same instance your polling requests did, since apps run on multiple instances.

Fortunately, there's a simple solution: Get popular! The more popular your app is, the less frequently instances need restarting, and the smaller a proportion of users it affects.
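To put rough numbers on that, here is a toy model, not a description of how App Engine actually schedules instances. It assumes a single instance, requests arriving at random (a Poisson process), and an instance that is dropped after a fixed idle period. Under those assumptions, the chance that a request finds no warm instance is exp(-rate * idle timeout). The function name and both parameters are made up for illustration.

```python
import math


def cold_start_fraction(requests_per_second, idle_timeout_seconds):
    """Fraction of requests that arrive after the instance has gone idle.

    Toy model only: one instance, Poisson arrivals, fixed idle timeout.
    A request is "cold" when the gap since the previous request exceeds
    the timeout, which for Poisson arrivals has probability exp(-rate * T).
    """
    return math.exp(-requests_per_second * idle_timeout_seconds)


if __name__ == "__main__":
    # The idle timeout of 10 seconds is an arbitrary illustrative value.
    for rate in (0.01, 0.1, 1.0):
        fraction = cold_start_fraction(rate, idle_timeout_seconds=10)
        print(f"{rate:>5} req/s -> {fraction:.4%} of requests hit a cold start")
```

In this model the affected share falls off exponentially with traffic, which matches the point that the problem mostly hurts apps with very few visitors.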

Nick Johnson
Thank you Nick. I agree with what you say. To get popular is the real solution. My proposed fix was to avoid a catch-22 situation in which you don't get popular because the page sucks, and it sucks because it doesn't get popular!
cpicada
Yup, fair cop - that is a legitimate concern. Not much to be done about it in the short-term, though. I have high hopes for Unladen Swallow, however. ;)
Nick Johnson
Getting popular isn't simple.
dfrankow
A: 

Also, it seems to me (but Nick can correct me if I'm wrong) that if you use the built-in Django (.97 or 1.0), loading is less of a problem. Logically, I'd guess they keep the built-in libs in memory for everyone, or share that cached code between instances. But I don't know for sure.

Koen Bok
Update: for now I have included my own Django again, because the built-in 1.0 gave me weird timeout issues.
Koen Bok
A: 

See Takashi Matsuo's comparisons. Basically, for the simplest app-engine-patch app that does almost nothing, he reports roughly 1s, versus roughly 350ms for webapp+Django templates.

It feels longer than 1s for our app, but Takashi tried only the very simplest app he could think of.

dfrankow
+1  A: 

They also mention in the FAQ that using a zipped version of Django will help the load time, although I'm guessing it might still be long. As for your original question, I'd agree with the others that polling your app is probably not a good idea: it likely won't solve your problem, because Google may distribute your requests across many machines.
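For anyone unfamiliar with the mechanism, Python can import packages straight from a zip archive placed on `sys.path`. The sketch below demonstrates only that general mechanism with a throwaway package; `demo_framework`, `build_demo_zip`, and `import_from_zip` are hypothetical names, and this is not the actual app-engine-patch setup, which the FAQ describes.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_demo_zip(directory):
    """Create a zip holding a tiny stand-in package and return its path."""
    zip_path = os.path.join(directory, "demo_framework.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        # A one-line package standing in for a large zipped framework.
        archive.writestr("demo_framework/__init__.py", "VERSION = '0.0-demo'\n")
    return zip_path


def import_from_zip(zip_path, module_name):
    """Put the archive on sys.path and import a module from inside it."""
    if zip_path not in sys.path:
        sys.path.insert(0, zip_path)
    # Make sure the import system notices the newly created archive.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    module = import_from_zip(build_demo_zip(workdir), "demo_framework")
    print(module.VERSION)
```

Whether a zipped framework actually starts faster depends on the platform, so it is worth timing both variants rather than assuming.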

Bialecki