I know that when a Google App Engine (GAE) app has 0 instances running (because it has been idle for a while) and a user requests a page, that user has to wait for an instance to boot up and finish all of its initialization, which can take a significant amount of time.
My question is about the situation where the app already has one instance running, but it comes under heavy load and GAE starts booting up a second instance.
In this case, which of the following happens?
1. A user ends up having to wait for the second instance to finish starting up before their request gets a response.
2. No requests are sent to the second instance until it has fully started up, so no user has to wait an extended amount of time.
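To make the difference concrete, here is a toy model of the two behaviors. It is only a sketch: the function name, the policy names, and all of the numbers are made up for illustration, and it does not reflect how the App Engine scheduler is actually implemented. In the first behavior the request is handed to the still-booting instance; in the second it stays on the instance that is already running.

```python
def triggering_request_latency_ms(policy, cold_start_ms, service_ms, queue_wait_ms):
    """Latency seen by a request that arrives while a second instance is booting.

    Hypothetical model: both policy names are invented for this sketch.
    """
    if policy == "send_to_booting_instance":
        # First behavior: the request is assigned to the new instance and
        # sits through its whole startup before being handled.
        return cold_start_ms + service_ms
    if policy == "hold_on_warm_instance":
        # Second behavior: the request waits in the queue of the instance
        # that is already running; the new instance gets traffic only once ready.
        return queue_wait_ms + service_ms
    raise ValueError(f"unknown policy: {policy!r}")


# Assumed, illustrative timings (not measured on App Engine).
COLD_START_MS = 5000
SERVICE_MS = 200
QUEUE_WAIT_MS = 500

for policy in ("send_to_booting_instance", "hold_on_warm_instance"):
    latency = triggering_request_latency_ms(
        policy, COLD_START_MS, SERVICE_MS, QUEUE_WAIT_MS
    )
    print(policy, latency)
```

With these assumed timings, the unlucky request pays the full startup cost under the first behavior, but only a short queueing delay under the second.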
EDIT: Unfortunately, the answer is currently #1. However, there is a feature request to change the behavior to #2. Please star this feature request (http://code.google.com/p/googleappengine/issues/detail?id=2690) to help bring it to the attention of the App Engine developers.