One of the more common complaints I have read about the App Engine database (for Java) is that it is extremely slow when it comes to "cold start time". What does this mean? And is it something I should be worried about?

+7  A: 

It is something you should be worried about.

Google App Engine spins up a new JVM to service requests when your app has not had any requests for a certain time period. Getting a handle on the datastore from "cold" (i.e. for the first time in a JVM) can take a considerable amount of time: 5 seconds or more.

After you have a handle on the datastore (normally an instance of PersistenceManager), everything is fine (for the life of the JVM!).

EDIT:

Spinning up a fresh JVM in GAE-Java is also slow. Read http://code.google.com/appengine/docs/java/datastore/overview.html and you will see that they use a singleton class to provide the PersistenceManagerFactory, because they describe instantiating one as "expensive".

You could test it out for yourself. Create a brand new application on GAE-Java that merely returns "Hello World!" and you will find that the first request to the application takes a number of seconds.

Add a request for the PersistenceManagerFactory and you will find that the first request takes a few seconds more.
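If you want to see the shape of the effect without deploying anything, here is a rough local sketch. It does not touch App Engine or JDO at all: ExpensiveFactory is a hypothetical stand-in for something like a PersistenceManagerFactory, and the 200 ms sleep is an invented cost, not a measurement. It only illustrates why a singleton makes the first call slow and every later call in the same JVM cheap.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ColdStartDemo {
    /** Hypothetical stand-in for an object that is expensive to build. */
    static final class ExpensiveFactory {
        static final AtomicInteger CREATED = new AtomicInteger();

        ExpensiveFactory() {
            CREATED.incrementAndGet();
            try {
                Thread.sleep(200); // simulated setup cost; the figure is made up
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Holder idiom: the JVM builds INSTANCE once, on first access to Holder. */
    static final class Holder {
        static final ExpensiveFactory INSTANCE = new ExpensiveFactory();

        private Holder() {}
    }

    /** Returns the shared factory, creating it only on the first call. */
    static ExpensiveFactory get() {
        return Holder.INSTANCE;
    }

    /** Returns how many milliseconds a single call to get() took. */
    static long timeGetMillis() {
        long start = System.nanoTime();
        get();
        return (System.nanoTime() - start) / 1_000_000;
    }

    /** Returns how many ExpensiveFactory objects have been constructed so far. */
    static int creationCount() {
        return ExpensiveFactory.CREATED.get();
    }

    public static void main(String[] args) {
        System.out.println("first call:  " + timeGetMillis() + " ms");
        System.out.println("second call: " + timeGetMillis() + " ms");
        System.out.println("factories created: " + creationCount());
    }
}
```

The first call pays the whole construction cost and later calls just return the shared instance. That is the same pattern as a "cold" request versus a "warm" one, although the real numbers on App Engine will be different.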

EDIT EDIT:

I have now created this test for your viewing pleasure:

http://stackoverflowanswers.appspot.com/helloworld

You will instantly see either "Hello, world 0" or "Hello, world xxxx", where xxxx is the number of milliseconds it took to get a handle on the datastore. I suspect that the complexity and number of indexes in the datastore affect how long this takes, as it is quicker in this app than in some of my other apps.

PMF is an exact copy of the one provided in the app engine docs.

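import java.io.IOException;
import javax.jdo.PersistenceManager;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@SuppressWarnings("serial")
public class HelloWorldServlet extends HttpServlet {
    @Override
    public void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        // Time how long it takes to obtain a PersistenceManager from the PMF singleton.
        long start = System.currentTimeMillis();
        PersistenceManager pm = PMF.get().getPersistenceManager();
        long elapsedMs = System.currentTimeMillis() - start;
        pm.close(); // release the manager; only the acquisition time is reported
        resp.setContentType("text/plain");
        resp.getWriter().println("Hello, world " + elapsedMs);
    }
}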

EDIT EDIT EDIT:

I changed my code so that it instantiates a PersistenceManagerFactory with each request. It now throws 500 server errors, with this in the logs:

javax.jdo.JDOFatalUserException: Application code attempted to create a PersistenceManagerFactory named transactions-optional, but one with this name already exists! Instances of PersistenceManagerFactory are extremely slow to create and it is usually not necessary to create one with a given name more than once. Instead, create a singleton and share it throughout your code. If you really do need to create a duplicate PersistenceManagerFactory (such as for a unittest suite), set the appengine.orm.disable.duplicate.pmf.exception system property to avoid this error.

I don't think I need to provide any more proof that getting a handle on the datastore in app engine is SLOW.

Finbarr
Note that this has nothing to do with the datastore - App Engine is a platform.
Nick Johnson
Well, it would seem that the two go hand in hand. App Engine has its own implementation of the Bigtable datastore using JDO or JPA. The datastore is part of the platform.
Finbarr
finbarr, do you have measurements showing that the major delay of cold starts is getting a connection to the datastore? I thought the delay was from starting up a JVM. The Python runtime also needs to get a handle to the datastore, and it doesn't have the same cold start issues. This makes me think it isn't the database itself that is slow, but Java or Java libraries. For example, what happens if you use the low level API vs using JDO?
Peter Recore
@finbarr there is an entire section in the Java FAQ based on performance and "loading requests" (requests which are impacted by the need to start a new JVM) - the issue seems to be starting new JVM instances (and instances of your app), not access to the datastore http://code.google.com/appengine/kb/java.html#What_Is_A_Loading_Request
matt b
Read http://code.google.com/appengine/docs/java/datastore/overview.html and you will note that they use a singleton class for the `PersistenceManagerFactory` because creating a new instance is an expensive operation. They even make a point of discussing this right above the singleton code. It would be trivial to create an application that measures the time taken to instantiate the `PersistenceManagerFactory`, and yes, it is definitely slow, on top of the JVM being slow in the first place.
Finbarr
@Peter, @Matt see http://stackoverflowanswers.appspot.com/helloworld and view my revised answer.
Finbarr
@Finbarr Nice job on the quick test. It looks like JVM startup time and JDO startup time are both culprits in the overall slowness.
Peter Recore
@Peter indeed they are, as I discovered to my dismay a month into the development of an app for app engine.
Finbarr