I've been investigating App Engine to see if I can use it for a project. While trying to choose between Python and Java, I ran into a surprising difference in datastore query performance: medium to large datastore queries are more than 3 times slower in Python than in Java.
My question is: is this performance difference for datastore queries (Python 3x slower than Java) normal, or am I doing something wrong in my Python code that's skewing the numbers?
My entity looks like this:
Person
    firstname (length 8)
    lastname (length 8)
    address (length 20)
    city (length 10)
    state (length 2)
    zip (length 5)
I populate the datastore with 2000 Person records, with each field exactly the length noted here, all filled with random data and with no fields indexed (just so the inserts go faster).
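For anyone who wants to reproduce the test data, here is a rough sketch of how records like these could be generated. It is not the exact code I used: `random_person` and `make_people` are hypothetical helper names, the choice of digits for zip and letters elsewhere is arbitrary, and the sketch only builds plain dicts rather than writing to the datastore.

```python
import random
import string

# Field widths taken from the Person description above.
FIELD_LENGTHS = {
    "firstname": 8,
    "lastname": 8,
    "address": 20,
    "city": 10,
    "state": 2,
    "zip": 5,
}

def random_person(rng):
    """Return a dict with one random fixed-length string per Person field."""
    person = {}
    for field, length in FIELD_LENGTHS.items():
        # Digits for the zip code, letters elsewhere (an arbitrary choice).
        alphabet = string.digits if field == "zip" else string.ascii_letters
        person[field] = "".join(rng.choice(alphabet) for _ in range(length))
    return person

def make_people(count, seed=0):
    """Build `count` random Person dicts; the seed keeps runs repeatable."""
    rng = random.Random(seed)
    return [random_person(rng) for _ in range(count)]

people = make_people(2000)
```

Each dict would then be copied into a Person entity and stored; that step is left out here because it depends on the SDK.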
I then query 1k Person records from Python (no filters, no ordering):
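from google.appengine.api import datastore

# Low-level datastore API: fetch up to 1000 Person entities.
q = datastore.Query("Person")
objects = list(q.Get(1000))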
And 1k Person records from Java (likewise no filters, no ordering):
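import com.google.appengine.api.datastore.*;
import java.util.ArrayList;
import java.util.List;
import static com.google.appengine.api.datastore.FetchOptions.Builder.withLimit;

DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
Query q = new Query("Person");
PreparedQuery pq = ds.prepare(q);
// Force the query to run and return objects so we can be sure
// we've timed a full query.
List<Entity> entityList = new ArrayList<Entity>(pq.asList(withLimit(1000)));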
With this code, the Java code returns results in ~200ms; the Python code takes much longer, averaging >700ms. Both apps are on the same app id (with different versions), so they use the same datastore and should be on a level playing field.
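To make clear what "averaging" means here, this is roughly how a timing loop of this kind can be written. It is a minimal sketch, not my actual harness: `time_call`, `summarize`, and `fake_query` are hypothetical names, `fake_query` is only a stand-in for the real 1000-entity fetch, and `time.perf_counter` assumes a modern Python 3 interpreter (on an older runtime `time.time` would be the substitute).

```python
import time

def time_call(fn, repeats=10):
    """Call fn() `repeats` times and return each wall-clock duration in ms."""
    durations_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return durations_ms

def summarize(durations_ms):
    """Return (min, mean, max) so a single slow outlier is easy to spot."""
    return (min(durations_ms),
            sum(durations_ms) / len(durations_ms),
            max(durations_ms))

# Stand-in workload; in the real test this would be the 1000-entity fetch,
# e.g. lambda: list(q.Get(1000)).
def fake_query():
    return [i for i in range(1000)]

timings = time_call(fake_query, repeats=5)
fastest, mean, slowest = summarize(timings)
print("min %.3f ms, mean %.3f ms, max %.3f ms" % (fastest, mean, slowest))
```

Reporting the minimum and maximum next to the mean helps show whether the gap comes from every request being slower or from a few outliers such as cold starts.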
All my code is available here, in case I've missed any details: