A recent question regarding the datastore and how long a query should run got me thinking: has anyone compiled a good set of benchmarks demonstrating what "typical" results should be for datastore performance? I know that every entity kind will have different performance characteristics, but it would be great to see times for a few representative types of entities, so we know if we're doing something wrong. For example, if we see that our query is taking much longer than the benchmark, we might know to check for non-lazily-fetched relationships, or to verify that we are using the API properly to batch-fetch things, as in the sketch below.
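(For concreteness, this is the kind of batching I mean: a sketch against the low-level datastore API, where the "Employee" kind and the id range are made-up examples.)

    import com.google.appengine.api.datastore.*;
    import java.util.*;

    public class BatchFetchSketch {
        // Hypothetical example: fetch 100 entities of a made-up "Employee"
        // kind in one batched call instead of 100 separate ds.get() calls.
        static Map<Key, Entity> fetchBatched(DatastoreService ds) {
            List<Key> keys = new ArrayList<Key>();
            for (long id = 1; id <= 100; id++) {
                keys.add(KeyFactory.createKey("Employee", id));
            }
            // One round trip for all keys; calling ds.get(key) in a loop
            // would pay the per-call latency 100 times.
            return ds.get(keys);
        }
    }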
I think you can check the "typical" latencies for some operations on the system status page, but because of load balancing, on a real application this time can vary a lot depending on the application's current load. Indexes, entity size, number of records, etc. would also affect the results. It's hard to make a comprehensive set of tests.
I think the best way to get an idea of how an application will perform under load is to run a load test following the tips in this article: code.google.com/appengine/articles/load_test.html
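If you just want a rough number for your own kinds before setting up a full load test, a trivial timing loop over a representative query gives a first approximation. A sketch (the "Employee" kind and the limit of 100 are placeholders):

    import com.google.appengine.api.datastore.*;

    public class QueryTimingSketch {
        // Measures wall-clock time for one representative query; run it a
        // few times and look at the spread, since latency varies with load.
        static long timeQueryMillis(DatastoreService ds) {
            long start = System.currentTimeMillis();
            PreparedQuery pq = ds.prepare(new Query("Employee"));
            // asList is lazy; size() forces the results to be fetched.
            pq.asList(FetchOptions.Builder.withLimit(100)).size();
            return System.currentTimeMillis() - start;
        }
    }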
Recently I profiled simple App Engine datastore operations using AppWrench. I was mostly interested in API cost, but I measured system time too.
Surprisingly, the results were very repeatable, and they allowed me to build a simple cost model which seems to work well.
In short, it seems that the datastore API cost is calculated in the following way (at least for bulk operations):
Get: 10 API megacycles per entity (does not depend on entity size)
Put: 48 API megacycles per entity if a transaction is started (does not depend on entity size). If a transaction is not used, then you should add the cost of 'Commit' to the cost of 'Put' to get the right figure.
Delete: 0 API megacycles per entity (in other words, delete is free)
Commit: 48 API megacycles per entity + 20 megacycles for each new indexed property value and 40 megacycles for each changed indexed property value.
My benchmark only uses simple automatic single-property indexes, so I can only speculate that composite indexes add a further per-entity penalty (I plan to profile that later).
System time: increased slowly with the size and complexity of the transaction, but was generally between 200 and 500 milliseconds.
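To make the model concrete, here is a small helper that estimates the API cost of a non-transactional bulk put using the numbers above (the constants are my measured values, not official figures):

    public class DatastoreCostModel {
        // Megacycle costs as measured in my benchmark (not official numbers).
        static final int PUT_PER_ENTITY = 48;
        static final int COMMIT_PER_ENTITY = 48;
        static final int COMMIT_PER_NEW_INDEX_VALUE = 20;
        static final int COMMIT_PER_CHANGED_INDEX_VALUE = 40;

        // Outside a transaction, a put pays both the Put and the Commit
        // per-entity costs, plus the per-index-value penalties.
        static int estimatePutMegacycles(int entities,
                                         int newIndexValues,
                                         int changedIndexValues) {
            return entities * (PUT_PER_ENTITY + COMMIT_PER_ENTITY)
                    + newIndexValues * COMMIT_PER_NEW_INDEX_VALUE
                    + changedIndexValues * COMMIT_PER_CHANGED_INDEX_VALUE;
        }

        public static void main(String[] args) {
            // Example: 10 new entities, each with 3 indexed properties
            // (30 new index values): 10*(48+48) + 30*20 = 1560 megacycles.
            System.out.println(estimatePutMegacycles(10, 30, 0));
        }
    }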
So, basically, the conclusions are:
- cost is proportional to entity count
- cost does not depend on entity size
- cost of storing data is heavily affected by indexes
- deletes cost zero
Regards, Pavel