What is the performance cost of named keys or "pre-generated" keys in Google App Engine?

If you use named keys in Google App Engine, does this incur any additional cost? Put another way, is it any more expensive to create a new entity with a named key than with a randomly generated ID?

Along similar lines, I note that you can ask Google App Engine for a set of keys that it promises not to use as auto-generated keys. Would generating a large number of these keys reduce performance?

Both questions bother me for the following reason. Say Google App Engine is persisting entity A and so needs to create a key for it. Intuitively, it would seem that when a new key is randomly generated, Google App Engine would first need to check whether that key already exists. If it does, Google App Engine would need to generate another random key, and keep doing so until it produced a unique one. It would then assign this key to entity A. All well and good.

My problem is that this seems to imply that key generation requires some sort of application-level lock, which would be necessary while Google App Engine checks whether the randomly generated key already exists. This can't be right, as it wouldn't scale at all. What is wrong with my reasoning?

Since this was long, I will reiterate my 3 questions:

1. Does Google App Engine create an application-level lock when generating new keys?
2. Do named keys incur any additional cost over automatically generated keys? If so, what cost (constant, linear, exponential, ...)?
3. Does asking App Engine for keys that it promises not to use degrade key creation performance? If so, what would the cost be?

There is no intrinsic penalty to using a key name instead of an auto-generated ID, except the overhead of a (potentially) longer key on the entity and any ReferenceProperties that reference it.

In certain cases, in fact, using auto-allocated IDs can have a performance penalty. If you insert new entities at a very high rate (several hundred per second), all the new entities have IDs in the same range, so they will all be written to the same Bigtable tablet, which can cause contention and increased timeouts. The vast majority of apps never have to worry about this, though.

There's no performance impact to allocating as many IDs as you want: App Engine simply increases the ID counter by the number you request. (This is a simplification, but generally accurate.)

In answer to your concerns, App Engine doesn't randomly generate keys. It either uses an auto-allocated ID, which is allocated using a counter and thus guaranteed unique, or it uses the key you supplied. So, in answer to your 3 questions:

1. No.
2. Only in storage for the (potentially) longer keys.
3. No, and the cost is roughly O(1) regardless of how many you ask for.
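
To make the counter argument concrete, here is a minimal sketch in plain Python. It is a toy model, not App Engine code: the `IdCounter` class and its `allocate` method are hypothetical names invented for illustration. It shows why no existence check or retry loop is needed, and why reserving a range costs the same regardless of its size.

```python
class IdCounter:
    """Toy stand-in for a per-kind ID counter (hypothetical, not App Engine code)."""

    def __init__(self):
        self.next_id = 1

    def allocate(self, count=1):
        """Reserve `count` consecutive IDs and return the inclusive (first, last) range."""
        if count < 1:
            raise ValueError("count must be at least 1")
        first = self.next_id
        self.next_id += count  # one addition, however large `count` is
        return first, self.next_id - 1


counter = IdCounter()
reserved = counter.allocate(1000)  # pre-generate a range for the application
auto_id, _ = counter.allocate()    # the next auto-assigned ID starts after that range
```

Because the counter only ever moves forward, an ID handed out later can never fall inside a range reserved earlier, so uniqueness follows without looking at any existing entity.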

Really? That is interesting. I don't quite know about the Bigtable tablet stuff, but ignoring that, it just moves the question up one level: how is this counter incremented on an application-wide level without locking the whole application for a period of time?

Stephen Cagle 2009-11-26 21:58:56

The current maximum value for the counter is stored in Bigtable. Each appserver caches counter values in blocks, so when you request an ID (or create an entity), the vast majority of the time the appserver is able to allocate one from its locally cached set. When it runs out, it simply updates Bigtable, allocating itself another large block.

Nick Johnson 2009-11-26 22:13:30
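
The block-caching scheme described in the comment above can be sketched roughly as follows. This is an illustrative model under stated assumptions, not the actual App Engine implementation: `CentralCounter` stands in for the value stored in Bigtable, `AppServer` for a single appserver, and the block size of 1000 is an arbitrary choice.

```python
import threading


class CentralCounter:
    """Hypothetical stand-in for the counter value persisted in Bigtable."""

    def __init__(self):
        self._lock = threading.Lock()
        self._max_allocated = 0
        self.updates = 0  # how many times the shared counter was touched

    def reserve_block(self, size):
        # The only synchronized step: bump the stored maximum by `size`.
        with self._lock:
            first = self._max_allocated + 1
            self._max_allocated += size
            self.updates += 1
            return first, self._max_allocated


class AppServer:
    """Hands out IDs from a locally cached block, refilling when it runs dry."""

    def __init__(self, central, block_size=1000):
        self._central = central
        self._block_size = block_size
        self._next = 1
        self._last = 0  # empty block: forces a refill on first use

    def next_id(self):
        if self._next > self._last:
            self._next, self._last = self._central.reserve_block(self._block_size)
        allocated = self._next
        self._next += 1
        return allocated


central = CentralCounter()
server_a = AppServer(central)
server_b = AppServer(central)

ids_a = [server_a.next_id() for _ in range(1500)]
ids_b = [server_b.next_id() for _ in range(1500)]
```

In this model the 3000 IDs are all distinct, yet the shared counter is updated only four times, once per block. One side effect of the design is that IDs are unique but not handed out in strict creation order across servers.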
