Should I care about locality of entities on the Google App Engine datastore? Should I use custom entity key names for that?

For example, I could use "$article_uuid,$comment_id" as the key name of a Comment entity. Will it improve the speed of fetching all comments for an article? Or is it better to use shorter keys?

Is it a good practice to use the key in this way? I could use the "$article_uuid,$comment_id" key name also instead of an index:

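from google.appengine.ext import db

def get_comments(article_uuid, limit=1000):
    # Key names are "<article_uuid>,<comment_id>", so an article's comments lie
    # between "<article_uuid>" and "<article_uuid>" + the character after ','.
    key_prefix = db.Key.from_path('Comment', article_uuid)
    # A str cannot be added to a db.Key, so build the upper bound from the name.
    range_end = db.Key.from_path('Comment', article_uuid + chr(ord(',') + 1))
    q = Comment.gql("where __key__ > :key_prefix and __key__ < :range_end",
        key_prefix=key_prefix, range_end=range_end)
    return q.fetch(limit)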
+1  A: 

The locality of your data will be improved with your key_name scheme (ref, see slide 40) - since your key_name is prefixed with the corresponding article's ID, comments for a given article should be stored near each other.
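
To illustrate why the prefix matters, here is a minimal sketch in plain Python, not the datastore API. It assumes entities are kept in key order, which is what the locality argument relies on; a sorted list stands in for that storage, and `comment_key_name` and `prefix_range` are hypothetical helpers made up for this example.

```python
import bisect

def comment_key_name(article_uuid, comment_id):
    # Hypothetical helper: builds the "$article_uuid,$comment_id" key name.
    return "%s,%s" % (article_uuid, comment_id)

def prefix_range(sorted_names, article_uuid):
    # Every key name for the article sorts between "<uuid>," and "<uuid>-",
    # because '-' is the character immediately after ',' in ASCII.
    start = bisect.bisect_left(sorted_names, article_uuid + ",")
    end = bisect.bisect_left(sorted_names, article_uuid + chr(ord(",") + 1))
    return sorted_names[start:end]

# A plain sorted list standing in for key-ordered storage (an assumption).
names = sorted(
    comment_key_name(a, c)
    for a, c in [("a2", 1), ("a1", 2), ("a3", 1), ("a1", 1), ("a2", 2)]
)
print(prefix_range(names, "a1"))
```

Once sorted, the comments of one article form a single contiguous run, so they can be read with one range scan instead of scattered lookups. Note that the comment IDs are compared as strings here, so "10" sorts before "2" unless the IDs are zero-padded.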

The key_name you proposed doesn't seem like it would be too long. I don't think you'll see too much difference between that and shorter keys in terms of storage space or serialization/deserialization time. I expect that the size of the Comment entity will be dominated by the rest of the entity.

David Underhill
Unfortunately, I can't seem to dig up a copy of either a transcript or a video of that presentation. If you're interested in reading more, there are a lot of good articles [here](http://code.google.com/appengine/articles/).
David Underhill
Yes, entities with common prefixes are likely to end up on the same tablet - but that doesn't mean they'll be loaded any faster. Entities on different tablets are likely to be on different tabletservers, and so will be loaded in parallel from each server.
Nick Johnson