ansaurus

Question

Speed up fetching a lot of entities and getting unique values (Google App Engine, Python)

Answer 1

1) One trick to make this query fast is to denormalize your data. Specifically, create another model which simply stores a link as the key. Then you can get a list of unique links by simply reading everything in that table. Assuming that you have many LinkRating2 entities for each link, then this will save you a lot of time. Example:

from google.appengine.ext import db

class Link(db.Model):
    pass  # the only data in this model is stored in its key

# Whenever a link is added, try to add it to the datastore.  If it already
# exists, this is functionally a no-op: it just overwrites the old copy of
# the same link.  Using the link as the key_name ensures there are no duplicates.
Link(key_name=link).put()

# Get all the unique links by retrieving every Link entity and extracting
# its key name.  You'll need to use cursors if you have >1,000 entities.
unique_links = [x.key().name() for x in Link.all().fetch(1000)]

Another idea: If you need to do this query frequently, then keep a copy of the results in memcache so you don't have to read all of this data from the datastore all the time. A single memcache entry can only store 1MB of data, so you may have to split your links data into chunks to store it in memcache.

2) It is faster to use fetch() than the iterator. The iterator fetches entities in "small batches", and each "small batch" costs another round-trip to the datastore. If you use fetch(), you get all the results you asked for (up to the limit you pass) with just one round-trip to the datastore. In short, use fetch() if you know you are going to need lots of results.

David Underhill 2010-05-24 23:18:37
