How can I cache a Reference Property in Google App Engine?

For example, let's say I have the following models:

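from google.appengine.ext import db

class Few(db.Model):
    year = db.IntegerProperty()

# Few must be defined before Many so the reference can name it.
class Many(db.Model):
    few = db.ReferenceProperty(Few)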

Then I create many Many's that point to only one Few:

# get_or_insert requires a key name as its first argument.
one_few = Few.get_or_insert('2009', year=2009)
# Create six Many entities that all reference the same Few.
for _ in range(6):
    Many(few=one_few).put()

Now, if I want to iterate over all the Many's, reading their few value, I would do this:

for many in Many.all().fetch(1000):
  print "%s" % many.few.year

The question is:

  • Will each access to many.few trigger a database lookup?
  • If yes, is it possible to cache somewhere, as only one lookup should be enough to bring the same entity every time?


As noted in one comment: I know about memcache, but I'm not sure how I can "inject it" when I'm calling the other entity through a reference.

In any case, memcache wouldn't be useful here, as I need caching within a single execution, not between executions. Using memcache wouldn't help optimize this call.

+1  A: 

The question is:

  1. Will each access to many.few trigger a database lookup? Yes, though I'm not sure whether it's one call or two.
  2. If yes, is it possible to cache somewhere, as only one lookup should be enough to bring the same entity every time? You should be able to use memcache for this. It lives in the google.appengine.api.memcache module.

Details for memcache are in http://code.google.com/appengine/docs/python/memcache/usingmemcache.html

AutomatedTester
Thanks AutomatedTester. I know about memcache, but I'm not sure how I could "inject it" when I'm calling the other entity through a reference.
Fh
Double-checking: in any case, memcache wouldn't be useful, as I need caching within a single execution, not between executions. Using memcache wouldn't help optimize this call (it would make it even slower) :(
Fh
Actually, a memcache fetch takes less time than a datastore fetch - so it would still help.
Nick Johnson
+5  A: 

The first time you dereference any reference property, the entity is fetched - even if you'd previously fetched the same entity associated with a different reference property. This involves a datastore get operation, which isn't as expensive as a query, but is still worth avoiding if you can.

There's a good module that adds seamless caching of entities available here. It works at a lower level of the datastore, and will cache all datastore gets, not just dereferencing ReferenceProperties.

If you want to resolve a bunch of reference properties at once, there's another way: You can retrieve all the keys and fetch the entities in a single round trip, like so:

keys = [MyModel.ref.get_value_for_datastore(x) for x in referers]
referees = db.get(keys)
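
When many referers share the same target, as in the question, the key list above contains duplicates. A minimal sketch of deduplicating the keys before the batched fetch follows. The name resolve_references and its get_key and batch_get parameters are hypothetical, not part of the db API. On App Engine you would presumably pass Many.few.get_value_for_datastore and db.get, assuming db.Key objects are hashable.

```python
def resolve_references(referers, get_key, batch_get):
    """Return the referenced entity for each referer, using one batched fetch.

    get_key(referer) -> the stored key, without dereferencing the property.
    batch_get(keys)  -> list of entities in the same order as keys.
    Both callables are assumptions supplied by the caller.
    """
    keys = [get_key(r) for r in referers]

    # Deduplicate while keeping first-seen order, so a target shared by
    # many referers is requested only once.
    unique_keys = []
    seen = set()
    for key in keys:
        if key not in seen:
            seen.add(key)
            unique_keys.append(key)

    # One round trip for all distinct keys.
    entities = batch_get(unique_keys)
    by_key = dict(zip(unique_keys, entities))

    # Hand back one entity per referer, in the original order.
    return [by_key[key] for key in keys]
```

With the models from the question, something like resolve_references(Many.all().fetch(1000), Many.few.get_value_for_datastore, db.get) should fetch the single Few once instead of once per Many.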

Finally, I've written a library that monkeypatches the db module to locally cache entities on a per-request basis (no memcache involved). It's available here. One warning, though: it has unit tests, but it's not widely used, so it could be broken.
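
To illustrate the general idea of per-request local caching (this is not the library's actual code, just a rough sketch), a plain dictionary keyed by entity key is enough. RequestLocalCache and its loader argument are hypothetical names. On App Engine the loader would presumably be db.get, and the cache object would be created at the start of each request and discarded at the end.

```python
class RequestLocalCache(object):
    """Hypothetical in-memory cache meant to live for a single request."""

    def __init__(self, loader):
        # loader(key) -> entity; an assumption, e.g. db.get on App Engine.
        self._loader = loader
        self._entities = {}

    def get(self, key):
        # Call the loader only the first time a key is seen; later
        # lookups for the same key are served from the dictionary.
        if key not in self._entities:
            self._entities[key] = self._loader(key)
        return self._entities[key]
```

Usage would look roughly like cache.get(Many.few.get_value_for_datastore(many)) inside the loop, in place of many.few, so that repeated references to the same Few cost only one datastore get.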

Nick Johnson
I understand that the first time you call model1.reference, it will load the referenced entity, and afterwards it stays loaded. My problem is when I call model2.reference: if both references point to the same entity, will App Engine notice this and avoid going to the datastore again?
Fh
I checked the module, and it seems to handle the memcache part. But memcache won't be useful for the case I described :(. Thanks for the reference anyway, it will be useful.
Fh
No, App Engine will make a second round-trip. The caching library will turn that into a memcache round-trip, which is an improvement - but it could certainly use enhancing to cache locally on a per-request basis, too. I've written such a library - I'll update my answer.
Nick Johnson
That library looks dangerously attractive. Haven't you found any bugs since Jan 5? (Thanks!)
Fh
I haven't used it extensively myself, I'm sorry to say.
Nick Johnson