I'm writing a small program to record reading progress. The data models are simple:

from google.appengine.ext import db

class BookState(db.Model):
    isbn  = db.StringProperty()
    title = db.StringProperty(required=True)
    pages = db.IntegerProperty(required=True)
    img   = db.StringProperty()

class UpdatePoint(db.Model):
    book = db.ReferenceProperty(BookState)
    date = db.DateProperty(required=True)
    page = db.IntegerProperty(required=True)

The UpdatePoint class records how many pages the user had read by the corresponding date. Now I want to draw a chart from the data stored in the App Engine datastore. The function looks like this:

book = db.get(bookkey)
ups = book.updatepoint_set
ups.order('date')

for (i, up) in enumerate(ups):
    if i == 0: continue

    # code begin
    days = (up.date - ups[i-1].date).days
    pages = up.page - ups[i-1].page
    # code end

    # blah blah

For a book with about 40 update points, the code takes more than 4 seconds to run. After timing it, I found that the snippet between the "code begin" and "code end" comments seems to be the cause of the poor performance: each loop iteration takes about 0.08 seconds or more.

It seems that UpdatePoint entities are fetched lazily, so they are not loaded until they are needed. Is there a better way to speed up the data access, such as fetching the data in a batch?

Many thanks for your reply.

+3  A: 

It seems I was using the Query class the wrong way. I need to call fetch() on the query first to get the data. Now the code is a lot faster than before:

book = db.get(bookkey)
q = book.updatepoint_set
q.order('date')
ups = q.fetch(50)
ZelluX
I thought that was what your updatepoint_set function might be doing, as that isn't an App Engine call that I have ever seen.
AutomatedTester
Maybe updatepoint_set is the automagically generated back reference you get when using a Reference? http://code.google.com/appengine/docs/python/datastore/entitiesandmodels.html#References
Peter Recore
Damn, you figured it out before I could answer. ;) updatepoint_set is a Query object, and indexing it executes the query afresh each time. Fetching it once gives you a list, as expected.
Nick Johnson
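To make the point from the comments concrete, here is a minimal sketch that counts how often a lazy query gets executed in each version of the loop. FakeQuery is a hypothetical toy stand-in written for this illustration, not the App Engine Query class; it only assumes the behaviour described above, namely that iterating or indexing runs the query again while fetch() runs it once.

```python
class FakeQuery(object):
    """Hypothetical stand-in for a lazy query (NOT the App Engine API)."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.executions = 0  # number of simulated datastore round trips

    def _run(self):
        # Every call here stands for one full query execution.
        self.executions += 1
        return list(self._rows)

    def __iter__(self):
        return iter(self._run())

    def __getitem__(self, index):
        # Assumption taken from the discussion: indexing re-runs the query.
        return self._run()[index]

    def fetch(self, limit):
        return self._run()[:limit]


def loop_with_indexing(query):
    """Mirrors the slow loop: two index lookups per iteration."""
    for i, row in enumerate(query):
        if i == 0:
            continue
        prev_for_days = query[i - 1]   # stands in for ups[i-1].date
        prev_for_pages = query[i - 1]  # stands in for ups[i-1].page
    return query.executions


def loop_with_fetch(query, limit=50):
    """Mirrors the fast loop: fetch once, then index a plain list."""
    rows = query.fetch(limit)
    for i, row in enumerate(rows):
        if i == 0:
            continue
        prev_for_days = rows[i - 1]
        prev_for_pages = rows[i - 1]
    return query.executions


slow = loop_with_indexing(FakeQuery(range(40)))
fast = loop_with_fetch(FakeQuery(range(40)))
print(slow, fast)
```

With 40 rows the indexing version runs the simulated query 79 times (once for the iteration plus twice for each of the 39 remaining rows), while the fetch version runs it once. The real per-query cost depends on the datastore, so treat the counts as an illustration of the pattern rather than a measurement.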
A: 

From the look of the code, the slowdown appears to come from indexing into ups inside the loop, which has to go back out and look up the previous object each time. Have you tried something like this?

previous = None
for up in ups:
    if previous is not None:
        days = (up.date - previous.date).days
        pages = up.page - previous.page
    previous = up
AutomatedTester
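As a variation on the same idea, the previous/current pairing can also be written with zip once the update points are in a plain list. This is only a sketch: progress_deltas is a hypothetical helper name, and it assumes the points have already been fetched and sorted by date, represented here as simple (date, page) tuples rather than UpdatePoint entities.

```python
from datetime import date


def progress_deltas(points):
    """Return (days_elapsed, pages_read) for each consecutive pair.

    points: list of (date, page) tuples, already sorted by date.
    """
    deltas = []
    # zip(points, points[1:]) pairs each point with the one after it.
    for (prev_date, prev_page), (cur_date, cur_page) in zip(points, points[1:]):
        deltas.append(((cur_date - prev_date).days, cur_page - prev_page))
    return deltas


sample = [
    (date(2010, 1, 1), 10),
    (date(2010, 1, 3), 50),
    (date(2010, 1, 4), 65),
]
print(progress_deltas(sample))  # [(2, 40), (1, 15)]
```

Because the pairing works on an in-memory list, no lookup inside the loop goes back to the datastore.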