I know this is simple, but I couldn't figure it out. I am fetching all the instances for a given link name, and I want to read all of their rating2 values to perform a calculation. I believe that looping through the results makes an individual call for each instance, and it is slow: it takes 2 seconds for only 100 instances of the LinkRating2 class. How can I get all of the rating2 values for a given link name without a loop and populate a dictionary? Or, quite frankly, how can I make this code faster?

from google.appengine.ext import db

class LinkRating2(db.Model):
    user = db.StringProperty()
    link = db.StringProperty()
    rating2 = db.FloatProperty()

def sim_distance(link1, link2, tabl):
    # Get the list of shared_items
    si = {}

    # Build one query per link; tabl is the model class (e.g. LinkRating2)
    a = tabl.all().filter('link =', link1)
    b = tabl.all().filter('link =', link2)
    adic = {}
    bdic = {}

    # Fetch up to 10000 entities for each link
    aa = a.fetch(10000)
    bb = b.fetch(10000)

    # Populate the dictionaries: user -> rating2
    for itema in aa:
        adic[itema.user] = itema.rating2

    for itemb in bb:
        bdic[itemb.user] = itemb.rating2

EDIT:

OK, I debugged and realized the loop takes essentially 0 seconds; all of my time is in the query and fetch lines. The table has only 100 items, and it takes 2 seconds! How can it be this slow to fetch a few items out of a 100-item table, and how can I speed this up?

A: 

If you want to get rid of the loop, you can use

adic = dict(zip([itema.user for itema in aa], [itema.rating2 for itema in aa]))
bdic = dict(zip([itemb.user for itemb in bb], [itemb.rating2 for itemb in bb]))

This won't necessarily make your code faster. If you just want to improve performance, look at the psyco package, or look here.
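As a minimal sketch of the loop-free version, the dictionary can also be built in one pass over the fetched entities. The `Rating` namedtuple, the `ratings_by_user` helper, and the sample values are hypothetical stand-ins for the `LinkRating2` entities returned by `fetch()`, so the snippet runs without the datastore.

```python
from collections import namedtuple

# Hypothetical stand-in for fetched LinkRating2 entities (not the real db.Model).
Rating = namedtuple("Rating", ["user", "link", "rating2"])


def ratings_by_user(entities):
    """Map each entity's user to its rating2 value without an explicit for loop."""
    return dict((e.user, e.rating2) for e in entities)


# Sample data standing in for the result of a.fetch(10000).
aa = [Rating("alice", "link1", 4.0), Rating("bob", "link1", 2.5)]
adic = ratings_by_user(aa)
```

This does the same in-memory work as the original loop, so it is unlikely to change the running time by much.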

Ben Gartner
Can you use psyco on GAE?
Mattias Nilsson
Good point, you probably can't.
Ben Gartner
+3  A: 

Your app isn't making any more calls than it needs to be. The only RPCs occur when you do the .fetch() operations. Any source of slowness is likely elsewhere.

Nick Johnson
+1  A: 

If your concern is that an RPC is firing inside each loop iteration, I don't think it would be. You're using fetch to eager load your entities, and your model has no reference properties, so you should be doing exactly two queries, and no gets.

To track RPC volume and timing empirically, use Guido's Appstats framework. That will show you the total runtime of each script, and how much of it is consumed by RPC execution. You could also put a logging.debug before and after your loops to confirm that they run quickly.
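The logging suggestion could look roughly like the sketch below, which times the fetch and the loop separately. The `timed` helper and `fake_fetch` are hypothetical names introduced for illustration; `fake_fetch` only sleeps to imitate datastore latency, and in the real handler the first block would wrap the actual `a.fetch(10000)` call.

```python
import logging
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record and log the wall-clock time spent in the enclosed block."""
    start = time.time()
    try:
        yield
    finally:
        elapsed = time.time() - start
        results[label] = elapsed
        logging.debug("%s took %.3f s", label, elapsed)


def fake_fetch(n):
    # Hypothetical stand-in for a.fetch(n); the sleep imitates RPC latency.
    time.sleep(0.05)
    return [("user%d" % i, float(i)) for i in range(n)]


timings = {}
with timed("fetch", timings):
    rows = fake_fetch(100)
with timed("loop", timings):
    adic = dict(rows)
```

If the "fetch" entry dominates, as the asker's edit suggests it does, the time is going to the datastore call rather than to the Python loop, and Appstats is the better tool for digging further.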

Drew Sears
+1 for AppStats. Nothing like having some hard data to act on.
Adam Crossland