views: 90
answers: 1

Hi,

I am reading a "table" in Python on GAE that has 1000 rows, and the program stops because the time limit is reached (so it takes at least 20 seconds). Is it possible that GAE is that slow? Is there a way to fix this? Is it because I use the free service and do not pay for it?

Thank you.

The code itself is this:

    liststocks = []
    userall = user.all()  # query over the user model (three fields, username...); trying to optimise from this line
    stocknamesall = stocknames.all()  # query over stocknames (one field, the stock's name); trying to optimise here too
    for u in userall:  # userall has 1000 users
        for stockname in stocknamesall:  # 4 stocks
            astock = stocksowned()  # another model ("table"); not relevant, I think
            astock.quantity = random.randint(1, 100)
            astock.nameid = u.key()
            astock.stockid = stockname.key()
            liststocks.append(astock)
+8  A: 

GAE is slow when used inefficiently. Like any framework, you sometimes have to know a little about how it works in order to use it efficiently. Luckily, I think there is an easy improvement that will help your code a lot.

It is faster to use fetch() explicitly instead of using the iterator. The iterator causes entities to be fetched in "small batches" - each "small batch" results in a round-trip to the datastore to get more data. If you use fetch(), then you'll get all the data at once with just one round-trip to the datastore. In short, use fetch() if you know you are going to need lots of results.
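
As a rough sketch of the difference (the batch size is illustrative; Nick Johnson's comment below implies roughly 20 entities per batch, and process() is just a hypothetical stand-in for per-entity work):

    # Iterator: results arrive lazily in small batches, so iterating
    # over 1000 entities costs many round-trips (~1000/20 = 50 here).
    for u in user.all():
        process(u)

    # fetch(): one explicit round-trip returns up to `limit` results.
    for u in user.all().fetch(1000):
        process(u)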

In this case, using fetch() will help a lot - you can get all your users and all your stocknames in one round-trip to the datastore each. Right now you're making lots of extra round-trips to the datastore, and because the inner loop iterates over the stocknamesall query object, you're re-running that query and re-fetching the stockname entities for every user too!

Try this (you said your table has 1000 rows, so I use fetch(1000) to make sure you get all the results; use a larger number if needed):

    userall = user.all().fetch(1000)
    stocknamesall = stocknames.all().fetch(1000)
    # rest of the code as-is

To see where you could make additional improvements, please try out AppStats so you can see exactly why your request is taking so long. You might even consider posting a screenshot (like this) of the AppStats info about your request along with your post.
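
For reference, turning AppStats on in the Python runtime is mostly a one-file change; this sketch follows the AppStats docs linked in the comments below:

    # appengine_config.py - wrap every WSGI application in the
    # AppStats recorder so each request's RPCs get profiled.
    def webapp_add_wsgi_middleware(app):
        from google.appengine.ext.appstats import recording
        return recording.appstats_wsgi_middleware(app)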

David Underhill
Using your suggestion, it does not time out now.
Aftershock
Great! If you want to try to improve performance more, definitely check out [AppStats](http://code.google.com/appengine/docs/python/tools/appstats.html) - it is very informative and pretty easy to use.
David Underhill
+1 for a very good answer. Especially if it gets people to start posting AppStats screenshots.
Peter Recore
This is not the complete solution, right? What if the number of rows grows above 1000? Rewriting the code does not seem like an elegant solution to me. (See the cursor-based sketch after this thread.)
Aftershock
"a lot" is a bit of an understatement. The OPs code is doing (1000/20)+1000*4 = 4050 datastore operations. The optimized version does 2.
Nick Johnson
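
On the more-than-1000-rows concern above: with the classic google.appengine.ext.db API you can page past any fixed fetch limit using query cursors. A minimal sketch, assuming that API (fetch_all is a hypothetical helper name, not part of the SDK):

    def fetch_all(query, batch_size=1000):
        # Drain a db.Query in batches, resuming each fetch from the
        # cursor left by the previous one, so no fixed limit is baked in.
        results = []
        batch = query.fetch(batch_size)
        while batch:
            results.extend(batch)
            query.with_cursor(query.cursor())  # continue after the last result
            batch = query.fetch(batch_size)
        return results

    userall = fetch_all(user.all())
    stocknamesall = fetch_all(stocknames.all())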