If I should be approaching this problem with a different method, please suggest one. I am building an item-based collaborative filter. I populate the datastore with the LinkRating2 class below, and for each link there are more than 1,000 users whose ratings I need to fetch to perform calculations, which I then use to build another table. So I need to retrieve more than 1,000 entities for a given link.

For instance, if over 1,000 users rated 'link1', there will be over 1,000 instances of this class with that link property that I need to fetch.
How would I complete this example?

from google.appengine.ext import db

class LinkRating2(db.Model):
    user = db.StringProperty()
    link = db.StringProperty()
    rating2 = db.FloatProperty()

query = LinkRating2.all()
link1 = 'link string name'
a = query.filter('link =', link1)
aa = a.fetch(1000)  # how would I get more than 1000 for a given link1 as shown?


## Key-based example from another post that counts past 1,000; I need a method for a filtered subset, though, not the whole table by key.
class MyModel(db.Expando):
    @classmethod
    def count_all(cls):
        """
        Count *all* of the rows (without maxing out at 1000)
        """
        count = 0
        query = cls.all().order('__key__')
        while count % 1000 == 0:
            current_count = query.count()
            if current_count == 0:
                break
            count += current_count

            if current_count == 1000:
                last_key = query.fetch(1, 999)[0].key()
                query = query.filter('__key__ > ', last_key)

        return count
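
A sketch of adapting that recipe to a filtered subset rather than a count, paging on __key__ in batches of 1,000 (untested, and mainly relevant to SDKs that still enforce the 1,000-result cap):

def fetch_all_for_link(link_value):
    """Collect every LinkRating2 entity for one link, 1,000 at a time."""
    results = []
    query = LinkRating2.all().filter('link =', link_value).order('__key__')
    while True:
        batch = query.fetch(1000)
        results.extend(batch)
        if len(batch) < 1000:
            break
        # restart the query just past the last key we saw
        query = (LinkRating2.all()
                 .filter('link =', link_value)
                 .filter('__key__ >', batch[-1].key())
                 .order('__key__'))
    return results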
+1  A: 

Wooble points out that the 1,000-entity limit is a thing of the past now, so you don't actually need cursors for this: just fetch everything at once. It will also be faster than getting the results in 1,000-entity batches, since there are fewer round trips to the datastore.

The 1,000-entity limit was removed in SDK version 1.3.1: http://googleappengine.blogspot.com/2010/02/app-engine-sdk-131-including-major.html
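
Applied to the question's model, the cursor-free version is roughly as follows (fetch() still takes a limit argument, so an arbitrarily large value is used here; the user-to-rating dictionary is just an illustration of what you might do with the results):

# applying the cursor-free approach to the question's model
query = LinkRating2.all().filter('link =', link1)
ratings = query.fetch(100000)  # arbitrary large limit; the 1,000 cap no longer applies

# e.g. build a user -> rating map for the collaborative-filter calculations
ratings_by_user = dict((r.user, r.rating2) for r in ratings)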

Old solution using cursors:

Use query cursors to fetch results beyond the first 1,000 entities:

# continuing from your code ... get ALL of the query's results:
more = aa
while len(more) == 1000:
    a.with_cursor(a.cursor())  # restart the query where the last fetch left off
    more = a.fetch(1000)       # get the next 1000 results
    aa = aa + more             # append the additional results to aa
David Underhill
+1  A: 

The 1000-entity fetch limit was removed recently; you can fetch as many as you need, provided you can do so within the time limits. Your entities look like they'll be fairly small, so you may be able to fetch significantly more than 1000 in a request.

Wooble
+1: Wow, I had completely overlooked that improvement! Thanks Wooble.
David Underhill
So what does a fetch-all command look like then, if I don't know the size and want to fetch everything? I can't put in a number, and I can't leave fetch() blank.
You still need to give a limit. If you really want to fetch everything, use an arbitrarily large integer. Note that if you have a huge number of entities, you're not going to be able to fetch them all before you hit a DeadlineExceededError anyway, and a view showing all of your records is unlikely to be very usable for large datasets.
Wooble
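
Concretely, on a recent SDK either of these should retrieve every rating for a link (the large limit is just an illustrative number, and iterating the query instead of calling fetch() is an alternative not shown above; the DeadlineExceededError caveat applies to both):

# Option 1: fetch() with an arbitrarily large limit
all_ratings = LinkRating2.all().filter('link =', link1).fetch(1000000)

# Option 2: iterate the query; the datastore returns results in batches behind the scenes
all_ratings = []
for rating in LinkRating2.all().filter('link =', link1):
    all_ratings.append(rating)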