ansaurus

Question

Google App Engine - getting count of records that match criteria over 1000

Answer 1

+2 A:

The behavior of Query.count() is inconsistent with the documentation when no limit is explicitly specified - the documentation indicates that it will count "until it finishes counting or times out." GAE Issue 3671 reported this bug (about 3 weeks ago).

The workaround: pass an explicit limit, and that value is used instead of the default of 1,000.

Testing on http://shell.appspot.com demonstrates this:

# insert 1500 TestModel entities ...
# ...
>>> TestModel.all(keys_only=True).count()
1000L
>>> TestModel.all(keys_only=True).count(10000)
1500L

I also see the same behavior on the latest version of the development server (1.3.7) using this simple test app:

from google.appengine.ext import webapp, db
from google.appengine.ext.webapp.util import run_wsgi_app

class Blah(db.Model): pass

class MainPage(webapp.RequestHandler):
    def get(self):
        for _ in xrange(3):
            db.put([Blah() for _ in xrange(500)])  # can only put 500 at a time ...
        c = Blah.all().count()  # no explicit limit
        c10k = Blah.all().count(10000)  # explicit limit
        self.response.out.write('%d %d' % (c, c10k))
        # prints "1000 1500" on its first run

application = webapp.WSGIApplication([('/', MainPage)])

def main():
    run_wsgi_app(application)

if __name__ == '__main__':
    main()
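
The explicit-limit workaround can be wrapped in a small helper that also reports whether the cap was reached. This is only a sketch: capped_count is a hypothetical name, and it assumes the query object behaves like db.Query, whose count(limit) returns at most limit.

```python
def capped_count(query, cap=10000):
    """Count matching entities, but never report more than `cap`.

    Assumes `query.count(limit)` returns at most `limit`, as db.Query
    does. Asking for cap + 1 distinguishes "exactly cap" from
    "more than cap".
    """
    n = query.count(cap + 1)
    if n > cap:
        # More than `cap` entities exist; the true total is unknown.
        return cap, True
    # Fewer than or exactly `cap` entities: the count is exact.
    return n, False
```

Called as capped_count(TestModel.all(keys_only=True), cap=10000), it would be expected to return the exact count and False as long as fewer than 10,001 entities match.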

David Underhill 2010-09-26 19:24:06

I'll try your solution and see how far I get. The notion that you have to supply a limit to the count is patently absurd, but hopefully it will get resolved soon. Thank you kindly!

etc 2010-09-26 20:10:52

It's not absurd - counting costs O(n) time, and presumably there's an upper limit on how much time you are willing to spend counting?

Nick Johnson 2010-09-27 08:25:19

@David That's weird! (P.S. Your second example can't work, since a batch put is limited to 500 entities.)

systempuntoout 2010-09-27 10:54:05

@systempuntoout Good point. Unfortunately, the development server does allow batch puts of more than 500 entities (unlike the production server). I've tweaked the code so it would "work" on the production server too.

David Underhill 2010-09-27 12:14:10

Answer 2

A:

According to this App Engine blog post, the 1000-entity limit has only just been removed for count (and offset) in version 1.3.6. The limit had already been removed for fetch as of version 1.3.1. Upgrade to the latest version and the limit should be removed.

You do not need to cycle through results 1000 at a time (though you could, and it might even be more efficient); simply pass in the maximum number of results you'd like back:

    for m in MyModel.all().fetch(82000):
        pass  # ...

In versions before 1.3.1, the number passed in had to be less than or equal to 1000.
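
If you do prefer to cycle through results 1000 at a time, one way is to resume each batch from a query cursor. The following is a rough sketch rather than a tested App Engine recipe: iter_in_batches and make_query are hypothetical names, and it assumes the query supports db.Query's fetch(), cursor() and with_cursor() methods.

```python
def iter_in_batches(make_query, batch_size=1000):
    """Yield every matching entity, fetching `batch_size` at a time.

    `make_query` is a zero-argument callable returning a fresh query,
    for example MyModel.all. The query is assumed to support
    fetch(limit), cursor() and with_cursor(cursor) like db.Query.
    """
    cursor = None
    while True:
        query = make_query()
        if cursor is not None:
            query.with_cursor(cursor)  # resume where the last batch ended
        batch = query.fetch(batch_size)
        for entity in batch:
            yield entity
        if len(batch) < batch_size:
            break  # a short batch means nothing is left
        cursor = query.cursor()
```

Usage would look like: for m in iter_in_batches(MyModel.all): ...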

Cameron 2010-09-26 19:27:15

Ideally, upgrading to the latest version would be the solution. Unfortunately, there is a bug in the latest version which makes the behavior inconsistent with the documentation - count() will return at most 1,000 unless you explicitly supply a limit greater than 1,000.

David Underhill 2010-09-26 19:56:56

As Mr. Underhill stated, for whatever reason, bug or otherwise, a plain count on a query only produces 1000 even with the latest version.

etc 2010-09-26 20:11:40
