How would I go about implementing the queries required for pagination?

Basically, when page 1 is requested, get the first 5 entries. For page 2, get the next 5 and so on.

I plan to use this via the couchdb-python module, but that shouldn't make any difference to the implementation.

+1  A: 

This is what I have come up with so far: get the IDs of all posts, then retrieve the actual items for the first x IDs.

It's not terribly efficient, but more so than retrieving all the posts and then throwing most of them away. That said, to my surprise, it seemed to run quite quickly - I ran the posthelper.page() method 100 times and it took about 0.5 seconds.

I didn't want to post this in the actual question, so it wouldn't influence the answers as much - here's the code:

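from couchdb import Server

import config  # my own settings module (dbhost, dbname, perPage)
# Error404 is assumed to be defined elsewhere in the application

# Map function for a temporary view: emits the ID of every post document
allPostsUuid = """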
function(doc) {
if(doc.type == 'post'){
    emit(doc._id, null);
}
}
"""

class PostsHelper:
    def __init__(self):
        server = Server(config.dbhost)
        # __init__ must not return a value; keep the database handle on the instance
        self.db = server[config.dbname]


    def _getPostByUuid(self, uuid):
        return self.db.get(uuid)

    def page(self, number = 1):
        number -= 1 # start at zero offset
        start = number * config.perPage
        end = start + config.perPage

        allUuids = [
            x.key for x in self.db.query(allPostsUuid)
        ]
        ret = [
            self._getPostByUuid(x) for x in allUuids[start : end]
        ]

        if len(ret) == 0:
            raise Error404("Invalid page (%s results)" % (len(allUuids)))
        else:
            return ret
dbr
+7  A: 

The CouchDB HTTP View API gives plenty of scope to do paging efficiently.

The simplest method would use startkey and count (the count parameter was renamed to limit in CouchDB 0.9). Count is the maximum number of entries CouchDB will return for that view request, which is up to your design, and startkey is where you want CouchDB to start. When you request the view, the response also tells you how many entries there are in total (total_rows), allowing you to calculate how many pages there will be if you want to show that to users.

So the first request would not specify a startkey, just the count for the number of entries you want to show. You can then note the key of the last entry returned and use that as the start key for the next page. In this simple form, you will get an overlap, where the last entry of one page is the first of the next. If this is not desirable it is trivial to simply not display the last entry of the page.
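As a rough sketch of the startkey approach described above: the snippet below fakes a view with a sorted in-memory list so it runs without a server. `query_view` and `fetch_page` are hypothetical helpers, not part of CouchDB or couchdb-python, and `limit` stands in for the count parameter. It asks for one row more than it displays, so the overlapping row is never shown and only supplies the next page's startkey.

```python
from bisect import bisect_left


def query_view(rows, startkey=None, limit=None):
    """Stand-in for a view request over rows sorted by key (assumption: no real server)."""
    keys = [key for key, _ in rows]
    # Jump straight to the first row whose key is >= startkey
    start = 0 if startkey is None else bisect_left(keys, startkey)
    end = len(rows) if limit is None else start + limit
    return rows[start:end]


def fetch_page(rows, per_page, startkey=None):
    """Return (rows to display, startkey for the next page or None)."""
    # Request one extra row: it is the overlap, so it is not displayed
    fetched = query_view(rows, startkey=startkey, limit=per_page + 1)
    page = fetched[:per_page]
    next_startkey = fetched[per_page][0] if len(fetched) > per_page else None
    return page, next_startkey


# Twelve fake post IDs standing in for the rows of a view
rows = sorted(("post-%02d" % i, None) for i in range(12))

page1, next_key = fetch_page(rows, 5)
page2, next_key2 = fetch_page(rows, 5, startkey=next_key)
```

With couchdb-python the same two parameters would presumably be passed as keyword arguments to `db.view(...)` for a view name of your own choosing; that call is not shown here because it needs a running database.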

An alternative that looks even simpler is to use the skip parameter to work out the starting document for the page. However, this method should be used with caution. The skip parameter simply makes the internal engine discard entries as it iterates over them. While this gives the desired behaviour, it is much slower than finding the first document for the page by key: the more documents that are skipped, the slower the request will be.
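To illustrate why skip gets slower on later pages, here is a small model of it. This is an assumption-laden sketch, not CouchDB's actual engine: `skip_for_page` and `query_with_skip` are hypothetical names, and the "view" is just a Python list. The point is only that every skipped row is still walked over before the page starts.

```python
def skip_for_page(page, per_page):
    """Rows to skip so that the 1-based page starts at the right row."""
    return (page - 1) * per_page


def query_with_skip(rows, skip, limit):
    """Walk the rows in order, discarding the first `skip` and keeping `limit`.

    Returns (kept rows, number of rows visited) so the cost is visible.
    """
    visited = 0
    result = []
    for row in rows:
        visited += 1
        if visited <= skip:
            continue  # iterated over, but not returned
        result.append(row)
        if len(result) == limit:
            break
    return result, visited


rows = list(range(1000))  # stand-in for 1000 view rows

early_rows, early_visited = query_with_skip(rows, skip_for_page(3, 5), 5)
late_rows, late_visited = query_with_skip(rows, skip_for_page(100, 5), 5)
```

In this model, page 3 visits 15 rows while page 100 visits 500 to return the same five entries, which is the growth the paragraph above warns about.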

kerrr
Aha! From that page you linked: the count parameter can be combined with the "skip=number of rows to skip". Perfect.
dbr
I've added the above info to your answer (for my reference if nothing else), hope you don't mind!
dbr
I edited it again. Using skip is not a good way of doing this in most cases.
kerrr
Ah, I thought this may be the case.. Is there a quicker way to find the x-th key?
dbr
No, to find the n-th entry you need to iterate over the index tree because you don't know how many entries a branch in the tree has. You can find a specific key much faster.
kerrr
+2  A: 

There's a utility library on GitHub that abstracts away the pagination work: http://github.com/cpinto/python-couchdb-paginator/tree/master

cpinto