ansaurus

Question

Answer 1

+1 A:

I'd suggest using

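# Positions are 1-based, so stop at total_hits + 1 to keep a final hit
# that starts a new page (e.g. hit 101 at 10 per page).
# xrange is Python 2; on Python 3 use range.
positions = ((n, n + hits_per_page - 1) for n in xrange(1, total_hits + 1, hits_per_page))
for start, end in positions:
    pass  # request hits start through end here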

and then not worry about whether end exceeds total_hits on the last page, unless the API you're using really cares whether you request something out of range; most will handle this case gracefully.
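If the API does turn out to care, a minimal sketch of the same idea with the last page clamped might look like this. The name page_ranges is made up for illustration, and it assumes 1-based, inclusive positions as above (written for Python 3, hence range rather than xrange):

```python
def page_ranges(total_hits, hits_per_page):
    """Yield 1-based, inclusive (start, end) pairs covering total_hits results."""
    for start in range(1, total_hits + 1, hits_per_page):
        # Clamp the last page so end never runs past the final hit.
        yield start, min(start + hits_per_page - 1, total_hits)


for start, end in page_ranges(25, 10):
    print(start, end)
# 1 10
# 11 20
# 21 25
```

Clamping with min() only changes the final pair, so it costs nothing when the API is lenient.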

P.S. Check out httplib2 as a replacement for the urllib/urllib2 combo.

Hank Gay 2010-05-05 00:59:04

A slice of fried gold, thank you. I doff my hat. Now, how do I 'virtually' doff my hat to your excellent input?

craigs 2010-05-05 01:14:22

The upvote was a nice start ;-)

Hank Gay 2010-05-05 11:48:27

Answer 2

+1 A:

It might be interesting to use some kind of generator for this scenario to iterate over the list.

def getitems(base_url, per_page=100):
    content = ...urllib...
    total_hits = get_total_hits(content)
    sofar = 0
    while sofar < total_hits:
        items_from_next_query = ...urllib...
        for item in items_from_next_query:
            sofar += 1
            yield item

This is mostly pseudocode, but it could prove quite useful if you need to do this many times: it hides the paging logic behind a generator that you iterate over just like a list, which is quite natural in Python.

It also saves you quite a bit of duplicated code.
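For what it's worth, here is a runnable sketch of that generator idea. It is not tied to any real API: iter_items, fetch_page, and fake_fetch are hypothetical names, and fetch_page(start, count) is assumed to return a (total_hits, items) pair for a 1-based start position. In real use it would wrap whatever HTTP call you make.

```python
def iter_items(fetch_page, per_page=100):
    """Lazily yield every item from a paginated source.

    fetch_page(start, count) is a hypothetical callable returning
    (total_hits, items) for the 1-based position start.
    """
    sofar = 0
    total_hits = None
    while total_hits is None or sofar < total_hits:
        total_hits, items = fetch_page(sofar + 1, per_page)
        if not items:
            # Stop rather than loop forever if a page comes back empty
            # before the reported total is reached.
            break
        for item in items:
            sofar += 1
            yield item


# Stand-in for a real HTTP request so the sketch runs without a network.
DATA = list(range(1, 251))

def fake_fetch(start, count):
    return len(DATA), DATA[start - 1:start - 1 + count]


items = list(iter_items(fake_fetch, per_page=100))
print(len(items))  # 250
```

Passing the fetch function in keeps the paging loop separate from the request code, so the same generator can be reused or tested without touching the network.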

xyld 2010-05-05 01:02:17

Paginating requests to an API
