Hey, first-time poster here, and I'm really stuck on httplib2. I've been reading up on it at diveintopython3.org, but it says nothing about timeouts. I looked at the documentation, but the only thing I can find is a timeout argument that takes an int, with no units specified (seconds? milliseconds? what's the default if it's None?). This is what I have (I also have code that checks the response and tries again, but it has never tried more than once):

h = httplib2.Http('.cache', timeout=None)
for url in list:
    response, content = h.request(url)
    # more stuff...

So the Http object stays around until some arbitrary time, but I'm downloading a ton of pages from the same server, and after a while it hangs while getting a page. No errors are thrown; it just hangs on that page. So then I tried:

h = httplib2.Http('.cache', timeout=None)
for url in list:
    try:
        response, content = h.request(url)
    except:
        h = httplib2.Http('.cache', timeout=None)
    # more stuff...

But then it creates another Http object every time (it always goes down the 'except' path). I don't understand how to keep making requests with the same object until it expires, and only then create another. Also, is there a way to set a timeout on an individual request?

Thanks for the help!

+1  A: 

Set the timeout to 1, and you'll pretty quickly know if it means one millisecond or one second.
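Here is a rough sketch of how you could run that experiment without hitting the real server. As far as I know httplib2 hands the timeout to the socket layer, where timeouts are in seconds, so I'd expect this to give up after roughly a second, but measure it rather than take my word for it. The helper names (start_silent_server, seconds_until_failure, demo_with_httplib2) are made up for this example, and the local server that never answers just stands in for a page that hangs.

```python
import socket
import time


def start_silent_server():
    """Listen on a free local port but never reply, so a client's read has to time out."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    return server, server.getsockname()[1]


def seconds_until_failure(fetch):
    """Call fetch() and return how long it ran before raising, or None if it succeeded."""
    start = time.monotonic()
    try:
        fetch()
    except Exception:
        return time.monotonic() - start
    return None


def demo_with_httplib2():
    """Time one httplib2 request with timeout=1 against the silent server."""
    import httplib2  # third-party; imported here so the helpers above need only the stdlib

    server, port = start_silent_server()
    try:
        h = httplib2.Http(timeout=1)
        url = "http://127.0.0.1:%d/" % port
        elapsed = seconds_until_failure(lambda: h.request(url))
        print("request gave up after", elapsed, "seconds")
    finally:
        server.close()
```

Call demo_with_httplib2() and look at the printed number: if it is around 1, the unit is seconds.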

I don't know what your try/except is supposed to solve. If it hangs on h.request(url) in one case, it should hang in the other as well, because a hang raises no exception for the except clause to catch.
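What would give the except clause something to catch is a finite timeout, so that a stalled request raises instead of blocking forever. A minimal sketch, assuming httplib2 lets socket.timeout propagate when a request times out (I believe it does by default, but check your version); request_with_retry is a hypothetical helper, and the 10 in the usage comment is an arbitrary value:

```python
import socket


def request_with_retry(request, url, attempts=3):
    """Call request(url), retrying only when the socket times out."""
    for attempt in range(1, attempts + 1):
        try:
            return request(url)
        except socket.timeout:
            # Give up and re-raise once the last attempt has also timed out.
            if attempt == attempts:
                raise


# Hypothetical usage with the Http object from the question:
#   h = httplib2.Http('.cache', timeout=10)
#   response, content = request_with_retry(h.request, url)
```

Catching only socket.timeout, rather than using a bare except, means other errors still surface instead of being silently swallowed, and the same Http object is reused for every attempt.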

If you run out of memory in that code, then the httplib2 objects aren't being garbage collected properly. It may be that you have circular references (although it doesn't look like it from the code above), or it may be a bug in httplib2.

Lennart Regebro