I am looking to download a file from an HTTP URL to a local file. The file is large enough that I want to download it and save it in chunks, rather than `read()` and `write()` the whole file as a single giant string.
The interface of `urllib.urlretrieve` is essentially what I want. However, I cannot see a way to set request headers when downloading via `urllib.urlretrieve`, which is something I need to do.
If I use `urllib2`, I can set request headers via its `Request` object. However, I don't see an API in `urllib2` to download a file directly to a path on disk the way `urlretrieve` does. It seems that instead I will have to use a loop to iterate over the returned data in chunks, writing them to a file myself and checking when we are done.
What would be the best way to build a function that works like `urllib.urlretrieve` but allows request headers to be passed in?
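For reference, here is a rough sketch of the manual loop I have in mind. The function name, the chunk size, and the Python 3 fallback import are my own placeholders, not anything from the standard library:

```python
try:
    # Python 3: urllib2's Request/urlopen moved to urllib.request
    from urllib.request import Request, urlopen
except ImportError:
    from urllib2 import Request, urlopen  # Python 2

def copy_in_chunks(src, dst, chunk_size=8192):
    """Copy from a file-like src to dst in fixed-size chunks,
    so the whole body is never held in memory at once."""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty read signals end of stream
            break
        dst.write(chunk)

def urlretrieve_with_headers(url, filename, headers=None):
    """Like urllib.urlretrieve, but with caller-supplied request headers."""
    req = Request(url, headers=headers or {})
    resp = urlopen(req)
    try:
        with open(filename, 'wb') as out:
            copy_in_chunks(resp, out)
    finally:
        resp.close()
```

Is a hand-rolled loop like this really the idiomatic way to do it, or is there something built in that I'm missing?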