I'm using Python to programmatically download a zip file from a web server. Downloading it with a web browser works fine. I've written this (partial) script:
response = urllib2.urlopen(url, data, 10)
the_page = response.read()
f = open(filename, 'w')
f.write(the_page)
f.close()
The request succeeds and I get data. The problem is that the file I'm downloading, a zip file, doesn't work; it appears to be corrupt. It seems to be about the right length, and when viewed in a text editor its content looks like that of a zip file. Here are the headers from the download:
Content-Length: 9891
Content-Disposition: Content-Disposition:attachment; filename="TrunkBackup_20101230.zip"
Date: Wed, 30 Dec 2009 12:22:08 GMT
Accept-Ranges: bytes
When I check the length of the response, it is correct at 9891. I suspect what's happening is that when I call response.read() the result is a string with carriage returns 'helpfully' normalized (say, \r to \n). When I write the file, the binary data is slightly wrong, and the zip file is corrupt.
My problem is (A) I'm not sure if I'm right, and (B) if I am right, how do I save the binary data itself?
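One way to check both points is sketched below. It rests on an assumption the post does not confirm: that the script runs on a platform (such as Windows) where a file opened with 'w' is in text mode and has its newline bytes translated on write. Since the response length already matches Content-Length, the write step is the more likely place for the bytes to change. The helper name save_binary, the file name check.zip, and the in-memory zip that stands in for the_page are all hypothetical, used only so the sketch runs without a server.

```python
import io
import os
import tempfile
import zipfile


def save_binary(data, filename):
    # 'wb' opens the file in binary mode, so no newline translation
    # is applied and the bytes reach the disk unchanged.
    with open(filename, 'wb') as f:
        f.write(data)


# Stand-in for the_page = response.read(): build a small zip in memory.
# The member text deliberately contains both \r\n and \n bytes.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('hello.txt', 'line one\r\nline two\n')
the_page = buf.getvalue()

filename = os.path.join(tempfile.mkdtemp(), 'check.zip')
save_binary(the_page, filename)

# If the write altered any bytes, the size on disk would differ from
# the in-memory length, or the archive would fail validation.
size_matches = os.path.getsize(filename) == len(the_page)
is_valid_zip = zipfile.is_zipfile(filename)
```

Applied to the script above, the same comparison would be between len(the_page) and os.path.getsize(filename) after the write; a larger size on disk would point to the open(filename, 'w') call rather than to response.read().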