I'm using

    data = urllib2.urlopen(url).read()

and I want to know:
How can I tell whether the data at a URL is gzipped?
Does urllib2 automatically decompress the data if it is gzipped, so that data is always a plain string?
This checks if the content is gzipped and decompresses it:
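
    from StringIO import StringIO
    import gzip
    import urllib2

    # urllib2 does not request gzip on its own, so ask for it explicitly.
    request = urllib2.Request(url)
    request.add_header('Accept-Encoding', 'gzip')
    response = urllib2.urlopen(request)
    if response.info().get('Content-Encoding') == 'gzip':
        buf = StringIO(response.read())
        f = gzip.GzipFile(fileobj=buf)
        data = f.read()
    else:
        data = response.read()
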
If you are talking about a simple .gz file, no, urllib2 will not decode it; you will get the unchanged .gz file as output.
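
For the plain .gz case, one way to tell whether the bytes you downloaded are gzip data is to look at the first two bytes, which are always 1f 8b for a gzip stream. The following is only a rough sketch; the helper names looks_gzipped and maybe_gunzip are made up for illustration and are not part of urllib2 or gzip.

```python
import gzip
import io

# First two bytes of every gzip stream (the "magic number" from RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def looks_gzipped(data):
    """Return True if the byte string starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def maybe_gunzip(data):
    """Decompress data if it looks like gzip; otherwise return it unchanged."""
    if looks_gzipped(data):
        return gzip.GzipFile(fileobj=io.BytesIO(data)).read()
    return data
```

You would call it as data = maybe_gunzip(urllib2.urlopen(url).read()). The check is a heuristic: uncompressed data that happens to start with those two bytes would be misdetected, so prefer the response headers when they are available.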
If you are talking about automatic HTTP-level compression using Content-Encoding: gzip or deflate, then that has to be deliberately requested by the client using an Accept-Encoding header.
urllib2 doesn't ask for compression in this header by default, so a well-behaved server will send the response back uncompressed. You can safely fetch the resource without having to worry about compression (though without compression the transfer may take longer).
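
If you do want compressed transfers, you have to send Accept-Encoding yourself and then undo whatever Content-Encoding the server reports. Here is a minimal sketch of that idea, not a definitive implementation. The names decode_body and fetch are hypothetical helpers, the try/except import is only there so the same code also runs on Python 3, and the raw-deflate fallback assumes some servers send deflate data without the zlib wrapper.

```python
import gzip
import io
import zlib

try:
    import urllib2  # Python 2, as used in the question
except ImportError:
    import urllib.request as urllib2  # Python 3 equivalent


def decode_body(raw, content_encoding):
    """Undo the HTTP Content-Encoding applied to a response body."""
    encoding = (content_encoding or "").strip().lower()
    if encoding == "gzip":
        return gzip.GzipFile(fileobj=io.BytesIO(raw)).read()
    if encoding == "deflate":
        try:
            # zlib-wrapped deflate, which is what the HTTP spec describes
            return zlib.decompress(raw)
        except zlib.error:
            # assumption: fall back to raw deflate without the zlib header
            return zlib.decompress(raw, -zlib.MAX_WBITS)
    # No (or unknown) encoding: hand the bytes back untouched.
    return raw


def fetch(url):
    """Fetch url, advertising gzip/deflate support, and return the decoded body."""
    request = urllib2.Request(url, headers={"Accept-Encoding": "gzip, deflate"})
    response = urllib2.urlopen(request)
    try:
        return decode_body(response.read(), response.info().get("Content-Encoding"))
    finally:
        response.close()
```

With this, data = fetch(url) gives you the decompressed bytes whether or not the server chose to compress the response.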