I had expected this to work:

>>> import urllib.request as r
>>> import zlib
>>> r.urlopen( r.Request("http://google.com/search?q=foo", headers={"User-Agent": "Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11", "Accept-Encoding": "gzip"}) ).read()
b'af0\r\n\x1f\x8b\x08...(long binary string)'
>>> zlib.decompress(_)
Traceback (most recent call last):
  File "<pyshell#87>", line 1, in <module>
    zlib.decompress(x)
zlib.error: Error -3 while decompressing data: incorrect header check

But it doesn't. Dive Into Python uses StringIO in this example, but that seems to be missing from Python 3. What's the right way of doing it?

+2  A: 

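In Python 3, StringIO is a class in the io module. Compressed data is bytes rather than text, though, so the class you want here is io.BytesIO (io.StringIO raises a TypeError if you pass it bytes).

So for the example you linked to, if you change:

import StringIO
compressedstream = StringIO.StringIO(compresseddata)

to:

import io
compressedstream = io.BytesIO(compresseddata)

it ought to work.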

lc
Is there any way of side-stepping wrapping the string in a StringIO object?
Andrey Fedorov
+2  A: 

It works fine with io.BytesIO and gzip. gzip and zlib use the same compression but with different headers/"wrapping", which is what the "incorrect header check" in your error message is telling you.

Excuse the short variable names (this is just from a quick test):

import io
import urllib.request as r
import gzip

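# Request a gzip-compressed response and read the raw (still compressed) bytes.
bs = r.urlopen( r.Request("http://google.com/search?q=foo", headers={"User-Agent": "Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11", "Accept-Encoding": "gzip"}) ).read()
# GzipFile reads from a file-like object, so wrap the bytes in a BytesIO.
bi = io.BytesIO(bs)
gf = gzip.GzipFile(fileobj=bi, mode="rb")
# Returns the decompressed body as bytes.
gf.read()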
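
If you would rather not wrap the bytes in a file-like object at all, here is a minimal sketch of two alternatives. It assumes Python 3.2 or later for gzip.decompress, and it uses locally compressed sample bytes (the names original and compressed are made up for illustration) in place of a real HTTP response, so it runs without a network connection. It also shows why the plain zlib.decompress call from the question fails on gzip data.

```python
import gzip
import zlib

# Stand-in for the bytes returned by urlopen(...).read() (hypothetical sample data).
original = b"<html>hello</html>" * 10
compressed = gzip.compress(original)

# With default settings zlib expects a zlib header, so gzip data is rejected
# with the same kind of zlib.error as in the question.
try:
    zlib.decompress(compressed)
    default_failed = False
except zlib.error:
    default_failed = True

# Passing wbits = 16 + zlib.MAX_WBITS tells zlib to expect a gzip header and trailer.
via_zlib = zlib.decompress(compressed, 16 + zlib.MAX_WBITS)

# gzip.decompress works directly on bytes, with no BytesIO wrapper.
via_gzip = gzip.decompress(compressed)
```

Both calls should give back the original bytes. Which one to prefer is mostly a matter of taste: gzip.decompress is the more readable of the two, while the wbits form stays within the zlib module.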
kaizer.se