ansaurus

Question

Answer 1

It's due to an imperfection in the pseudofile object implemented by the zipfile module (for the .open method of the ZipFile class introduced in Python 2.6). Consider:

>>> f = zf.open('data.pkl')
>>> f.read(1)
'('
>>> f.readline()
'dp1\n'
>>> f.read(1)
''
>>>

The sequence .read(1) followed by .readline() is what .load does internally on a protocol-0 pickle (the default in Python 2, and what you're using here). Unfortunately, zipfile's imperfection means this particular sequence doesn't work: it produces a spurious "end of file" (.read returning an empty string) right after the first read/readline pair.
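For comparison, here is a minimal sketch of the same read(1) / readline() / read(1) sequence against a well-behaved file object. It uses io.BytesIO as a stand-in and a small made-up dict as the payload (both are my choices for illustration, not part of the question), and asks for protocol 0 explicitly so it behaves the same way on Python 3:

```python
import io
import pickle

# Protocol 0 is requested explicitly; Python 3 defaults to a binary protocol.
payload = pickle.dumps({'a': 1}, 0)

f = io.BytesIO(payload)  # a well-behaved in-memory file object
first = f.read(1)        # the opening '(' of the protocol-0 pickle
rest = f.readline()      # the remainder of the first line, newline included
after = f.read(1)        # a correct file object still has data to return here

print(repr(first), repr(rest), repr(after))
```

The third value is the point: with a correct file object it is non-empty, whereas the buggy pseudofile returns an empty string at that step.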

Not sure offhand if this bug in Python's standard library is fixed in Python 2.7 -- I'm going to check.

Edit: just checked -- the bug is fixed in Python 2.7 rc1 (the release candidate that's currently the latest 2.7 version). I don't yet know whether it's fixed in the latest bug-fix release of 2.6 as well.

Edit again: the bug is still there in Python 2.6.5, the latest bug-fix release of Python 2.6 -- so if you can't upgrade to 2.7 and need better-behaving pseudofile objects from ZipFile.open, a backport of the 2.7 fix seems the only viable solution.
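If a backport isn't practical, one workaround that should sidestep the pseudofile object (a sketch under my own assumptions, not something I've verified against 2.6.5) is to pull the member's bytes out in one call with ZipFile.read and pass them to loads, so read and readline are never mixed on the zipfile object. The archive is built in memory here just to keep the example self-contained, and some_data is a placeholder payload:

```python
import io
import zipfile

try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle             # Python 3

some_data = {'name': 'example', 'values': [1, 2, 3]}  # placeholder payload

# Build a small archive in memory holding a protocol-0 pickle.
buf = io.BytesIO()
zf = zipfile.ZipFile(buf, 'w', zipfile.ZIP_DEFLATED)
zf.writestr('data.pkl', pickle.dumps(some_data, 0))
zf.close()  # explicit close: ZipFile is not a context manager on Python 2.6

# Read the whole member as one string of bytes, then unpickle from that string.
zf = zipfile.ZipFile(io.BytesIO(buf.getvalue()))
restored = pickle.loads(zf.read('data.pkl'))
zf.close()

print(restored == some_data)
```

The trade-off is that the whole member is held in memory at once, which is fine for modest pickles but may matter for very large ones.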

Note that it's not certain you do need better-behaving pseudofile objects; if you control the dump calls and can use the latest-and-greatest protocol, everything will be fine:

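>>> import cPickle, zipfile
>>> zf = zipfile.ZipFile('zipped_pickle.zip', 'w', zipfile.ZIP_DEFLATED)
>>> zf.writestr('data.pkl', cPickle.dumps(some_data, -1))
>>> zf.close()  # finish writing the archive before reading it back
>>> zf = zipfile.ZipFile('zipped_pickle.zip')
>>> sd2 = cPickle.load(zf.open('data.pkl'))
>>>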

Only the old, crufty, backwards-compatible "protocol 0" (the default) requires proper pseudofile object behavior when mixing read and readline calls in the load. Protocol 0 is also slower and produces larger pickles, so it's definitely not recommended unless backwards compatibility with old Python versions, or the ASCII-only nature of the pickles it produces, is a mandatory constraint in your application.
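To get a rough feel for the size difference, a quick comparison along these lines can help. The payload is made up for illustration, and the exact byte counts will vary with the data and the Python version, so treat it as a sketch rather than a benchmark:

```python
import pickle

# Hypothetical payload: a list of small integers plus a short string.
some_data = {'numbers': list(range(1000)), 'label': 'x' * 100}

size_p0 = len(pickle.dumps(some_data, 0))   # text-based protocol 0
size_hi = len(pickle.dumps(some_data, -1))  # highest protocol available

print(size_p0, size_hi)
```

On data like this the protocol-0 pickle comes out noticeably larger, because every integer is written as text on its own line rather than in a compact binary form.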

Alex Martelli 2010-06-09 15:04:47

load a pickle file from a zipfile
