I'm using Python 2.6.4 and discovered that I can't pass a gzip file object to subprocess the way I had hoped. This session illustrates the problem:

May 17 18:05:36> python
Python 2.6.4 (r264:75706, Mar 10 2010, 14:41:19)
[GCC 4.1.2 20071124 (Red Hat 4.1.2-42)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import gzip
>>> import subprocess
>>> fh = gzip.open("tmp","wb")
>>> subprocess.Popen("echo HI", shell=True, stdout=fh).wait()
0
>>> fh.close()
>>>
[2]+  Stopped                 python
May 17 18:17:49> file tmp
tmp: data
May 17 18:17:53> less tmp
"tmp" may be a binary file.  See it anyway?
May 17 18:17:58> zcat tmp

zcat: tmp: not in gzip format

Here's what it looks like inside less

HI
^_<8B>^H^Hh<C0><F1>K^B<FF>tmp^@^C^@^@^@^@^@^@^@^@^@

It looks as though the command's stdout was written to the file as plain text, followed by an empty gzip stream. Indeed, if I remove the "HI\n" line, I get this:

May 17 18:22:34> file tmp
tmp: gzip compressed data, was "tmp", last modified: Mon May 17 18:17:12 2010, max compression

What is going on here?

UPDATE: This earlier question is asking the same thing: http://stackoverflow.com/questions/2732811/can-i-use-an-opened-gzip-file-with-popen-in-python

+4  A: 

You can't use file-like objects with subprocess, only objects backed by a real file descriptor. The fileno() method of GzipFile returns the FD of the underlying file, so echo writes straight to that descriptor and its output is never compressed. When the GzipFile is then closed, it writes out an empty gzip stream.
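
A minimal sketch of the point above, assuming a writable temporary directory. The helper name gzip_shares_descriptor is made up for illustration. It opens a plain file, wraps it in a GzipFile, and compares the two descriptors:

```python
import gzip
import os
import tempfile


def gzip_shares_descriptor(path):
    """Return True if a GzipFile reports the same FD as the file it wraps.

    Hypothetical helper for illustration only.
    """
    raw = open(path, "wb")
    gz = gzip.GzipFile(fileobj=raw, mode="wb")
    try:
        # subprocess asks for fileno() and hands that descriptor to the child,
        # so the child writes to the raw file, not through the compressor.
        return gz.fileno() == raw.fileno()
    finally:
        gz.close()   # finishes the (empty) gzip stream
        raw.close()  # GzipFile does not close a fileobj it was given


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "tmp.gz")
    print(gzip_shares_descriptor(demo_path))
```

If the comparison holds, anything a child process writes to that descriptor lands in the file uncompressed, which is consistent with the "HI" seen ahead of the gzip header in the question.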

Ignacio Vazquez-Abrams
I guess I am piping through gzip then.
pythonic metaphor
A: 

You don't need subprocess to write to a gzip.GzipFile. Write to it like any other file-like object, and the result is gzipped automatically.

import gzip
fh = gzip.open("tmp", "wb")
fh.write(b"HI\n")  # write the data itself, not the shell command
fh.close()
unutbu
A: 

I'm not totally sure why this isn't working (perhaps the output redirection doesn't go through Python's write(), which is what gzip relies on?), but this works:

>>> fh.write(subprocess.Popen("echo Hi", shell=True, stdout=subprocess.PIPE).stdout.read())
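
The same idea as a small function, as a sketch rather than a definitive version. It assumes an echo executable is on the PATH and that the whole output fits in memory. The name gzip_command_output is hypothetical. It uses communicate() so the child is also waited on:

```python
import gzip
import os
import subprocess
import tempfile


def gzip_command_output(args, path):
    """Run args, capture its stdout, and write it gzipped to path.

    Hypothetical helper; buffers the whole output in memory.
    """
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    out, _ = proc.communicate()  # reads until EOF, then waits for the child
    fh = gzip.open(path, "wb")
    try:
        fh.write(out)  # goes through GzipFile.write(), so it is compressed
    finally:
        fh.close()
    return proc.returncode


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "tmp.gz")
    gzip_command_output(["echo", "HI"], demo_path)
    print(repr(gzip.open(demo_path, "rb").read()))
```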
Personman
A: 

just pipe that sucker

from subprocess import Popen, PIPE
GZ = Popen("gzip > outfile.gz", stdin=PIPE, shell=True)
P = Popen("echo HI", stdout=GZ.stdin, shell=True)
# these next three must be in order: let echo finish, close gzip's
# stdin so it sees EOF, then wait for gzip to finish writing
P.wait()
GZ.stdin.close()
GZ.wait()
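
If an external gzip process is not wanted, a variation is to pipe the child's stdout back into Python and compress it there in chunks. This is only a sketch under assumptions: echo is on the PATH, and the function name stream_command_to_gzip and the chunk size are arbitrary choices, not anything from the answers above.

```python
import gzip
import os
import shutil
import subprocess
import tempfile


def stream_command_to_gzip(args, path, chunk_size=64 * 1024):
    """Stream the stdout of args into a gzip file at path, chunk by chunk.

    Hypothetical helper; avoids holding the whole output in memory.
    """
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    fh = gzip.open(path, "wb")
    try:
        # Copy from the pipe to the GzipFile until the child closes its end.
        shutil.copyfileobj(proc.stdout, fh, chunk_size)
    finally:
        fh.close()
        proc.stdout.close()
    return proc.wait()


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "outfile.gz")
    stream_command_to_gzip(["echo", "HI"], demo_path)
    print(repr(gzip.open(demo_path, "rb").read()))
```

Compared with the two-process pipeline, this trades the shell and the external gzip for a copy loop in Python, which may be slower for very large outputs.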
amwinter