I need to write a Python script that retrieves tar.Z files from an FTP server and uncompresses them on a Windows machine. tar.Z, if I understood correctly, is the result of the compress command on Unix.

Python doesn't seem to know how to handle these: it's not gz, bz2, or zip. Does anyone know a library that would handle these?

Thanks in advance

A: 

Since you target a specific platform (Windows), the simplest solution may be to run gzip in a system call: http://www.gzip.org/#exe

Is there a requirement in your project that the decompression be done in Python?

Oleg
Well, there's a whole stack of scripts written in Python, so for clarity's sake I'd rather have everything in that language. But of course if there is no other way, I'm OK with whatever is available. I'll try gzip on Windows, thanks for the suggestion.
gdebure
+1  A: 

If GZIP -- the application -- can handle it, you have two choices.

  1. Try the Python gzip library. It may work, though it is written for the gzip format and may reject .Z input.

  2. Use subprocess Popen to run gzip for you.
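
For the second option, here is a minimal sketch. It assumes a gzip executable is installed and reachable on the PATH (the `gzip_exe` parameter and both function names are made up for illustration), and it uses `subprocess.run`, a convenience wrapper around Popen. `gzip -d -c` decompresses to standard output, which is redirected into the target file.

```python
import subprocess
from pathlib import Path


def gzip_command(archive, gzip_exe="gzip"):
    """Build the command line: -d decompresses, -c writes to stdout."""
    return [gzip_exe, "-d", "-c", str(archive)]


def uncompress_z(archive, gzip_exe="gzip"):
    """Decompress e.g. data.tar.Z to data.tar next to it; return the new path."""
    archive = Path(archive)
    target = archive.with_suffix("")  # drops only the trailing ".Z"
    with open(target, "wb") as out:
        # check=True raises CalledProcessError if gzip exits with an error.
        subprocess.run(gzip_command(archive, gzip_exe), stdout=out, check=True)
    return target
```

The resulting .tar file can then be opened with the standard tarfile module. Note that if gzip fails, an empty or partial target file may be left behind, so you may want to delete it in an exception handler.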

It may be an InstallShield .Z file. You may want to use InstallShield to unpack it and extract the .TAR file. Again, you may be able to use subprocess Popen to process the file.

It may also be a "LZW compressed file". Look at this library, it may help.

http://www.chilkatsoft.com/compression-python.asp

S.Lott
A: 

AFAIK, a pure Python module that uncompresses .Z files doesn't exist, but it's feasible to build one, given some knowledge of:

  • the .Z format header specification
  • the .Z compression format

Almost all the necessary information can be found in the unarchiver CompressAlgorithm. Additional info is available from Wikipedia on adaptive LZW, and perhaps from the compress man page.

Basically, you read the first three bytes (the first two are the magic bytes 0x1F 0x9D; the third holds the flags that parameterize your algorithm), and then start reading and decompressing.
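
As a sketch of that header step: to my knowledge the third byte carries the maximum code width in its low five bits and a "block mode" flag (whether a clear code is used) in its top bit. The function name and the returned dictionary keys are my own choices, not part of any library.

```python
MAGIC = b"\x1f\x9d"  # magic bytes of a compress (.Z) stream


def parse_z_header(header):
    """Parse the 3-byte .Z header into the decoder's parameters."""
    if len(header) < 3 or header[:2] != MAGIC:
        raise ValueError("not a compress (.Z) stream")
    flags = header[2]
    return {
        "max_bits": flags & 0x1F,          # largest code width, typically 16
        "block_mode": bool(flags & 0x80),  # True if the table can be reset
    }
```

You would call it with the first three bytes read from the file, e.g. `parse_z_header(f.read(3))` on a file opened in binary mode.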

There's a lot of bit fiddling: .Z files start out with 9-bit codes, grow up to 16-bit ones, and can reset the symbol table to its initial entries (the 256 byte values plus a reserved clear code), which you'll probably deal with using binary operations (&, <<=, etc.).
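
To give an idea of the bit fiddling, here is a small sketch that only unpacks fixed-width codes from a byte string, assuming they are packed least-significant-bit first (which, as far as I know, is how compress does it). It is not a decompressor: it ignores the symbol table, the growth from 9 to 16 bits, and the clear code. The name `read_codes` is hypothetical.

```python
def read_codes(data, width=9):
    """Yield codes of `width` bits packed least-significant-bit first."""
    bit_buffer = 0   # bits read from the input but not yet consumed
    bit_count = 0    # how many valid bits bit_buffer currently holds
    mask = (1 << width) - 1
    for byte in data:
        # Append the new byte above the bits that are still pending.
        bit_buffer |= byte << bit_count
        bit_count += 8
        while bit_count >= width:
            yield bit_buffer & mask
            bit_buffer >>= width
            bit_count -= width
```

A real decoder would change `width` on the fly as the table fills up, and would map each code to a byte string through the LZW table instead of yielding the raw number.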

ΤΖΩΤΖΙΟΥ