Let's say there's a.gz, and b.gz.

$ gzip_merge a.gz b.gz -output c.gz

I'd like to have this program. Of course,

$ cat a.gz b.gz > c.gz

doesn't work, because the final DEFLATE block of a.gz has BFINAL set and is followed by the GZIP header of b.gz (see RFC 1951 and RFC 1952). But if you unset BFINAL, throw away the second GZIP header, and walk through the byte boundaries of the second gzip file, you can merge them.

In fact, I thought of writing an open source program for this, but didn't know how to publish it. So I asked Joel to be my program manager. I walked him through my explanation and defense, and he finally understood what I wanted to do, but said he was too busy. :(

Of course, I could write one myself and try to publish it on my own. But I can't do this alone, because the work I do at my day job is the property of my employer.

Are there any volunteers? We could work as programmer (me) and publisher (you), or programmer (you) and publisher (me). All I need is some credit. I once implemented the Universal Decompressor Virtual Machine described in RFC 3320, so I know this is feasible.

OR, you could point me to THAT program. It would be very useful for managing log files, like merging 365 daily gzipped log files into one. ;)

Thanks.

+8  A: 

Of course, cat a.gz b.gz > c.gz doesn't work.

Actually, it works just fine. I just tested it. It's even documented (sort of) in the gzip man page.

   Multiple  compressed  files  can  be concatenated. In this case, gunzip
   will extract all members at once. For example:

         gzip -c file1  > foo.gz
         gzip -c file2 >> foo.gz

   Then

         gunzip -c foo

   is equivalent to

         cat file1 file2
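
The same behaviour can be checked from a script. Here is a minimal sketch in Python, assuming the standard-library gzip module. The payloads and the helper name concat_gzip are made up for illustration:

```python
import gzip

def concat_gzip(*members: bytes) -> bytes:
    """Join complete gzip files byte-for-byte, like `cat a.gz b.gz > c.gz`."""
    return b"".join(members)

# Two independent gzip files, each with its own header and trailer.
a_gz = gzip.compress(b"contents of file1\n")
b_gz = gzip.compress(b"contents of file2\n")
c_gz = concat_gzip(a_gz, b_gz)

# gzip.decompress reads every member in the stream, not just the first.
merged = gzip.decompress(c_gz)
```

No bits are rewritten here. Each member keeps its own header, final block, and trailer, and the reader simply moves on to the next member.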
Glomek
Oh, it works like a charm! Thank you! I suppose it will also work with a Perl implementation like PerlIO::gzip.
yogman
For creating the files, I would expect no problem. For reading them, in the worst case you could use a loop or shell out to zcat/gunzip.
Glomek
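
For a reader that only handles one gzip member at a time, the loop mentioned above might look roughly like this. It is a sketch in Python rather than Perl, assuming the standard-library zlib module and a file small enough to hold in memory. The function name iter_gzip_members is hypothetical:

```python
import gzip
import zlib

def iter_gzip_members(data: bytes):
    """Yield the decompressed payload of each gzip member in turn."""
    while data:
        # Adding 16 to the window size tells zlib to expect gzip framing.
        d = zlib.decompressobj(zlib.MAX_WBITS | 16)
        payload = d.decompress(data) + d.flush()
        yield payload
        # Bytes after the end of this member belong to the next one.
        data = d.unused_data

# Hypothetical log data: two gzip files joined with plain concatenation.
log = gzip.compress(b"day 1\n") + gzip.compress(b"day 2\n")
parts = list(iter_gzip_members(log))
```

Trailing bytes that are not a valid gzip member (padding, for instance) would raise zlib.error in this sketch.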