Hey,

I'm using boost::iostreams::gzip_decompressor with boost::iostreams::filtering_streambuf to read gzip files.

Some of my files have what zcat calls trailing garbage:

% zcat somefile
data
data
data

gzip: somefile: decompression OK, trailing garbage ignored

What I want is for boost gzip to behave the same way.

When I try to decompress the same file with Boost using the following function, I get a gzip_error exception (bad_header):

static int read_gzip(fs::path f, stringstream& s)
{
        ifstream file(f.string().c_str(), ios_base::in | ios_base::binary);
        io::filtering_streambuf<io::input> in;

        in.push(io::gzip_decompressor());
        in.push(file);

        try { 
            io::copy(in, s);
            return 1;
        }
        catch (io::gzip_error& e) {
            fprintf(stderr, "(read_gzip) io::copy exception %s %s (%d)\n", f.string().c_str(), e.what(), e.error());
        }

        return 0;
}

When it throws the exception, the stringstream remains empty.

As a workaround, I can read the data byte by byte using something like this:
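One plausible explanation for bad_header, stated as an assumption rather than something verified against the Boost source: RFC 1952 allows a gzip file to hold several members back to back, so a decompressor that supports this will try to parse whatever follows the first member as another header. A member header starts with the bytes 0x1f 0x8b followed by the compression method 8 (deflate), and trailing garbage normally does not. The helper below is hypothetical (it is not a Boost or zlib function) and only sketches that check:

```cpp
#include <cstddef>

// Hypothetical helper, not part of Boost or zlib.
// Returns true if the buffer starts with the fixed prefix of an
// RFC 1952 gzip member header: ID1 = 0x1f, ID2 = 0x8b, CM = 8 (deflate).
constexpr bool looks_like_gzip_member(const unsigned char* data, std::size_t size)
{
    return size >= 3
        && data[0] == 0x1f
        && data[1] == 0x8b
        && data[2] == 0x08;
}
```

Applied to the bytes that follow the first member, a false result would mean "trailing garbage" rather than "another member".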

static int read_gzip(fs::path f, string& s)
{
        ifstream file(f.string().c_str(), ios_base::in | ios_base::binary);
        io::filtering_streambuf<io::input> in;
        char buf[1];

        in.push(io::gzip_decompressor());
        in.push(file);

        try {
            std::streamsize result;
            // io::read returns the number of bytes read, or -1 at end of stream
            while ((result = io::read(in, buf, 1)) != -1) {
                s.append(buf, static_cast<std::size_t>(result));
            }

            return 1;
        }
        catch (io::gzip_error& e) {
            fprintf(stderr, "(read_gzip) io::read exception %s %s (%d)\n", f.string().c_str(), e.what(), e.error());
        }

        return 0;
} 

But it seems rather inefficient.

What's the proper way to read gzipped files with trailing garbage?

Thanks.

A: 

OK, what finally worked for me was to use the zlib decompressor with a different window_bits value, so that it can decompress gzip.

#include <zlib.h> // for MAX_WBITS

static int read_gzip(fs::path f, stringstream& s)
{
        ifstream file(f.string().c_str(), ios_base::in | ios_base::binary);
        io::filtering_streambuf<io::input> in;

        io::zlib_params p;
        p.window_bits = 16 + MAX_WBITS;

        in.push(io::zlib_decompressor(p));
        in.push(file);

        try { 
            io::copy(in, s);
            return 1;
        }
        catch (io::zlib_error & e) {
            fprintf(stderr, "(read_gzip) io::copy exception %s :: %s (%d)\n", f.string().c_str(), e.what(), e.error());
        }

        return 0;
}

It decompresses both regular gzip files and gzip files with trailing garbage without throwing an exception.
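For context on why 16 + MAX_WBITS selects gzip: assuming Boost passes window_bits through to zlib's inflateInit2 unchanged, the zlib manual describes values 8..15 as a zlib-wrapped stream, -8..-15 as raw deflate, "add 16" as gzip only, and "add 32" as automatic zlib/gzip detection. MAX_WBITS is 15, so 16 + MAX_WBITS is 31. The enum and function names below are hypothetical and only illustrate that mapping:

```cpp
// Hypothetical helper, not part of zlib or Boost: classifies a windowBits
// value the way the zlib manual describes inflateInit2 interpreting it.
enum class Wrapper { Raw, Zlib, Gzip, AutoDetect, Unknown };

// Window size exponent: 8..15, or 0. The manual documents 0 for the zlib
// case ("use the size in the stream header"); accepting 16 + 0 and 32 + 0
// in the same way is an assumption of this sketch.
constexpr bool valid_base(int bits)
{
    return bits == 0 || (bits >= 8 && bits <= 15);
}

constexpr Wrapper wrapper_for_window_bits(int window_bits)
{
    if (window_bits >= -15 && window_bits <= -8)
        return Wrapper::Raw;                       // raw deflate, no header
    if (window_bits >= 0 && window_bits <= 15)
        return valid_base(window_bits) ? Wrapper::Zlib : Wrapper::Unknown;
    if (window_bits >= 16 && window_bits <= 31)    // "add 16": gzip only
        return valid_base(window_bits - 16) ? Wrapper::Gzip : Wrapper::Unknown;
    if (window_bits >= 32 && window_bits <= 47)    // "add 32": auto-detect
        return valid_base(window_bits - 32) ? Wrapper::AutoDetect : Wrapper::Unknown;
    return Wrapper::Unknown;
}
```

With this mapping, the value used above (16 + 15 = 31) falls in the gzip-only range, while 32 + 15 = 47 would accept either a zlib or a gzip header.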

miedwar