Hi, I have a 3 TB file in .gz format. I want to read it line by line in a C++ program without decompressing the whole file first. Can anyone post a simple example of doing this?

A: 

Chilkat (http://www.chilkatsoft.com/) has libraries for reading compressed files from C++, .NET, VB, ... applications.

Patrick
+3  A: 

You will most probably have to use zlib's inflate (the decompression counterpart of deflate); an example is available on their site.

Alternatively, you may have a look at the Boost.Iostreams C++ wrapper.

The example from the Boost page (decompresses data from a file and writes it to standard output):

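#include <fstream>
#include <iostream>
#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filter/zlib.hpp>

int main() 
{
    using namespace std;
    using namespace boost::iostreams;  // filtering_streambuf, input, zlib_decompressor

    ifstream file("hello.z", ios_base::in | ios_base::binary);
    filtering_streambuf<input> in;
    // zlib_decompressor reads zlib-format data; for a .gz file, use
    // gzip_decompressor from <boost/iostreams/filter/gzip.hpp> instead.
    in.push(zlib_decompressor());
    in.push(file);                     // the source is pushed last
    boost::iostreams::copy(in, cout);  // stream the decompressed bytes to stdout
}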
bobah
A: 

You can't do that as stated, because a *.gz file is a compressed byte stream and doesn't have "lines" of its own.

If the compressed data contains newlines, you'll have to decompress it to find them. You don't have to decompress all of the data at once, though: you can do it in chunks and pass strings back to the main program whenever you encounter a newline character. *.gz can be decompressed using zlib.
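
To illustrate the "decompress in chunks, emit a string at every newline" idea, here is a minimal sketch of just the line-splitting half. The class name `LineSplitter` and its methods are made up for this example; the chunks are assumed to come from whatever decompressor you use (zlib's inflate, for instance), which is not shown here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: buffers decompressed chunks and hands back complete lines.
class LineSplitter {
public:
    // Append one chunk and return every line that this chunk completed.
    std::vector<std::string> feed(const char* data, std::size_t size) {
        std::vector<std::string> lines;
        for (std::size_t i = 0; i < size; ++i) {
            if (data[i] == '\n') {
                lines.push_back(pending_);  // the line ends here
                pending_.clear();
            } else {
                pending_.push_back(data[i]);
            }
        }
        return lines;
    }

    // Text seen after the last newline (a final line with no trailing '\n').
    const std::string& remainder() const { return pending_; }

private:
    std::string pending_;  // partial line carried over between chunks
};
```

A line that straddles two chunks is held in `pending_` until its newline arrives, so memory use is bounded by the chunk size plus the longest line rather than by the size of the file. After the last chunk, check `remainder()` for a final unterminated line.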

SigTerm
A: 

The zlib library supports decompressing files in memory in blocks, so you don't have to decompress the entire file in order to process it.

Amnon
+1  A: 

For something that is going to be used regularly, you probably want to use one of the previous suggestions. Alternatively, you can do

gzcat file.gz | yourprogram

and have yourprogram read from cin. This will decompress parts of the file in memory as they are needed, and send the uncompressed output to yourprogram. (On systems without gzcat, zcat or gzip -dc does the same job.)

KeithB