ansaurus

Question

Answer 1

+2 A:

They just compress the data using the zlib or deflate algorithms, but do not produce output in any specific file format. This means that if you store the stream as-is on the hard drive, you most probably will not be able to open it with an application such as gzip or WinRAR, because the file headers (magic number, etc.) are not included in the stream and you would have to write them yourself.

andreasmk2 2008-09-16 08:28:49

Answer 2

A:

I agree with andreas. You probably won't be able to open the file in an external tool, but if that tool expects a stream you might be able to use it. You would also be able to decompress the file using the same compression class.

configurator 2008-09-16 08:31:52

Answer 3

+1 A:

gzip is deflate plus some header and footer data, such as a checksum and a length. So they're not compatible in the sense that one can read a stream produced by the other, but they employ the same compression algorithm.
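To make the framing concrete, here is a minimal sketch (not part of the original answer) that compresses with GZipStream, strips the 10-byte fixed header and the 8-byte trailer (CRC-32 plus ISIZE, per RFC 1952), and inflates what remains with DeflateStream. The class and method names are hypothetical, and the sketch assumes the header carries no optional fields (FLG = 0), which it checks rather than takes for granted.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class GzipFraming
{
    // Compress with GZipStream and return the complete gzip member.
    public static byte[] GzipCompress(byte[] data)
    {
        using (var buffer = new MemoryStream())
        {
            // leaveOpen: true so the buffer can be read after the stream is flushed.
            using (var gz = new GZipStream(buffer, CompressionMode.Compress, true))
            {
                gz.Write(data, 0, data.Length);
            }
            return buffer.ToArray();
        }
    }

    // Strip the fixed header and the trailer, then inflate the raw deflate body.
    // Assumption: FLG == 0, i.e. no file name, comment, or extra field.
    public static byte[] InflateGzipBody(byte[] gzip)
    {
        if (gzip.Length < 18 || gzip[0] != 0x1f || gzip[1] != 0x8b || gzip[2] != 8)
            throw new InvalidDataException("Not a deflate-based gzip member.");
        if (gzip[3] != 0)
            throw new NotSupportedException("Optional header fields are not handled in this sketch.");

        using (var body = new MemoryStream(gzip, 10, gzip.Length - 18))
        using (var inflate = new DeflateStream(body, CompressionMode.Decompress))
        using (var result = new MemoryStream())
        {
            inflate.CopyTo(result);
            return result.ToArray();
        }
    }

    // ISIZE: the last four bytes, little-endian, hold the input length modulo 2^32.
    public static uint ReadISize(byte[] gzip)
    {
        int n = gzip.Length;
        return (uint)gzip[n - 4]
             | ((uint)gzip[n - 3] << 8)
             | ((uint)gzip[n - 2] << 16)
             | ((uint)gzip[n - 1] << 24);
    }
}
```

Going the other way (wrapping DeflateStream output into a gzip file) would also need a CRC-32 of the uncompressed data for the trailer, which this sketch does not compute.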

Lasse V. Karlsen 2008-09-16 08:34:48

Answer 4

+3 A:

From MSDN about System.IO.Compression.GZipStream:

This class represents the gzip data format, which uses an industry standard algorithm for lossless file compression and decompression.

From the zlib FAQ:

The gz* functions in zlib on the other hand use the gzip format.

So zlib and GZipStream should be interoperable, but only if you use the zlib functions that handle the gzip format.

System.IO.Compression.DeflateStream and zlib are reportedly not interoperable.

If you need to handle zip files (you probably don't, but someone else might need this) you need to use SharpZipLib or another third-party library.

Rasmus Faber 2008-09-16 09:18:21

zip files are not the same as zlib-compressed files (the compression algorithms may be the same, but the headers are not)

Ben Collins 2008-10-09 15:26:19

You are right. I will edit my response.

Rasmus Faber 2008-10-12 08:27:20

re: "reportedly not interoperable" regarding zlib and DeflateStream. They are ACTUALLY not interoperable. There are three IETF RFCs covering this space: 1950 for ZLIB, 1951 for DEFLATE, and 1952 for GZIP. Deflate is the compression algorithm. ZLIB and GZIP are distinct formats, which define metadata, aka "headers", that apply to the compressed stream. The zlib library implements both ZLIB and GZIP. To make it interesting, both ZLIB and GZIP can use DEFLATE as the compression mechanism. The DeflateStream class produces a bare, headerless stream. It's no wonder we are all confused.
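One practical consequence is that the first two bytes usually tell you which framing you are looking at. The following is a rough sketch, not part of the original comment; the enum and class names are hypothetical. It relies only on the gzip magic number from RFC 1952 and the CMF/FLG check from RFC 1950. A bare DEFLATE stream has no signature, so it is reported as unknown, and a raw stream could in principle pass the zlib check by coincidence.

```csharp
using System;

public enum CompressedFormat { Unknown, Gzip, Zlib }

public static class FormatSniffer
{
    public static CompressedFormat Identify(byte[] data)
    {
        if (data == null || data.Length < 2)
            return CompressedFormat.Unknown;

        // RFC 1952: every gzip member starts with ID1 = 0x1f, ID2 = 0x8b.
        if (data[0] == 0x1f && data[1] == 0x8b)
            return CompressedFormat.Gzip;

        // RFC 1950: CM (low nibble of CMF) is 8 for deflate, CINFO (high nibble)
        // is at most 7, and CMF * 256 + FLG must be a multiple of 31.
        int cmf = data[0];
        int flg = data[1];
        bool isDeflateMethod = (cmf & 0x0f) == 8;
        bool isValidWindow = (cmf >> 4) <= 7;
        bool passesCheck = ((cmf << 8) | flg) % 31 == 0;
        if (isDeflateMethod && isValidWindow && passesCheck)
            return CompressedFormat.Zlib;

        // Anything else, including a headerless DeflateStream output.
        return CompressedFormat.Unknown;
    }
}
```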

Cheeso 2009-05-16 14:06:36

Answer 5

+5 A:

I've used GZipStream to compress the output from the .NET XmlSerializer, and the result decompressed perfectly well with gunzip (in Cygwin), WinZip, and another GZipStream.

For reference, here's what I did in code:

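// Requires: using System.IO; using System.IO.Compression; using System.Xml.Serialization;
// Both streams are in using blocks so the file is closed even if GZipStream fails to construct.
using (FileStream fs = new FileStream(filename, FileMode.Create, FileAccess.Write))
using (GZipStream gzStream = new GZipStream(fs, CompressionMode.Compress))
{
  XmlSerializer serializer = new XmlSerializer(typeof(MyDataType));
  // The XML is compressed as it is written; disposing gzStream flushes the gzip trailer.
  serializer.Serialize(gzStream, myData);
}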

Then, to decompress in C#:

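// Both streams are in using blocks so the file is closed even if GZipStream fails to construct.
using (FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
using (Stream input = new GZipStream(fs, CompressionMode.Decompress))
{
   XmlSerializer serializer = new XmlSerializer(typeof(MyDataType));
   // The XML is decompressed on the fly as the serializer reads it.
   myData = (MyDataType) serializer.Deserialize(input);
}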

Using the 'file' utility in Cygwin reveals that there is indeed a difference between the same file compressed with GZipStream and with GNU gzip (probably header information, as others have stated in this thread). This difference, however, does not seem to matter in practice.

Isak Savo 2008-09-16 14:20:42

Works like a charm! The big dataset I'm using for performance testing has been compressed from 55MB to just 7.5MB, without noticeable performance loss. P.S. If the "file" is renamed to "file.gz", it becomes a perfectly valid archive file. You can even modify its content using any archive tool, and it will remain deserializable using your method.

Soonts 2010-01-30 15:46:52

Answer 6

+1 A:

DotNetZip includes a DeflateStream, a ZlibStream, and a GZipStream, to handle RFC 1951, 1950, and 1952 respectively. They all use the DEFLATE algorithm, but the framing and header bytes are different for each one.

As an advantage, the streams in DotNetZip do not exhibit the anomaly of expanding data size under compression, reported against the built-in streams. Also, there is no built-in ZlibStream, whereas DotNetZip gives you that, for good interop with zlib.

Cheeso 2009-03-06 16:10:58

Answer 7

A:

Can a gzip stream be stored in an NVARCHAR SQL Server field without loss of information?

John 2009-11-30 10:59:28

Answer 8

+1 A:

I ran into this issue with Git objects. In that particular case, they store the objects as deflated blobs with a Zlib header, which is documented in RFC 1950. You can make a compatible blob by making a file that contains:

1. Two header bytes (CMF and FLG from RFC 1950) with the values 0x78 0x01
   - CM = 8 = deflate
   - CINFO = 7 = 32 KB window
   - FCHECK = 1 = check bits that make the two header bytes, read as a 16-bit big-endian value, a multiple of 31
2. The output of the C# DeflateStream
3. An Adler32 checksum of the input data to the DeflateStream, in big-endian format (MSB first)

I made my own Adler implementation:

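// Incremental Adler-32 checksum as defined in RFC 1950.
public class Adler32Computer
{
    // Unsigned fields: b shifted into the high 16 bits can exceed int.MaxValue.
    private uint a = 1;
    private uint b = 0;

    // Combined checksum: b in the high 16 bits, a in the low 16 bits.
    public uint Checksum
    {
        get
        {
            return (b << 16) | a;
        }
    }

    // Largest prime below 65536.
    private const uint Modulus = 65521;

    // May be called repeatedly to checksum data in chunks.
    public void Update(byte[] data, int offset, int length)
    {
        for (int counter = 0; counter < length; ++counter)
        {
            a = (a + data[offset + counter]) % Modulus;
            b = (b + a) % Modulus;
        }
    }
}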

And that was pretty much it.
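Put together, the three parts might be assembled roughly as follows. This is a self-contained sketch, not the code from the original post: the ZlibWrapper class and its method names are hypothetical, and it carries its own small Adler-32 routine so that it compiles on its own. Note that the header bytes 0x78 0x01 advertise the fastest compression level in FLG, which is informational only and does not affect decoding.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class ZlibWrapper
{
    // Adler-32 over the whole buffer (RFC 1950); unsigned to avoid overflow.
    public static uint Adler32(byte[] data)
    {
        const uint Modulus = 65521;
        uint a = 1, b = 0;
        foreach (byte value in data)
        {
            a = (a + value) % Modulus;
            b = (b + a) % Modulus;
        }
        return (b << 16) | a;
    }

    // Header bytes + raw DeflateStream output + big-endian Adler-32 trailer.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            output.WriteByte(0x78); // CMF: CM = 8 (deflate), CINFO = 7 (32 KB window)
            output.WriteByte(0x01); // FLG: FCHECK = 1, no preset dictionary

            // leaveOpen: true so the trailer can be appended after the deflate data.
            using (var deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(data, 0, data.Length);
            }

            // The checksum covers the uncompressed input, most significant byte first.
            uint checksum = Adler32(data);
            output.WriteByte((byte)((checksum >> 24) & 0xff));
            output.WriteByte((byte)((checksum >> 16) & 0xff));
            output.WriteByte((byte)((checksum >> 8) & 0xff));
            output.WriteByte((byte)(checksum & 0xff));
            return output.ToArray();
        }
    }
}
```

The result should be readable by zlib's inflate or any other RFC 1950 reader; whether it matches a particular consumer such as Git byte for byte depends on the compressor settings, which this sketch leaves at their defaults.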

Blake Ramsdell 2010-02-25 01:46:52


Zlib-compatible compression streams?
