Are System.IO.Compression.GZipStream or System.IO.Compression.DeflateStream compatible with zlib compression?
They just compress the data using the zlib or deflate algorithms, but do not produce output in a specific file format. This means that if you store the stream as-is on the hard drive, you most probably will not be able to open it with an application such as gzip or WinRAR, because the file headers (magic number, etc.) are not included in the stream and you have to write them yourself.
I agree with andreas. You probably won't be able to open the file in an external tool, but if that tool expects a stream you might be able to use it. You would also be able to decompress the file again using the same compression class.
gzip is deflate + some header/footer data, like a checksum and length, etc. So they're not compatible in the sense that one method can use a stream from the other, but they employ the same compression algorithm.
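To make the framing difference concrete, here is a minimal sketch that compresses the same bytes with both built-in classes. The class and method names (`FramingDemo`, `GZip`, `RawDeflate`) are made up for this illustration and are not part of any library.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, for illustration only.
public static class FramingDemo
{
    // Compresses data with GZipStream: deflate data wrapped in a gzip
    // header and trailer (RFC 1952).
    public static byte[] GZip(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gz = new GZipStream(output, CompressionMode.Compress))
            {
                gz.Write(data, 0, data.Length);
            }
            // ToArray still works after the MemoryStream has been closed.
            return output.ToArray();
        }
    }

    // Compresses data with DeflateStream: a raw deflate stream (RFC 1951)
    // with no header or trailer around it.
    public static byte[] RawDeflate(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var deflate = new DeflateStream(output, CompressionMode.Compress))
            {
                deflate.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }
}
```

The result of `FramingDemo.GZip(...)` should start with the gzip magic bytes 0x1F 0x8B followed by the compression method byte 0x08 (deflate), while the result of `FramingDemo.RawDeflate(...)` has no such signature. That is why a gzip reader cannot consume a raw deflate stream directly, even though the compressed payload uses the same algorithm.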
From MSDN about System.IO.Compression.GZipStream:
This class represents the gzip data format, which uses an industry standard algorithm for lossless file compression and decompression.
From the zlib FAQ:
The gz* functions in zlib on the other hand use the gzip format.
So zlib and GZipStream should be interoperable, but only if you use the zlib functions for handling the gzip-format.
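System.IO.Compression.DeflateStream and zlib are reportedly not interoperable, presumably because DeflateStream writes a raw deflate stream (RFC 1951) without the zlib header and Adler-32 trailer (RFC 1950) that zlib expects by default.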
If you need to handle zip files (you probably don't, but someone else might need this) you need to use SharpZipLib or another third-party library.
I've used GZipStream to compress the output from the .NET XmlSerializer, and decompressing the result with gunzip (in Cygwin), WinZip, and another GZipStream has worked perfectly fine.
For reference, here's what I did in code:
FileStream fs = new FileStream(filename, FileMode.Create, FileAccess.Write);
using (GZipStream gzStream = new GZipStream(fs, CompressionMode.Compress))
{
    XmlSerializer serializer = new XmlSerializer(typeof(MyDataType));
    serializer.Serialize(gzStream, myData);
}
Then, to decompress in C#:
FileStream fs = new FileStream(filename, FileMode.Open, FileAccess.Read);
using (Stream input = new GZipStream(fs, CompressionMode.Decompress))
{
    XmlSerializer serializer = new XmlSerializer(typeof(MyDataType));
    myData = (MyDataType) serializer.Deserialize(input);
}
Using the 'file' utility in Cygwin reveals that there is indeed a difference between the same file compressed with GZipStream and with GNU gzip (probably header information, as others have stated in this thread). This difference, however, does not seem to matter in practice.
DotNetZip includes a DeflateStream, a ZlibStream, and a GZipStream, to handle RFC 1951, 1950, and 1952, respectively. They all use the DEFLATE algorithm, but the framing and header bytes are different for each one.
As an advantage, the streams in DotNetZip do not exhibit the anomaly of expanding data size under compression, reported against the built-in streams. Also, there is no built-in ZlibStream, whereas DotNetZip gives you that, for good interop with zlib.
Can a gzip stream be stored in an NVarchar Sql Server field without loss of information?
I ran into this issue with Git objects. In that particular case, they store the objects as deflated blobs with a zlib header, which is documented in RFC 1950. You can make a compatible blob by making a file that contains:
- Two header bytes (CMF and FLG from RFC 1950) with the values 0x78 0x01
  - CM = 8 = deflate
  - CINFO = 7 = 32 KB window
  - FCHECK = 1 = checksum bits for this header
- The output of the C# DeflateStream
- An Adler32 checksum of the input data to the DeflateStream, big-endian format (MSB first)
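I made my own Adler implementation:
public class Adler32Computer
{
    // Running sums defined by Adler-32 (RFC 1950): a starts at 1, b at 0.
    private int a = 1;
    private int b = 0;

    // Combined checksum: b in the high 16 bits, a in the low 16 bits.
    public int Checksum
    {
        get
        {
            // b * 65536 can exceed int.MaxValue; unchecked keeps the
            // correct 32-bit pattern even in a checked build.
            return unchecked((b * 65536) + a);
        }
    }

    // Largest prime smaller than 65536.
    private const int Modulus = 65521;

    // Feeds length bytes of data, starting at offset, into the checksum.
    public void Update(byte[] data, int offset, int length)
    {
        for (int counter = 0; counter < length; ++counter)
        {
            a = (a + data[offset + counter]) % Modulus;
            b = (b + a) % Modulus;
        }
    }
}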
And that was pretty much it.
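Putting the three parts together could look roughly like the sketch below. The names `ZlibFraming`, `Wrap`, and `Unwrap` are hypothetical, the Adler-32 routine is a compact restatement that returns a `uint`, and `Unwrap` assumes a blob produced by `Wrap` and does not verify the checksum.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, for illustration only.
public static class ZlibFraming
{
    // Adler-32 checksum as described in RFC 1950.
    public static uint Adler32(byte[] data)
    {
        const uint Modulus = 65521;
        uint a = 1, b = 0;
        foreach (byte value in data)
        {
            a = (a + value) % Modulus;
            b = (b + a) % Modulus;
        }
        return (b << 16) | a;
    }

    // Builds: 2-byte zlib header + raw deflate data + big-endian Adler-32.
    public static byte[] Wrap(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            output.WriteByte(0x78); // CMF: CM = 8 (deflate), CINFO = 7
            output.WriteByte(0x01); // FLG: FCHECK = 1, no preset dictionary

            // leaveOpen = true so the trailer can be appended afterwards.
            using (var deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(data, 0, data.Length);
            }

            // Checksum of the uncompressed input, most significant byte first.
            uint checksum = Adler32(data);
            output.WriteByte((byte)((checksum >> 24) & 0xFF));
            output.WriteByte((byte)((checksum >> 16) & 0xFF));
            output.WriteByte((byte)((checksum >> 8) & 0xFF));
            output.WriteByte((byte)(checksum & 0xFF));
            return output.ToArray();
        }
    }

    // Reverses Wrap: skips the 2 header bytes and ignores the 4 trailer bytes.
    public static byte[] Unwrap(byte[] blob)
    {
        using (var input = new MemoryStream(blob, 2, blob.Length - 6))
        using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            deflate.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

As a sanity check, the Adler-32 of the ASCII string "Wikipedia" is 0x11E60398, so `ZlibFraming.Adler32(Encoding.ASCII.GetBytes("Wikipedia"))` should return that value. The header pair 0x78 0x01 is valid because 0x7801 is a multiple of 31, which is what the FCHECK bits are there to guarantee.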