First, let's define some commonly confused terms:

deflate = compression_algorithm;
zlib = header + deflate + trailer;
gzip = header + deflate + trailer;

I'm looking for a library that will basically let me do the following:

if(method == "gzip"){
    Response.Filter = new CompressionLibrary.OutputStream(Response.Filter, CompressionLibrary.Formats.GZIP);
}
else if(method == "deflate"){
    Response.Filter = new CompressionLibrary.OutputStream(Response.Filter, CompressionLibrary.Formats.DEFLATE);
}
else if(method == "zlib"){
    Response.Filter = new CompressionLibrary.OutputStream(Response.Filter, CompressionLibrary.Formats.ZLIB);
}

I'm looking for a way to test the three compression formats comparably for use on the web. I would like the deflate compression algorithm behind each format to be the exact same implementation. I've already hacked away at zlib.net to force it to give me raw deflate on command (via an "undocumented feature"); however, adding the gzip header and trailer is a little out of my league.
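For illustration, the selection logic in the pseudocode above can be sketched with the built-in `System.IO.Compression` streams. This is only a sketch under assumptions: it needs .NET 6 or later (where `ZLibStream` exists), which is newer than the runtime this question targets, and `CompressionFilter` and `Wrap` are hypothetical names. Whether the three streams really share one deflate implementation is something to verify on your runtime, not something this code guarantees.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: picks a compressing stream by format name.
public static class CompressionFilter
{
    public static Stream Wrap(Stream output, string method)
    {
        switch (method)
        {
            case "gzip":
                // gzip header + deflate + CRC-32/length trailer
                return new GZipStream(output, CompressionLevel.Optimal);
            case "zlib":
                // zlib header + deflate + Adler-32 trailer (.NET 6+ only)
                return new ZLibStream(output, CompressionLevel.Optimal);
            case "deflate":
                // raw deflate, no header or trailer
                return new DeflateStream(output, CompressionLevel.Optimal);
            default:
                throw new ArgumentException("Unknown compression method: " + method, nameof(method));
        }
    }
}
```

With something like this, the three branches collapse to a single assignment along the lines of `Response.Filter = CompressionFilter.Wrap(Response.Filter, method);`.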

Anyone know of a .net library that does this?


Clarification:

HTTP 1.1's "deflate" compression format is actually the zlib compression format. Zlib is a wrapper around deflate; it always has a 2-byte header and a 4-byte trailer (when the compression methods and levels are identical).
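To make the wrapper concrete, here is a minimal sketch that builds a zlib stream by hand from raw deflate output, following RFC 1950: a 2-byte header, the deflate data, then a big-endian Adler-32 of the uncompressed input. `ZlibWrapper` and its methods are hypothetical names, the header bytes `0x78 0x9C` assume a 32 KiB window with no preset dictionary, and the raw deflate here comes from the built-in `DeflateStream` rather than from zlib.net.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: wraps raw deflate data in the zlib container (RFC 1950).
public static class ZlibWrapper
{
    // Adler-32 checksum of the uncompressed data.
    public static uint Adler32(byte[] data)
    {
        const uint Mod = 65521;
        uint a = 1, b = 0;
        foreach (byte x in data)
        {
            a = (a + x) % Mod;
            b = (b + a) % Mod;
        }
        return (b << 16) | a;
    }

    public static byte[] Wrap(byte[] rawDeflate, byte[] original)
    {
        var result = new byte[2 + rawDeflate.Length + 4];
        result[0] = 0x78; // CMF: method 8 (deflate), 32 KiB window
        result[1] = 0x9C; // FLG: no preset dictionary; 0x789C is a multiple of 31
        Buffer.BlockCopy(rawDeflate, 0, result, 2, rawDeflate.Length);

        // Trailer: Adler-32 of the original data, most significant byte first.
        uint adler = Adler32(original);
        int t = result.Length - 4;
        result[t] = (byte)(adler >> 24);
        result[t + 1] = (byte)(adler >> 16);
        result[t + 2] = (byte)(adler >> 8);
        result[t + 3] = (byte)adler;
        return result;
    }

    public static byte[] Compress(byte[] data)
    {
        var buffer = new MemoryStream();
        using (var deflate = new DeflateStream(buffer, CompressionLevel.Optimal))
        {
            deflate.Write(data, 0, data.Length);
        }
        // ToArray still works after the MemoryStream has been closed.
        return Wrap(buffer.ToArray(), data);
    }
}
```

The output of `ZlibWrapper.Compress` is always exactly 6 bytes longer than the raw deflate data it contains.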

Gzip uses the same compressed data format internally as zlib, namely deflate (raw deflate, not HTTP 1.1 deflate, which is zlib). In my own preliminary testing, the gzipped output was larger than the zlib output in 11 out of 12 cases.
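The gzip container can be sketched the same way, following RFC 1952: a 10-byte minimal header, the raw deflate data, then an 8-byte trailer holding the CRC-32 and the length of the uncompressed input, both little-endian. `GzipWrapper` and its methods are hypothetical names; the header below sets no optional fields (no file name, no timestamp) and marks the OS as unknown, and the bitwise CRC-32 is written for clarity, not speed.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: wraps raw deflate data in the gzip container (RFC 1952).
public static class GzipWrapper
{
    // Bitwise CRC-32 (reflected polynomial 0xEDB88320), as used by gzip.
    public static uint Crc32(byte[] data)
    {
        uint crc = 0xFFFFFFFFu;
        foreach (byte x in data)
        {
            crc ^= x;
            for (int i = 0; i < 8; i++)
            {
                crc = (crc & 1) != 0 ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
            }
        }
        return ~crc;
    }

    public static byte[] Wrap(byte[] rawDeflate, byte[] original)
    {
        // ID1, ID2, CM = 8 (deflate), FLG = 0, MTIME = 0 (4 bytes), XFL = 0, OS = 255 (unknown)
        byte[] header = { 0x1F, 0x8B, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xFF };
        var result = new byte[header.Length + rawDeflate.Length + 8];
        Buffer.BlockCopy(header, 0, result, 0, header.Length);
        Buffer.BlockCopy(rawDeflate, 0, result, header.Length, rawDeflate.Length);

        // Trailer: CRC-32, then uncompressed length, least significant byte first.
        uint crc = Crc32(original);
        uint size = (uint)original.Length;
        int t = header.Length + rawDeflate.Length;
        for (int i = 0; i < 4; i++)
        {
            result[t + i] = (byte)(crc >> (8 * i));
            result[t + 4 + i] = (byte)(size >> (8 * i));
        }
        return result;
    }

    public static byte[] Compress(byte[] data)
    {
        var buffer = new MemoryStream();
        using (var deflate = new DeflateStream(buffer, CompressionLevel.Optimal))
        {
            deflate.Write(data, 0, data.Length);
        }
        return Wrap(buffer.ToArray(), data);
    }
}
```

With this minimal header, the gzip output is 18 bytes longer than the raw deflate data, against 6 bytes for zlib, so for the same deflate output gzip comes out 12 bytes larger than zlib.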

Deflate is a compression algorithm. When there is no wrapper (e.g., a header or trailer) around deflated data, I call it "deflate"; perhaps I should have called it "raw deflate" instead.

I am analyzing these compression formats and their support within web browsers, and I need to use a single deflate implementation for all three formats.

+1  A: 

Based on my reading of the standards documents and the work I've done with zlib, the .NET gzip and deflate implementations, and several other compression packages for .NET, I've determined:

1) "Raw deflate" is always smaller than what you call "HTTP 1.1 deflate", which is always smaller than gzip, assuming that you used the same library to generate all three. That is, for any particular compression library, deflate < zlib < gzip.

2) The differences in size are very small. The difference between deflate and zlib is usually just a few bytes. The difference between deflate and gzip is, at most, a few dozen bytes. This is true regardless of the file size.

3) Different deflate implementations have widely varying compression ratios and execution times. The zlib implementation, for example, gives better compression and faster execution than the .NET 3.5 implementation.

4) Interoperability between the different implementations is almost 100%. That is, a deflate (or gzip) file created by one library can be decompressed by any other library. I have heard of cases where this is not true, but I was unable to construct one.

5) It takes significantly longer to create gzip than it does zlib, because of the CRC calculation: gzip computes a CRC-32 over the data, whereas zlib computes the cheaper Adler-32.
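Points 1 and 2 are easy to check for yourself. The sketch below compresses one input in all three formats and returns the sizes. It assumes .NET 6 or later for `ZLibStream`; `FormatOverhead`, `CompressedSize`, and `Measure` are hypothetical names. By the format specifications, zlib adds 6 bytes to the deflate data (with no preset dictionary) and gzip adds at least 18 bytes (more if optional header fields such as a file name are present).

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: measures compressed size per container format.
public static class FormatOverhead
{
    public static int CompressedSize(byte[] data, Func<Stream, Stream> makeCompressor)
    {
        var buffer = new MemoryStream();
        using (Stream compressor = makeCompressor(buffer))
        {
            compressor.Write(data, 0, data.Length);
        }
        // Disposing the compressor flushes it and closes the buffer;
        // ToArray is still valid on a closed MemoryStream.
        return buffer.ToArray().Length;
    }

    public static (int Deflate, int Zlib, int Gzip) Measure(byte[] data)
    {
        return (
            CompressedSize(data, s => new DeflateStream(s, CompressionLevel.Optimal)),
            CompressedSize(data, s => new ZLibStream(s, CompressionLevel.Optimal)),
            CompressedSize(data, s => new GZipStream(s, CompressionLevel.Optimal)));
    }
}
```

Calling `FormatOverhead.Measure(bytes)` on inputs of very different sizes should show the gaps between the three numbers staying constant while the numbers themselves grow.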

I do not know of a C# library that allows you to generate a zlib or gzip file, given the raw deflate data, but you should be able to construct them fairly easily if you study the standards documents.

I also do not know of any browser that supports "raw deflate". But then, I can't say that I've actually tried it. I've always used the "HTTP 1.1 deflate".

Jim Mischel
Thanks, this confirms everything I've been trying to promote. And you should actually use raw deflate, not HTTP 1.1 deflate (unless you dislike IE users so much that you won't send them inflatable data). :-) See [this answer](http://stackoverflow.com/questions/1574168/deflate-compression-browser-compatibility-and-advantages-over-gzip) for details.
David Murdoch