You might know that HTML-related file formats are compressed with gzip on the server side (by mod_gzip on Apache servers, for example) and decompressed by compatible browsers. This is known as "content encoding".

Does this only work for HTML/XML files? Let's say my PHP/Perl file generates some simple comma-delimited data and sends it to the browser: will it be encoded by default?

What about platforms like Silverlight or Flash: when they download such data, will it be compressed/decompressed by the browser/runtime automatically? Is there any way to test this?

+6  A: 

Does this only work for HTML/XML files?

No: it is quite often used for CSS and JS files, for instance. Apart from images, those are among the biggest things websites are made of, and with JS frameworks and full-JS applications, compressing them is a huge gain!

Actually, any text-based format compresses quite well (images, by contrast, do not, as they are generally already compressed); sometimes JSON data returned from Ajax requests is compressed too -- it's text data, after all ;-)
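To get a feel for the difference, here is a minimal Python sketch that compares how well gzip handles text versus already-compressed data. The CSV payload is made up for illustration, and random bytes are only a stand-in for image data (an assumption: real JPEG/PNG content behaves similarly because it is already compressed).

```python
import gzip
import os

def gzip_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(gzip.compress(data)) / len(data)

# Hypothetical comma-delimited payload, similar to what a script might emit.
csv_text = "".join(
    f"{i},user{i},active,2009-08-01\n" for i in range(2000)
).encode("ascii")

# Random bytes stand in for already-compressed content such as images.
random_blob = os.urandom(len(csv_text))

print(f"CSV text:     {gzip_ratio(csv_text):.2f}")
print(f"random bytes: {gzip_ratio(random_blob):.2f}")
```

The text payload should shrink to a fraction of its size, while the random blob stays at roughly its original size or grows slightly.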

Let's say my PHP/Perl file generates some simple comma-delimited data and sends it to the browser: will it be encoded by default?

It's a matter of configuration: if you configured your server to compress that kind of content, it'll be compressed, provided the browser says it accepts gzip-encoded data :-)


Here's a sample configuration for Apache 2 (using mod_deflate) that I use on my blog:

<IfModule mod_deflate.c>
    AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css text/javascript application/javascript application/x-javascript application/xml
</IfModule>

Here, I want HTML/XML/CSS/JS to be compressed.

And here is roughly the same thing, give or take a few configuration options, that I once used under Apache 1 (mod_gzip):

<IfModule mod_gzip.c>
    mod_gzip_on                   Yes
    mod_gzip_can_negotiate        Yes

    mod_gzip_minimum_file_size    256
    mod_gzip_maximum_file_size    500000

    mod_gzip_dechunk              Yes

    mod_gzip_item_include         file       \.css$
    mod_gzip_item_include         file       \.html$
    mod_gzip_item_include         file       \.txt$
    mod_gzip_item_include         file       \.js$
    mod_gzip_item_include         mime       text/html

    mod_gzip_item_exclude         mime       ^image/
</IfModule>

Two things to notice here: I don't want files that are too small (the gain would be negligible) or too big (compressing them would eat too much CPU) to be compressed; and I want css/html/txt/js files to be compressed, but not images.
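The reason for the lower bound is easy to check: gzip adds a fixed header and trailer, so a very small payload can come out larger than it went in. Below is a rough Python sketch of the same size policy. The `should_compress` helper is hypothetical, the two limits simply mirror the mod_gzip values above, and treating the bounds as inclusive is my assumption, not something taken from mod_gzip's documentation.

```python
import gzip

MIN_SIZE = 256       # mirrors mod_gzip_minimum_file_size above
MAX_SIZE = 500_000   # mirrors mod_gzip_maximum_file_size above

def should_compress(body: bytes) -> bool:
    """Hypothetical policy: compress only bodies within the size limits."""
    return MIN_SIZE <= len(body) <= MAX_SIZE

# A tiny comma-delimited response: the gzip framing outweighs any savings.
tiny = b"id,name\n1,a\n"
print(len(tiny), len(gzip.compress(tiny)))
print(should_compress(tiny))
```

For a body this small, the compressed length printed second is the larger of the two numbers, which is exactly the case the minimum size is meant to skip.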


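If you want your comma-separated data to be compressed the same way, you'll have to add either its content type or its extension to the configuration of your web server to activate gzip compression for it (for example, adding text/csv to the AddOutputFilterByType line above, if that is the content type your script sends).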

Is there any way to test this?

For any content returned directly to the browser, the Firefox extensions Firebug and LiveHTTPHeaders are must-haves: look for a Content-Encoding: gzip header in the response.

For content that doesn't go through the browser's standard communication channel, it might be harder; in the end, you may have to use something like Wireshark to "sniff" what is really going through the pipes... Good luck with that!
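You can also test it from a script by sending the request yourself, once with and once without an Accept-Encoding header, and looking at the Content-Encoding header that comes back. The Python sketch below is self-contained: it starts a throwaway local server (a hypothetical stand-in for your real one, with a deliberately naive substring check on the header) and queries it. Against a real server you would only need the `fetch` part, pointed at your own URL.

```python
import gzip
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

CSV_BODY = b"id,name\n" + b"".join(b"%d,user%d\n" % (i, i) for i in range(500))

class CsvHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint that gzips CSV only when the client asks for it."""

    def do_GET(self):
        body = CSV_BODY
        # Naive check, good enough for the demo; it ignores q-values.
        accepts_gzip = "gzip" in self.headers.get("Accept-Encoding", "")
        if accepts_gzip:
            body = gzip.compress(body)
        self.send_response(200)
        self.send_header("Content-Type", "text/csv")
        if accepts_gzip:
            self.send_header("Content-Encoding", "gzip")
        self.send_header("Vary", "Accept-Encoding")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

# Bypass any system proxy so the request really reaches the local server.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

def fetch(url, accept_encoding=None):
    """Return (Content-Encoding header, raw bytes as received)."""
    request = urllib.request.Request(url)
    if accept_encoding:
        request.add_header("Accept-Encoding", accept_encoding)
    with opener.open(request) as response:
        return response.headers.get("Content-Encoding"), response.read()

server = HTTPServer(("127.0.0.1", 0), CsvHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/data.csv"

plain_encoding, plain_body = fetch(url)
gzip_encoding, gzip_body = fetch(url, "gzip")
server.shutdown()
server.server_close()

print(plain_encoding, len(plain_body))
print(gzip_encoding, len(gzip_body))
```

urllib does not decompress responses on its own, so the byte counts printed are what actually travelled over the connection. A plug-in runtime that does its own HTTP could be checked the same way, by comparing what it sends and receives with these two cases.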

What about platforms like Silverlight or Flash, when they download such data will it be compressed/decompressed by the browser/runtime automatically?

To answer your question about Silverlight and Flash, if they send an Accept-Encoding header indicating they support compressed content, Apache will use mod_deflate or mod_gzip. If they don’t support compression they won’t send the header. It will “just work.” – Nate

Pascal MARTIN
Just to clarify for future readers: Any type of HTTP data can be compressed, period. It's just a bad idea for data that is already compressed, such as images.
Sean Reilly
@Sean > exactly; I wonder if one would gain anything by re-compressing images, btw... Never dared testing it ^^ (I guess the gain would be really minimal, if not null, and it would eat quite some CPU for almost nothing...)
Pascal MARTIN
Well, I have seen uncompressed BMP images served up...
Tim Sylvester
Ergh :-( Seeing bmp on the net is so sad :-( (but still so funny, seeing those load slooooowwwly... )
Pascal MARTIN
Would someone answer my question about platforms like Silverlight or Flash???
Jenko
To answer your question about Silverlight and Flash, if they send an Accept-Encoding header indicating they support compressed content, Apache will use mod_deflate or mod_gzip. If they don’t support compression they won’t send the header. It will “just work.”
Nate
+4  A: 

I actually think Apache’s mod_deflate is more common than mod_gzip, simply because it’s built-in and does the same thing. Look at the documentation for mod_deflate (linked above) and you’ll see that it’s easy to specify which file types to compress, based on their MIME types. Generally it’s worth compressing HTML, CSS, XML and JavaScript. Images are already compressed, so they don’t benefit from compression.

Nate
+4  A: 

The browser sends an "Accept-Encoding" header with the types of compression that it knows how to understand. The server looks at this, along with the user-agent and decides how to encode the result. Some browsers lie about what they can understand, so this is more complex than just searching for "deflate" in the header.
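As a rough illustration of the header side of that decision, here is a Python sketch that parses an Accept-Encoding value, including q-values, and reports whether a gzip response is acceptable. The function name is made up, and it deliberately ignores the user-agent checks mentioned above, so treat it as a sketch rather than what Apache actually does.

```python
def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the Accept-Encoding value allows a gzip response."""
    explicit = None   # q-value given for "gzip", if any
    wildcard = None   # q-value given for "*", if any
    for item in accept_encoding.split(","):
        parts = [p.strip() for p in item.split(";")]
        coding = parts[0].lower()
        if not coding:
            continue
        q = 1.0  # a coding listed without a weight defaults to q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # treat an unparsable weight as "not acceptable"
        if coding in ("gzip", "x-gzip"):
            explicit = q
        elif coding == "*":
            wildcard = q
    # An explicit gzip entry wins over the wildcard; q=0 means "refuse".
    if explicit is not None:
        return explicit > 0
    if wildcard is not None:
        return wildcard > 0
    return False

print(accepts_gzip("gzip, deflate"))
print(accepts_gzip("gzip;q=0, deflate"))
```

The second call shows why a plain substring search is not enough: the header mentions gzip, but with q=0 the client is explicitly refusing it.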

Technically, any HTTP 2xx response with content can be content-encoded using any of the valid content encodings (gzip, deflate, compress, etc.), but in practice it's wasteful to apply compression to common image types: they are already compressed, so the result is barely smaller and can even come out slightly larger.

You can definitely compress the response from dynamic PHP pages. The simplest method is to add:

<?php ob_start("ob_gzhandler"); ?>

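to the start of every PHP page. It's better to set it up through the PHP configuration, of course (for instance with the zlib.output_compression setting, used instead of ob_gzhandler rather than together with it).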

There are many test pages, easily found with Google:

Tim Sylvester