I'm writing a script that needs to determine whether a page is compressed. I've done some research but can't figure out how to detect this. I'd assume a compressed page has something in its headers saying so, like Content-Type or something similar.

Any help is appreciated.

A: 

A compressed page will have a Content-Encoding header naming the compression algorithm.

For example:

Content-Encoding: gzip

Māris Kiseļovs
A: 

It's actually Content-Encoding (header names are case-insensitive). Depending on the type of compression, its value may be gzip (or x-gzip), deflate, or compress.

To cite Wikipedia:

The “Content-Encoding”/"Accept-Encoding" and "Transfer-Encoding"/"TE" headers in HTTP/1.1 allow clients to optionally receive compressed HTTP responses and (less commonly) to send compressed requests. The specification for HTTP/1.1 (RFC 2616) specifies three compression methods: “gzip” (RFC 1952; the content wrapped in a gzip stream), “deflate” (RFC 1950; the content wrapped in a zlib-formatted stream), and "compress" (explained in RFC 2616 section 3.5 as 'The encoding format produced by the common UNIX file compression program "compress". This format is an adaptive Lempel-Ziv-Welch coding (LZW).'). Many client libraries, browsers, and server platforms (including Apache and Microsoft IIS) support gzip.

Artefacto
A: 

Make an HTTP request that sends Accept-Encoding: gzip, then inspect the response headers and look for Content-Encoding: gzip.
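A minimal sketch of that check in Python, using only the standard library. The function names (`fetch_headers`, `content_codings`, `is_compressed`) and the set of codings treated as "compressed" are assumptions made for illustration, not part of any existing script.

```python
import urllib.request

# Codings treated as "compressed" here; this set is an assumption
# based on the values mentioned in this thread.
COMPRESSED_CODINGS = {"gzip", "x-gzip", "deflate", "compress"}


def content_codings(headers):
    """Return the codings listed in a Content-Encoding header, lowercased."""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower() == "content-encoding":
            return [c.strip().lower() for c in value.split(",") if c.strip()]
    return []


def is_compressed(headers):
    """True if any listed content coding is a known compression method."""
    return any(c in COMPRESSED_CODINGS for c in content_codings(headers))


def fetch_headers(url):
    """Request url while advertising gzip/deflate support; return response headers."""
    request = urllib.request.Request(
        url, headers={"Accept-Encoding": "gzip, deflate"}
    )
    # urllib does not decompress the body itself, so Content-Encoding
    # is passed through as the server sent it.
    with urllib.request.urlopen(request) as response:
        return dict(response.headers.items())
```

Calling `is_compressed(fetch_headers(url))` with the URL of the page you want to check should then tell you whether the server compressed that response.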

killer_PL
A: 

It's the web browser that can tell whether the page is compressed or not. On the server side, Apache looks for Accept-Encoding: gzip,deflate in the HTTP request headers. If it is present, Apache compresses the HTML response generated by the PHP script accordingly.

Ref: http://www.websiteoptimization.com/speed/tweak/compress/
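The server-side negotiation described in this answer can be sketched roughly as follows. This is a simplified illustration in Python, not how Apache actually implements it; `accepted_codings` and `compress_response` are hypothetical helper names, and preferring gzip over deflate is an assumption.

```python
import gzip
import zlib


def accepted_codings(accept_encoding):
    """Parse an Accept-Encoding value into the set of codings the client accepts."""
    accepted = set()
    for item in accept_encoding.split(","):
        parts = [p.strip() for p in item.split(";")]
        coding = parts[0].lower()
        if not coding:
            continue
        q = 1.0  # default quality when no q= parameter is given
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        if q > 0:
            accepted.add(coding)
    return accepted


def compress_response(body, accept_encoding):
    """Return (body, content_encoding); content_encoding is None if uncompressed."""
    accepted = accepted_codings(accept_encoding or "")
    if "gzip" in accepted:
        return gzip.compress(body), "gzip"
    if "deflate" in accepted:
        # HTTP "deflate" is a zlib-wrapped stream, which zlib.compress produces.
        return zlib.compress(body), "deflate"
    return body, None
```

The second element of the returned tuple is what the server would send as the Content-Encoding response header, which is the value the client then checks.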

Ankit Jain