This is the script I have written for gzipping content on my site; it lives in 'gzip.php'. On pages where I want to enable gzipping, I include the file at the top and call the output function at the bottom, like this:

print_gzipped_page('javascript')

If the file is a CSS file I pass 'css' as the $type argument, and if it's a PHP file I call the function without any arguments. The script works fine in all browsers except Opera, which reports that it could not decode the page due to damaged data. Can anyone tell me what I have done wrong?

<?php
function print_gzipped_page($type = false) {
    if(headers_sent()){
        $encoding = false;
    }
    elseif( strpos($_SERVER['HTTP_ACCEPT_ENCODING'], 'x-gzip') !== false ){
        $encoding = 'x-gzip';
    }
    elseif( strpos($_SERVER['HTTP_ACCEPT_ENCODING'],'gzip') !== false ){
        $encoding = 'gzip';
    }
    else{
        $encoding = false;
    }
    if ($type!=false) {
        $type_header_array = array("css" => "Content-Type: text/css", "javascript" => "Content-Type: application/x-javascript");
        $type_header = $type_header_array[$type];
    }

    $contents = ob_get_contents();
    ob_end_clean();
    $etag = '"' .  md5($contents) . '"';
    $etag_header = 'Etag: ' . $etag;
    header($etag_header);

    if ($type!=false) {
        header($type_header);
    }

    if (isset($_SERVER['HTTP_IF_NONE_MATCH']) and $_SERVER['HTTP_IF_NONE_MATCH']==$etag) {
        header("HTTP/1.1 304 Not Modified");
        exit();
    }

    if($encoding){
        header('Content-Encoding: '.$encoding);
        print("\x1f\x8b\x08\x00\x00\x00\x00\x00");
        $size = strlen($contents);
        $contents = gzcompress($contents, 9);
        $contents = substr($contents, 0, $size);
    }

    echo $contents;
    exit();
}

ob_start();
ob_implicit_flush(0);
?>

Additional info: The script works if the length of the document being compressed is only 10-15 characters.

Thanks for the help, corrected version:

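<?php
// Outputs the buffered page, gzip-encoded when the client accepts it.
function print_gzipped_page($type = false) {
    // Treat a missing Accept-Encoding header as "no compression supported".
    $accept = isset($_SERVER['HTTP_ACCEPT_ENCODING']) ? $_SERVER['HTTP_ACCEPT_ENCODING'] : '';
    if(headers_sent()){
        $encoding = false;
    }
    elseif( strpos($accept, 'x-gzip') !== false ){
        $encoding = 'x-gzip';
    }
    elseif( strpos($accept, 'gzip') !== false ){
        $encoding = 'gzip';
    }
    else{
        $encoding = false;
    }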
    if ($type!=false) {
        $type_header_array = array("css" => "Content-Type: text/css", "javascript" => "Content-Type: application/x-javascript");
        $type_header = $type_header_array[$type];
        header($type_header);
    }

    $contents = ob_get_contents();
    ob_end_clean();


    if($encoding){
        header('Content-Encoding: ' . $encoding);
        $contents = gzencode($contents, 9);
    }

    $etag = '"' .  md5($contents) . '"';
    $etag_header = 'Etag: ' . $etag;
    header($etag_header);

    if (isset($_SERVER['HTTP_IF_NONE_MATCH']) and $_SERVER['HTTP_IF_NONE_MATCH']==$etag) {
        header("HTTP/1.1 304 Not Modified");
        exit();
    }

    $length = strlen($contents);
    header('Content-Length: ' . $length);
    echo $contents;
    exit();
}

ob_start();
ob_implicit_flush(0);
?>
+1  A: 

Two things stand out:

1) You don't seem to be setting the Content-Length header to the size of the compressed data. (Maybe I've overlooked it.) If you don't set it, a browser might think you've finished sending data too early.

2) You are doing a substr() on the compressed $contents using the uncompressed $size. Some browsers will stop decompressing when the internal structure has an EOF marker, but other browsers (Opera?) may attempt to decompress the entire downloaded buffer. That would definitely give you a 'damaged data' error. You might not be seeing this problem with small buffers because the amount of overhead and the amount of compression might exactly match.

bgiles
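The two points above can be sketched roughly as follows. This is only a minimal illustration, assuming the zlib extension is loaded; `build_gzip_response` is a hypothetical helper name that does not appear in the original script. It keeps the full `gzencode()` output untruncated and reports the compressed length that would go into the Content-Length header.

```php
<?php
// Hypothetical helper (not from the original script): compress a page body
// and report the length of the *compressed* data for Content-Length.
function build_gzip_response($contents) {
    // gzencode() emits a complete gzip stream: header, deflate data and trailer.
    $body = gzencode($contents, 9);
    return array(
        'body'   => $body,          // send as-is, never substr() it
        'length' => strlen($body),  // value for the Content-Length header
    );
}

$page = str_repeat('<p>Hello, Opera!</p>', 50);
$response = build_gzip_response($page);

// Decoding the complete stream gives back the original bytes.
var_dump(gzdecode($response['body']) === $page);
```

In a real page the two values would be sent with `header('Content-Length: ' . $response['length'])` followed by `echo $response['body']`.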
Thank you, problem solved. I removed the substr() on the compressed data and set a Content-Length header with the length of the compressed document.
RadiantHeart
A: 

Hmm, that's strange. In my case, if I use the above code, the last HTML tag is missing when I "view source" on an HTML page. This happens in Opera, Firefox, Safari and so on. Any idea?

kingbullet
Well, you should not use the code posted above, since it has several flaws; that's the reason I posted it here. First of all, it performs a substr() on the zipped content based on the length of the content before zipping. I have also found that it is better to use gzencode() than gzcompress(), and there is no Content-Length header being sent. I have updated my post with a corrected version.
RadiantHeart
A: 

Thanks for the help :) The code works perfectly. I have a question: how do you use this code with JavaScript or CSS files? I mean, how do you gzip js/css files?

kingbullet
Two options, I think. If you have the necessary access to the server, you can configure Apache to parse PHP in .js and .css files as it does with .php files. If you don't, you can rename a CSS file to a .php file and just send header("Content-Type: text/css") at the top of it. The same goes for JavaScript files: add header("Content-Type: application/x-javascript") at the top.
RadiantHeart
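The rename approach described above might look something like this. It is a sketch only: `content_type_header_for` and the file name `style.php` are hypothetical, and the header values are simply the ones used in the script from the question.

```php
<?php
// Hypothetical helper: map a short type name to the Content-Type header line
// that a renamed .css/.js file (now .php) would send before any output.
function content_type_header_for($type) {
    $headers = array(
        'css'        => 'Content-Type: text/css',
        'javascript' => 'Content-Type: application/x-javascript',
    );
    return isset($headers[$type]) ? $headers[$type] : null;
}

// At the top of e.g. style.php (hypothetical file name), before any CSS is echoed:
$line = content_type_header_for('css');
if ($line !== null && !headers_sent()) {
    header($line);
}
```

The stylesheet rules would then follow as plain output after the closing PHP tag, and the page that links to it would reference `style.php` instead of `style.css`.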
+3  A: 

This approach is a bit too clumsy. Rather, make use of ob_gzhandler. It will automatically gzip the content if the client supports it and set the necessary headers.

ob_start('ob_gzhandler');
readfile($path);
BalusC
Exactly... why reinvent the wheel?
R. Bemrose
Having said that, do you need `ob_end_flush()` at the end, or does it call that automatically?
R. Bemrose
@OMG: The PHP doc is indeed not explicit about this, but in my case it does the flush and close in all circumstances, so I take it that it does that automatically.
BalusC
@BalusC: Why is it clumsy? The reason I did not want to use ob_gzhandler is that I wanted to use gzipping while still being able to send a 304 Not Modified header. I wanted to buffer the contents, make a hash of them and only output them if they have changed. You can't do that with ob_gzhandler; it creates a bug with Firefox. Read this: http://www.php.net/manual/en/function.ob-gzhandler.php#97385. I asked a question about what was wrong with my code; if you have suggestions I would appreciate them.
RadiantHeart
I would just check the `ETag` (usually composed of the file name, size and last-modified timestamp) for a 304.
BalusC
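For a static file, an ETag of the kind described in this comment could be derived along these lines. This is a sketch under assumptions: `file_etag` is a hypothetical name, and hashing the three values with md5() is just one way to combine them.

```php
<?php
// Hypothetical helper: build an ETag for a static file from its
// name, size and last-modified timestamp.
function file_etag($path) {
    if (!is_file($path)) {
        return null; // nothing to tag
    }
    // Drop cached stat results so size/mtime reflect the file as it is now.
    clearstatcache(true, $path);
    $size  = filesize($path);
    $mtime = filemtime($path);
    return '"' . md5(basename($path) . '|' . $size . '|' . $mtime) . '"';
}
```

The resulting value would be compared against `$_SERVER['HTTP_IF_NONE_MATCH']` before deciding whether to send the file or a 304.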
I think you have misunderstood. With my code I generate an ETag, not for static content like an image or a JavaScript file, but for dynamic content. A page extracting information from a database does not have to change in order for the dynamic content it outputs to change. With my approach a page won't output content if, say, no new information has been inserted into the database.
RadiantHeart
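The idea in this comment, hashing the generated output and answering with a 304 when the hash is unchanged, can be illustrated with a small sketch. The function names and the stand-in data are hypothetical; the md5-based ETag format is the one used in the script from the question.

```php
<?php
// Hypothetical helpers illustrating an ETag derived from generated output.
function content_etag($contents) {
    return '"' . md5($contents) . '"';
}

// True when the validator sent by the client matches the current output,
// i.e. a 304 Not Modified could be sent instead of the body.
function is_not_modified($contents, $ifNoneMatch) {
    return $ifNoneMatch !== null && $ifNoneMatch === content_etag($contents);
}

$rows = array('first post', 'second post');    // stand-in for database rows
$html = '<ul><li>' . implode('</li><li>', $rows) . '</li></ul>';
$clientTag = content_etag($html);              // what the browser cached earlier

var_dump(is_not_modified($html, $clientTag));                   // same data
var_dump(is_not_modified($html . '<li>new</li>', $clientTag));  // data changed
```

Note that the page still has to be generated on every request in order to compute the hash; only the transfer of the body is saved.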
Ah yes, I see. However, I usually never cache dynamic HTML pages, because that makes no sense and only raises the cost on the server side. If you need to load and compress it **every time** anyway, why not just let it go through the response? Caching usually makes sense for static files only.
BalusC