I have found utilities that can compress HTML, JavaScript, and CSS files individually. However, let's say I have a file called homepage.php with PHP at the top of the page, CSS and JavaScript in the header, and HTML in the body along with more PHP code and perhaps some more JavaScript. Sometimes the JavaScript chunks also have PHP in them (for passing PHP values to JavaScript). Are there any utilities that can handle compressing such pages?

Going through every file, breaking it apart manually, and compressing everything individually can get very tedious. Even something that can just compress the html and css in the files and ignore the javascript and php would be extremely useful. Most of my javascript is in external js files anyway.

+2  A: 

If the idea behind this question is to use less bandwidth, then I suggest you use an Apache output filter (assuming that you are using Apache as your webserver anyway). Just filter all PHP scripts and static HTML pages with, e.g., mod_deflate, and you'll use less bandwidth.

wimvds
A: 

Normally, compression of html, js and css files is done to reduce bandwidth usage as the files are sent to the user's browser, resulting in faster page loads.

Is this your purpose? Why would you want to send php to the browser?

If your purpose is not faster page loads but saving disk space, you could just use gzip or something. But it seems almost pointless.

LarsH
I am not sending PHP to the browser. I want the PHP to remain intact in the file, but the HTML and CSS parts to be compressed. I guess "minified" is the better term here: basically, whitespace and comments being removed. I will use gzip, but I want the file as small as possible to begin with.
@user396404: understood. Really though, I doubt it's worth the effort. We did minification on our js and css, and got very little speed improvement (about 2%) relative to the latency of fetching the page. If you're only planning to do it on one page (the home page), and you really are intent on doing it, I would do it the tedious manual way that you mentioned. If there are many such pages, use @Gordon's method.
LarsH
+1  A: 

Even if there is not a tool that would let you do all minification at once, you could automate this easily with DOM. Just parse the page and find all <script> and <style> elements. Then run their content through third party libs like JsMin or CSSMin and replace the nodes with the output.
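As a rough sketch of the DOM idea, assuming the input is already-rendered HTML (no PHP tags left in it), the following walks all <style> elements with PHP's DOMDocument and swaps in minified content. The function names `naive_css_minify` and `minify_style_elements` are made up for illustration, and `naive_css_minify` is only a stand-in for a real library such as CSSMin, whose actual API is not shown here. The same loop could be repeated for <script> elements with a JavaScript minifier.

```php
<?php
// Hypothetical stand-in for a real CSS minifier: strips /* ... */ comments
// and collapses runs of whitespace. A real library would do much more.
function naive_css_minify(string $css): string
{
    $css = preg_replace('~/\*.*?\*/~s', '', $css);
    $css = preg_replace('/\s+/', ' ', $css);
    return trim($css);
}

// Parse rendered HTML, minify the content of every <style> element with the
// given callable, and return the serialized document.
function minify_style_elements(string $html, callable $minify): string
{
    $dom = new DOMDocument();

    // libxml warns about tags it does not know (e.g. HTML5 elements);
    // collect those warnings instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    // Copy the live node list into an array before modifying the tree.
    $styles = iterator_to_array($dom->getElementsByTagName('style'));

    foreach ($styles as $node) {
        $minified = $minify($node->textContent);

        // Replace the element's children with a single text node.
        while ($node->firstChild) {
            $node->removeChild($node->firstChild);
        }
        $node->appendChild($dom->createTextNode($minified));
    }

    return $dom->saveHTML();
}
```

Note that `saveHTML()` returns a whole document, so a fragment passed in may come back wrapped in a doctype and <html>/<body> tags.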

However, you could also just enable gzip compression in your webserver, which should be more than enough compression for most websites. Or did you run into any traffic limits lately?

Gordon
@Gordon: you're saying to do this transformation on the server, not the browser, right? Any suggestion as to HTML DOM parsing tools to use on the server?
LarsH
@Gordon: also, your suggested method may mess up the PHP that he has embedded in the javascript sections. Unless you just pass <script> element content through unmodified. But somehow the DOM parser would have to know how to skip over the PHP code at the top of the page and throughout.
LarsH
@LarsH it doesn't make sense to do it in the browser. For HTML parsers see http://stackoverflow.com/questions/3577641/best-methods-to-parse-html/3577662#3577662 - if there is PHP in the templates, first process the template, then parse the resulting HTML (it won't have PHP in it anymore) and minify.
Gordon
@Gordon, then you're saying he has to perform this process every time the page is served, rather than once per file.
LarsH
@LarsH No, I am not. I am saying you can do it with DOM. I am not saying anything about a concrete implementation. Whether you do that on every request or persist/cache the processed file is up to you and your application scenario.
Gordon
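To make the exchange above concrete, here is one possible sketch of "process the template first, then minify, and optionally cache the result". It is not something Gordon posted. All three function names are hypothetical, the whitespace collapse is deliberately crude (it would break <pre> blocks and inline JavaScript that uses `//` comments), and caching rendered output like this only makes sense for pages that look the same for every visitor.

```php
<?php
// Hypothetical helper: run a PHP template and capture its output as a string.
function render_template(string $templatePath): string
{
    ob_start();
    include $templatePath;
    return ob_get_clean();
}

// Hypothetical helper: collapse every run of whitespace to a single space.
function collapse_whitespace(string $html): string
{
    return trim(preg_replace('/\s+/', ' ', $html));
}

// Return the minified page, reusing the cached copy while it is at least as
// new as the template. Otherwise render, minify, and refresh the cache.
function minified_page(string $templatePath, string $cachePath): string
{
    if (is_file($cachePath) && filemtime($cachePath) >= filemtime($templatePath)) {
        return file_get_contents($cachePath);
    }

    $html = collapse_whitespace(render_template($templatePath));
    file_put_contents($cachePath, $html);

    return $html;
}
```

Dropping the cache check gives the "on every request" variant LarsH asked about; keeping it gives the "once per file" variant.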
A: 

Thanks for all the feedback, guys. I'll be using Apache to compress the files, but I wanted something to remove all the line breaks in the CSS and HTML segments of the page beforehand. I couldn't find anything that does that while leaving the PHP and JavaScript in a file untouched, so I made my own script to do it.

The following code can definitely be improved and is very, very raw. There are a lot of spots where I can make it more efficient, but this is just a test of an idea. However, it's working well enough to use. Just save it to a PHP file, and set $file_name to the name of your file.

<?php

// Collapse whitespace and strip HTML comments in a plain HTML/CSS segment.
// Note: this also affects <pre>/<textarea> content and conditional comments.
function minify_html_segment($html)
{
    $html = preg_replace('/<!--.*?-->/s', '', $html);
    return preg_replace('/\s+/', ' ', $html);
}

// Walk through the file, copying PHP blocks and <script> elements through
// untouched and minifying everything in between.
function parsefile($file)
{
    $contents = file_get_contents($file);
    if ($contents === false)
    {
        return false;
    }

    $filtered = '';
    $length = strlen($contents);
    $pos = 0;

    while ($pos < $length)
    {
        // Find the next protected region: a PHP open tag or a <script> tag.
        if (!preg_match('/<\?|<script\b/i', $contents, $m, PREG_OFFSET_CAPTURE, $pos))
        {
            // Nothing left to protect: the rest of the file is HTML/CSS.
            $filtered .= minify_html_segment(substr($contents, $pos));
            break;
        }

        // Minify the HTML/CSS that precedes the protected region.
        $start = $m[0][1];
        $filtered .= minify_html_segment(substr($contents, $pos, $start - $pos));

        // Locate the end of the region; an unclosed region runs to end of file.
        $close = ($m[0][0] === '<?') ? '?>' : '</script>';
        $end = stripos($contents, $close, $start);
        $end = ($end === false) ? $length : $end + strlen($close);

        // Copy the PHP or JavaScript through unchanged.
        $filtered .= substr($contents, $start, $end - $start);
        $pos = $end;
    }

    return $filtered;
}

$file_name = '../main.php';
$filtered = parsefile($file_name);

// Show original and minified source side by side; escape so the browser
// displays the markup instead of interpreting it.
echo '<textarea style="width:700px;height:600px">' . htmlspecialchars(file_get_contents($file_name)) . '</textarea>';
echo '&nbsp;&nbsp;<textarea style="width:700px;height:600px">' . htmlspecialchars($filtered) . '</textarea>';

?>

With some tinkering, this code can be modified to iterate through all the files in a directory and save them to another directory as minified versions.
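A minimal sketch of that directory loop might look like the following. The function name `minify_directory` and the directory names in the usage note are assumptions for illustration. The minifier is passed in as a callable that takes a file path and returns the minified source, so a function shaped like `parsefile` above would fit. Only `*.php` files directly inside the source directory are handled; subdirectories are not visited.

```php
<?php
// Minify every .php file in $srcDir with $minify (a callable that takes a
// file path and returns the minified source) and write the result under the
// same file name in $dstDir. Returns the number of files written.
function minify_directory(string $srcDir, string $dstDir, callable $minify): int
{
    if (!is_dir($dstDir) && !mkdir($dstDir, 0777, true)) {
        throw new RuntimeException("Cannot create directory: $dstDir");
    }

    $count = 0;

    // glob() can return false on error, so fall back to an empty list.
    $paths = glob(rtrim($srcDir, '/') . '/*.php') ?: [];

    foreach ($paths as $path) {
        $target = rtrim($dstDir, '/') . '/' . basename($path);
        file_put_contents($target, $minify($path));
        $count++;
    }

    return $count;
}
```

With the script above loaded, a call such as `minify_directory('../pages', '../pages-min', 'parsefile');` would be the intended usage. Writing to a separate directory keeps the readable originals around for editing.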