What is the best-practice way to handle caching of images using PHP?

On upload, each file is renamed to a GUID. The GUID filename is stored in a MySQL database along with the original filename and alt tag.

When an image is placed in an HTML page, it is referenced with a URL such as '/images/get/200x200/{guid}.jpg', which is rewritten to a PHP script. This allows my designers to specify the image size (roughly, since the source image may be smaller).

The PHP script then creates a hash of the size (200x200 in the URL) and the GUID filename. If the file has been generated before (a file named after the hash exists in the application's TMP directory), the script sends that file. If the hashed filename does not exist, the image is created, written to disk, and served in the same manner.
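
The lookup described above could be sketched roughly as follows. The function name, the `$cacheDir` parameter, and the choice of `sha1()` are assumptions made for illustration, not the asker's actual code.

```php
<?php
// Hypothetical helper: builds the cache file path for one resized variant.
// $cacheDir and the use of sha1() are assumptions, not the asker's real code.
function cachedImagePath(string $cacheDir, string $guid, int $width, int $height): string
{
    // Hash every input that changes the output, so each variant gets its own file.
    $key = sha1($guid . '|' . $width . 'x' . $height);

    return rtrim($cacheDir, '/') . '/' . $key . '.jpg';
}

// Usage (sketch):
// $path = cachedImagePath('/path/to/app/tmp', $guid, 200, 200);
// if (!is_file($path)) { /* generate the resized image and write it to $path */ }
```

Any extra setting that affects the output (for example watermark options) would need to be part of the hashed string as well.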

Is this as efficient as it could be? (It also supports watermarking the images, and the watermarking settings are included in the hash as well, but that's out of scope for this.)

A: 

Your approach seems quite reasonable. I would add a mechanism that checks whether the cached version was generated after the last-modified timestamp of the original (source) image file and, if not, regenerates the cached/resized version. This ensures that if the designers change an image, the cache is updated appropriately.
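
A minimal sketch of such a check, assuming both the source image and the cached copy are plain files on disk (the function name and parameters are hypothetical):

```php
<?php
// Returns true when the cached file is missing or older than its source.
// Both paths are hypothetical; pass whatever your application uses.
function cacheIsStale(string $sourcePath, string $cachePath): bool
{
    if (!is_file($cachePath)) {
        return true; // nothing cached yet
    }

    // Regenerate when the source was modified after the cached copy was written.
    return filemtime($cachePath) < filemtime($sourcePath);
}
```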

sgibbons
Good advice, but... If the source is changed by a new image being uploaded (through the admin interface), then a new GUID is generated and the cached filename will no longer match. Images shouldn't be uploaded with (s)FTP, but we all know about assumptions. ;) Will look at implementing anyway.
Chris Hawes
A: 

That sounds like a solid way to do it. The next step may be to go beyond PHP/MySQL.

Perhaps, tweak your headers:

If you're using PHP to send MIME types, you might also send 'Keep-Alive' and 'Cache-Control' headers so that clients reuse connections and keep your images cached longer, which takes some of the load off of PHP/MySQL.
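
As a rough sketch, the caching headers could be built like this. The function name and the one-week default lifetime are arbitrary assumptions; pick a lifetime that suits how often your images change.

```php
<?php
// Builds long-lived caching headers for an image response.
// The one-week default lifetime is an arbitrary assumption.
function imageCacheHeaders(int $maxAgeSeconds = 604800, ?int $now = null): array
{
    $now = $now ?? time();

    return [
        'Content-Type: image/jpeg',
        'Cache-Control: public, max-age=' . $maxAgeSeconds,
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $maxAgeSeconds) . ' GMT',
    ];
}

// Usage inside the image script, before any output is sent:
// foreach (imageCacheHeaders() as $line) { header($line); }
```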

Also consider Apache modules for caching, such as mod_expires.

Oh, one more thing, how much control do you have over your server? Should we limit this conversation to just PHP/MySQL?

Pete Karl II
The application is hosted on a VPS, so I have root access. mod_expires is installed, so it looks like a simple .htaccess tweak could reduce server load a little. Good tip, thanks.
Chris Hawes
A: 

phpThumb is a framework that generates resized images/thumbnails on the fly. It also implements caching and it's very easy to implement.

The markup to request a resized image is: <img src="/phpThumb.php?src=/path/to/image.jpg&w=200&h=200" alt="thumbnail"/>. This gives you a 200 x 200 thumbnail.

It also supports watermarking.

Check it out at: http://phpthumb.sourceforge.net/

+4  A: 

One note worth adding is to make sure your code does not generate "unauthorized" sizes of these images.

So the following URL will create a 200x200 version of image 1234 if one doesn't already exist. I'd highly suggest you make sure that the requested URL contains image dimensions you support.

/images/get/200x200/1234.jpg

A malicious person could start requesting random URLs, always altering the height & width of the image. This would cause your server some serious issues b/c it will be sitting there, essentially under attack, generating images of sizes you do not support.

/images/get/0x1/1234.jpg
/images/get/0x2/1234.jpg
...
/images/get/0x9999999/1234.jpg
/images/get/1x1/1234.jpg
...
etc

Here's a random snip of code illustrating this:

<?php

    $pathOnDisk = getImageDiskPath($_SERVER['REQUEST_URI']);

    if(file_exists($pathOnDisk)) {
        // send the cached file with the image MIME type
        header('Content-Type: image/jpeg');
        readfile($pathOnDisk);
        exit;
    } else {
        $matches = array();
        $ok = preg_match(
            '/\/images\/get\/(\d+)x(\d+)\/(\w+)\.jpg/', 
            $_SERVER['REQUEST_URI'], $matches);

        if(! $ok) {
            // invalid url
            handleInvalidRequest();
        } else {
            list(, $width, $height, $guid) = $matches;

            // you should do this!
            if(isSupportedSize($width, $height)) {
                // size is supported. all good
                // generate the resized image, save it & output it
            } else {
                // invalid size requested!!!
                handleInvalidRequest();
            }
        }
    }

    // snip
    function handleInvalidRequest() {
        // do something w/ invalid request          
        // show a default graphic, log it etc
    }
?>
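
The `isSupportedSize()` helper is left undefined in the snippet above. One possible sketch is a simple whitelist; the sizes listed here are placeholders, not values from the original post.

```php
<?php
// Hypothetical whitelist; replace with the sizes your designers actually use.
const SUPPORTED_SIZES = ['100x100', '200x200', '640x480'];

function isSupportedSize($width, $height): bool
{
    // Cast to int so strings captured by preg_match() are normalized first.
    return in_array((int) $width . 'x' . (int) $height, SUPPORTED_SIZES, true);
}
```

A fixed list keeps the number of files that can ever be generated per image bounded, which a plain min/max range check does not.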
arin sarkissian
Using die() is horrible code. You basically bail out, dumping a single line of text to the client, without doing any error correction. If this example is used in production, replace die() calls with replacement images or helpful errors.
Dan Udey
I agree with you. It's just sample code to get the gist across.
arin sarkissian
This is already implemented in my code (max and min image sizes with a placeholder image if anything goes wrong). Upvoted because thinking about PHP security is always good!
Chris Hawes
+15  A: 

I would do it in a different manner.

Problems:
1. Having PHP serve the files out is less efficient than it could be.
2. PHP has to check the existence of files every time an image is requested.
3. Apache is far better at this than PHP will ever be.

There are a few solutions here.

You can use mod_rewrite on Apache. It's possible to use mod_rewrite to test to see if a file exists, and if so, serve that file instead. This bypasses PHP entirely, and makes things far faster. The real way to do this, though, would be to generate a specific URL schema that should always exist, and then redirect to PHP if not.

For example:

RewriteCond %{REQUEST_URI} ^/images/cached/
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteRule (.*) /images/generate.php?$1 [L]

So if a client requests /images/cached/<something> and that file doesn't exist already, Apache will internally rewrite the request to /images/generate.php?/images/cached/<something>. This script can then generate the image, write it to the cache, and then send it to the client. From then on, the PHP script is never called except for new images.
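
Inside generate.php, the rewritten path arrives as the query string and should be validated before anything is written to disk. A minimal sketch, assuming a "<width>x<height>/<guid>.jpg" layout under /images/cached/ (that layout and the function name are assumptions borrowed from the question, not part of this answer):

```php
<?php
// Parses the path that the rewrite rule passes in the query string.
// The "<width>x<height>/<guid>.jpg" layout under /images/cached/ is an
// assumption; adjust the pattern to your own URL scheme.
function parseCachedImageRequest(string $query): ?array
{
    $pattern = '#^/?images/cached/(\d{1,4})x(\d{1,4})/([A-Za-z0-9-]+)\.jpg$#';

    if (!preg_match($pattern, $query, $m)) {
        return null; // not a URL we know how to generate
    }

    return ['width' => (int) $m[1], 'height' => (int) $m[2], 'guid' => $m[3]];
}

// Usage in generate.php (sketch):
// $request = parseCachedImageRequest($_SERVER['QUERY_STRING'] ?? '');
// if ($request === null) { http_response_code(404); exit; }
```

Rejecting anything that does not match the pattern also keeps path tricks such as "../" out of the cache directory.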

Use caching. As another poster said, use things like mod_expires, Last-Modified headers, etc. to respond to conditional GET requests. If the client doesn't have to re-request images, page loads will speed up dramatically, and load on the server will decrease.
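
For images that PHP still serves itself, a conditional GET can be handled along these lines. This is a sketch only; the function name is hypothetical and it compares against the file's modification time.

```php
<?php
// Decides whether a 304 Not Modified can be sent instead of the image.
// $ifModifiedSince is the raw If-Modified-Since request header, or null.
function isNotModified(?string $ifModifiedSince, int $fileMtime): bool
{
    if ($ifModifiedSince === null) {
        return false;
    }

    $since = strtotime($ifModifiedSince);

    return $since !== false && $since >= $fileMtime;
}

// Usage (sketch), before sending the image body:
// $mtime = filemtime($path);
// header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $mtime) . ' GMT');
// if (isNotModified($_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? null, $mtime)) {
//     http_response_code(304);
//     exit;
// }
```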

For cases where you do have to send an image from PHP, you can use mod_xsendfile to do it with less overhead. See the excellent blog post from Arnold Daniels on the issue, but note that his example is for downloads. To serve images inline, take out the Content-Disposition header (the third header() call).
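
With mod_xsendfile, the PHP side reduces to sending a header that names the file. A minimal sketch, assuming the module is installed and enabled for this path (the function name is hypothetical):

```php
<?php
// Builds the headers that hand the file transfer over to mod_xsendfile.
// Assumes mod_xsendfile is installed and enabled (XSendFile On) for this path.
function xSendfileHeaders(string $absolutePath): array
{
    return [
        'Content-Type: image/jpeg',
        'X-Sendfile: ' . $absolutePath, // Apache reads the file; PHP sends no body
    ];
}

// Usage (sketch):
// foreach (xSendfileHeaders($pathOnDisk) as $line) { header($line); }
// exit;
```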

Hope this helps - more after my migraine clears up.

Dan Udey
Superb method, thanks. Combined with CakePHP routes, a very elegant solution.
Chris Hawes
Excellent method and exactly what I need right now. I wish I could upvote this twice.
Pekka
Dude - NICE!!! +1 for the Heureka moment
Jens Roland
A: 

This seems like a great post, but my problem remains unsolved. I don't have access to .htaccess with my hosting provider, so tweaking Apache is out of the question. Is there really a way to set the Cache-Control header for images?

A: 

I've managed to do this simply using a redirect header in PHP:

if (!file_exists($filename)) {  

    // *** Insert code that generates image ***

    // Content type
    header('Content-type: image/jpeg'); 

    // Output
    readfile($filename); 

} else {
    // Redirect
    $host  = $_SERVER['HTTP_HOST'];
    $uri   = rtrim(dirname($_SERVER['PHP_SELF']), '/\\');
    $extra = $filename;
    header("Location: http://$host$uri/$extra");
}
+4  A: 

There are two typos in Dan Udey's rewrite example (and I can't comment on it). It should rather be:

RewriteCond %{REQUEST_URI} ^/images/cached/
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteRule (.*) /images/generate.php?$1 [L]

Regards.

Sensi
Thanks, I've updated the code in mine just in case they use that instead of your fixed version.
Dan Udey
A: 

Instead of keeping the file address in the db I prefer adding a random number to the file name whenever the user logs in. Something like this for user 1234: image/picture_1234.png?rnd=6534122341

If the user submits a new picture during the session I just refresh the random number.

A GUID solves the cache problem 100%. However, it makes it somewhat harder to keep track of the picture files. With this method there is a chance the user might see the same (stale) picture again at a future login, but the odds are low if you generate your random number from a billion numbers.
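
A rough sketch of this approach, assuming the random value is kept in the session (the function name and the session key are hypothetical):

```php
<?php
// Builds a picture URL with a cache-busting random value appended.
// The URL layout mirrors the example above.
function pictureUrl(int $userId, int $rnd): string
{
    return 'image/picture_' . $userId . '.png?rnd=' . $rnd;
}

// Usage (sketch); random_int() requires PHP 7 or later:
// session_start();
// if (!isset($_SESSION['picture_rnd']) || $userUploadedNewPicture) {
//     $_SESSION['picture_rnd'] = random_int(0, 999999999);
// }
// echo pictureUrl(1234, $_SESSION['picture_rnd']);
```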

Haluk