Background

I am writing and using a very simple CGI-based (Perl) content management tool for two pro-bono websites. It gives the website administrator HTML forms for events, where they fill in the fields (date, place, title, description, links, etc.) and save. On that form I allow the administrator to upload an image related to the event. The HTML page displaying the form also shows a preview of the uploaded picture (HTML img tag).

The Problem

The problem happens when the administrator wants to change the picture. He would just have to hit the "browse" button, pick a new picture and press ok. And this works fine.

Once the image is uploaded, my back-end CGI handles the upload and reloads the form properly.

The problem is that the image shown does not get refreshed. The old image is still shown, even though the database holds the right image. I have narrowed it down to the fact that the IMAGE IS CACHED in the web browser. If the administrator hits the Firefox/Explorer/Safari RELOAD button, everything gets refreshed fine and the new image appears.

My Solution - Not Working

I am trying to control the cache by sending an HTTP Expires header with a date very far in the past.

Expires: Mon, 15 Sep 2003 1:00:00 GMT

Remember that I am on the administrative side and I don't really care if the pages take a little longer to load because they are always expired.

But, this does not work either.

Notes

When uploading an image, its filename is not kept in the database. It is renamed to Image.jpg (to simplify things when using it). When replacing the existing image with a new one, the name doesn't change either. Only the content of the image file changes.

The webserver is provided by the hosting service/ISP. It uses Apache.

Question

Is there a way to force the web browser to NOT cache things from this page, not even images?

I am toying with the option of actually saving the filename in the database. This way, if the image is changed, the src of the IMG tag will also change. However, this requires a lot of changes throughout the site and I would rather not do it if there is a better solution. Also, this will still not work if the new image uploaded has the same name (say the image is photoshopped a bit and re-uploaded).

+11  A: 

Simple fix: Attach a random query string to the image:

<img src="foo.cgi?random=323527528432525.24234" alt="">

What the HTTP RFC says:

Cache-Control: no-cache

But that doesn't work that well :)

Armin Ronacher
A: 

With the potential for badly behaved transparent proxies between you and the client, the only way to totally guarantee that images will not be cached is to give them a unique URI, for example by adding a timestamp as a query string or as part of the path.

If that timestamp corresponds to the last update time of the image, then you can cache when you need to and serve the new image at just the right time.
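A minimal Python sketch of that idea, assuming the image sits on disk where the page generator can read its modification time. The function name `versioned_src` and the query parameter `v` are made up for illustration; in the asker's Perl tool, `(stat $file)[9]` would give the same modification time.

```python
import os

def versioned_src(url_path, file_path):
    """Return url_path with the file's last-modified time appended as a query string."""
    mtime = int(os.path.getmtime(file_path))  # whole seconds since the epoch
    return f"{url_path}?v={mtime}"
```

The returned value would go into the img tag's src attribute. The URL only changes when the file is replaced, so the browser can keep caching an unchanged image.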

+14  A: 

Armin Ronacher has the correct idea. The problem is random strings can collide. I would use:

<img src="picture.jpg?1222259157.415" alt="">

where "1222259157.415" is the current time on the server. (Note: I used Python's time.time() to generate that.)

epochwolf
One important addition is that you can never _force_ a browser to do anything. All you can do is make friendly suggestions. It's up to the browser and the user to actually follow those suggestions. A browser is free to ignore this, or a user could override the defaults.
Joel Coehoorn
Joel, you would have been better off adding that in your own answer.
epochwolf
+1  A: 

You could write a proxy script for serving images, though that's a bit more work. Something like this:

HTML:

<img src="image.php?img=imageFile.jpg&some-random-number-262376" />

Script:

// PHP
// IMG_PATH is assumed to be a constant defined elsewhere: the image directory, with a trailing slash
// basename() strips any directory part, so "../" in the parameter cannot escape IMG_PATH
$name = isset( $_GET['img'] ) ? basename( $_GET['img'] ) : '';
if( $name !== '' && is_file( IMG_PATH . $name ) ) {

  // read contents
  $img = file_get_contents( IMG_PATH . $name );

  // no-cache headers - complete set
  // these copied from php.net/header, tested myself - works
  header("Expires: Sat, 26 Jul 1997 05:00:00 GMT"); // Some time in the past
  header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
  header("Cache-Control: no-store, no-cache, must-revalidate");
  header("Cache-Control: post-check=0, pre-check=0", false); // false = add a second header instead of replacing
  header("Pragma: no-cache");

  // image related headers
  header('Accept-Ranges: bytes');
  header('Content-Length: '.strlen( $img )); // How many bytes we're going to send
  header('Content-Type: image/jpeg'); // or image/png etc

  // actual image
  echo $img;
  exit();
}

Actually, either the no-cache headers or the random number in the image src should be sufficient on its own, but since we want to be bulletproof, the example uses both.

Yours is a good solution, except that Pragma is not a response header.
Piskvor
+2  A: 

When uploading an image, its filename is not kept in the database. It is renamed to Image.jpg (to simplify things when using it).

Change this, and you've fixed your problem. I use timestamps, as with the solutions proposed above: Image-<timestamp>.jpg

Presumably, whatever problems you're avoiding by keeping the same filename for the image can be overcome, but you don't say what they are.
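As a rough Python sketch of the naming scheme, assuming a Unix timestamp in whole seconds is precise enough (two uploads within the same second would get the same name). The helper name `timestamped_name` is hypothetical, and the generated name would have to be stored with the event record so the page can reference it.

```python
import time

def timestamped_name(now=None):
    """Build a name such as Image-<timestamp>.jpg from a Unix timestamp."""
    # Use the supplied time if given, otherwise the current server time.
    ts = int(time.time() if now is None else now)
    return f"Image-{ts}.jpg"
```

The upload handler would save the new file under this name instead of overwriting Image.jpg, and could delete the previous file afterwards.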

AmbroseChapel
+1  A: 

I assume the original question concerns images stored alongside some text information. If you have access to that data when generating the src=... URL, consider storing and using the CRC32 of the image bytes instead of a meaningless random number or timestamp. Then, when a page with many images is displayed, only the updated images will be reloaded. If storing the CRC is impossible, it can be computed and appended to the URL at runtime.
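A small Python sketch of the runtime variant, using the standard library's `zlib.crc32`. The function name `crc_src` and the query parameter `crc` are illustrative only, and the image bytes are assumed to be already loaded (from a file or the database).

```python
import zlib

def crc_src(url_path, image_bytes):
    """Append the CRC32 of the image bytes, as 8 hex digits, to the URL."""
    checksum = zlib.crc32(image_bytes) & 0xFFFFFFFF  # force an unsigned 32-bit value
    return f"{url_path}?crc={checksum:08x}"
```

Because the value depends only on the content, re-uploading an identical file keeps the same URL, while any edited image almost always gets a new one. CRC32 is only 32 bits wide, so a collision between two different images is possible, though unlikely.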

jbw