A web app I'm developing uses lots of asynchronously loaded images that are often modified over time while their URLs are preserved. There are several problems with this:

  1. If I do not serve the images with caching explicitly disabled in the HTTP headers, users often receive an out-of-date version of an image, but disabling caching substantially increases server load.
    How can I take the cache control away from the browser and manually evaluate if I should use the cached image or reload it from the server?

  2. Since there are many separate images to load, I also parallelize image downloads over different hostnames of the form imageNN.example.com (e.g. image01.example.com, image02.example.com; all these hostnames resolve to the same physical server). Since the NN part of the hostname is generated randomly, I also get cache misses in cases where I could have retrieved the up-to-date image from the browser cache. Should I abandon this practice and replace it with something else?

  3. What cache control techniques and further reading material would you recommend?

A: 
  1. To force a load, add a nonsense parameter to the URL

     <img src='http://whatever/foo.png?x=random'>
    

    where "random" would be something like a millisecond timestamp. Now, if what you want is for the image to be reloaded only when it has changed, then you have to make sure your server sets "ETag" values for the images, sends appropriate expiration headers, and honors "If-Modified-Since" request headers. Ultimately, you can't take cache control away from the browser in any way other than through your HTTP headers.

  2. Instead of generating NN randomly, generate it from a hash of the image name. That way the same image name will always map to the same hostname, and you'll still have images distributed across them.

  3. I don't have a good suggestion but web implementation advice is abundant on the Internet, so I'd say start with Google.
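
Points 1 and 2 can be combined into one URL-building helper. What follows is only a rough sketch, not something from a real codebase: the function names, the `v` query parameter, and the idea that the app already knows a version token for each image (for example its last-modified timestamp) are all assumptions made for illustration. The hostname is derived from a hash of the image name, so the same image always maps to the same host, and the query parameter changes only when the image itself changes.

```typescript
// Hypothetical helpers for illustration; none of these names come from the thread.

// Deterministic string hash (djb2-style), kept in the unsigned 32-bit range.
function hashString(s: string): number {
  let h = 5381;
  for (let i = 0; i < s.length; i++) {
    h = ((h * 33) ^ s.charCodeAt(i)) >>> 0;
  }
  return h;
}

// Map an image name to one of `shardCount` hostnames (image01 ... imageNN).
// The same name always yields the same hostname, so the browser cache is reused.
function shardHost(imageName: string, shardCount: number): string {
  const n = (hashString(imageName) % shardCount) + 1;
  const nn = n < 10 ? "0" + n : String(n);
  return "image" + nn + ".example.com";
}

// Build the full URL. `version` is assumed to change only when the image
// changes, so unchanged images keep a stable, cacheable URL.
// `imageName` is assumed to be a bare file name without slashes.
function imageUrl(
  imageName: string,
  version: string | number,
  shardCount: number
): string {
  const host = shardHost(imageName, shardCount);
  return (
    "http://" + host + "/" + encodeURIComponent(imageName) +
    "?v=" + encodeURIComponent(String(version))
  );
}
```

With this scheme the images can be served with long expiration times, because a changed image gets a new URL rather than relying on the browser to revalidate the old one.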

Pointy
(1) But wouldn't that effectively disable caching for this image? I.e. if I add a random nonsense parameter and refresh the page, the image will not be loaded from cache since this parameter will change, obviously. Besides, it's still unclear to me how to check if the image on the server has been modified and we need to force its reload.
David Parunakian
That's what you asked, I thought. I'll add to my answer.
Pointy
(1) Only change the _random_ nonsense parameter when the image does change?
w3d
Right - I interpreted the question as meaning that he *knew* the image was changed. If, however, he wants to automate that more and the images are static files, then he'd use the cache control headers.
Pointy
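
For the header-based route mentioned in the last comment, the server-side decision might look roughly like the sketch below. It is written as a pure function so it stays independent of any particular web framework; the function and type names are hypothetical, and the ETag is assumed to be computed elsewhere (for example from the file's content or modification time). The browser sends the ETag it has cached in `If-None-Match`, and the server answers `304 Not Modified` without a body when it still matches.

```typescript
// Hypothetical types and names, for illustration only.
interface ConditionalResult {
  status: 200 | 304;
  headers: Record<string, string>;
  sendBody: boolean;
}

// Compare the client's If-None-Match value against the current ETag.
// Simplification: splits on commas and ignores the weak "W/" prefix.
function etagMatches(ifNoneMatch: string, currentEtag: string): boolean {
  if (ifNoneMatch.trim() === "*") return true;
  const strip = (t: string): string => t.trim().replace(/^W\//, "");
  const current = strip(currentEtag);
  return ifNoneMatch.split(",").some((t) => strip(t) === current);
}

// Decide how to answer an image request.
// "no-cache" lets the browser store the image but makes it revalidate
// with the server before each reuse.
function respondToImageRequest(
  ifNoneMatch: string | undefined,
  currentEtag: string
): ConditionalResult {
  const headers: Record<string, string> = {
    "ETag": currentEtag,
    "Cache-Control": "no-cache",
  };
  if (ifNoneMatch !== undefined && etagMatches(ifNoneMatch, currentEtag)) {
    // Cached copy is still current: no image bytes are sent.
    return { status: 304, headers, sendBody: false };
  }
  // No cached copy, or it is stale: send the full image.
  return { status: 200, headers, sendBody: true };
}
```

Every image view still costs one request, but an unchanged image costs only a small header exchange rather than a full download.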