The Scenario

I am building a custom CMS for a communications application. Users can upload images to the server and later reference them in posts they write using Markdown. Images are stored in a database and referenced with a URL of the form images/{id}. A custom image handler retrieves them and sets a far-future Expires header so they aren't fetched over and over again. Storing the images in the file system is not an option, per the customer.

Posts are stored in the database as markdown and as html for performance.

Markdown

###Header
Lorem Ipsum dolor sit amet.  ![funny cat](images/25)

HTML

<h3>Header</h3>
<p>
  Lorem Ipsum dolor sit amet. <img alt="funny cat" src="images/25" />
</p>

The Problem

These images are also editable. From a caching perspective this presents a problem. When an image is edited I need to ensure that the browser gets the latest version. I have come up with the following solutions, all of which I have found to be lacking.

Possible Solutions

Version Field

Store a version field with the image. Increment it when the image is edited, producing URLs of the form images/{id}/version/{version}.

This is nice if image urls are always generated from the database. However, I am storing urls in posts as text and would have to preprocess possibly large swaths of text for these requests. Also, linking an image would be troublesome since a url becomes stale after an image is edited. This is probably not a good idea.
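To make the preprocessing cost concrete, here is a minimal sketch of what rewriting the stored HTML might look like. It assumes Python purely for illustration (the question does not name a language), and `add_versions` and the `versions` lookup are hypothetical names, with the lookup standing in for a database query.

```python
import re

# Matches src="images/25" style references. The URL shape comes from the
# question; anything already carrying /version/... is not matched.
IMAGE_SRC = re.compile(r'(src=")images/(\d+)(")')


def add_versions(html: str, versions: dict[int, int]) -> str:
    """Rewrite images/{id} to images/{id}/version/{version} for known ids.

    `versions` maps image id -> current version (hypothetical; in practice
    this would come from the image table).
    """
    def replace(match: re.Match) -> str:
        image_id = int(match.group(2))
        version = versions.get(image_id)
        if version is None:
            return match.group(0)  # unknown image: leave the tag untouched
        return (f"{match.group(1)}images/{image_id}"
                f"/version/{version}{match.group(3)}")

    return IMAGE_SRC.sub(replace, html)
```

A pass like this would have to run over every stored post on each render, or over every post that references an image whenever that image is edited, which is the cost I am worried about.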

New Url

When an image is edited, store it as a new entry in the database.

This means no version upkeep, but old links and old posts suffer from the same issues. They're never updated. The performance would be good, but the problem remains.

304 - Not Modified

Store a last-edited field for each image. When a conditional request comes in and the image has not changed since the client's copy was fetched, return HTTP status 304 - Not Modified, reducing bandwidth usage.

This seems like the best solution I've got so far. The image is cached unless edited, but I've never used this approach before. If there are 50 images on the page there are still 50 requests to the server. Bandwidth will be reduced, but the latency remains. Is this approach worth that cost?

The Question

I have very little experience in this area and I'm hoping all of you have more. Are any of the above solutions viable? If so what is your experience with them? If not what is a better approach?

A: 

Put the images on a disk and let your web server deal with it.

The file can keep the same name (and hence URL), and the web server can deal with the modified headers from the browsers. Also, browsers can be a little happier with URLs with known extensions.

Putting images in a database and serving them via a script is not generally a good idea. Just have a reference to the URL in the database.

Jeremy French
This is definitely an option, but makes scaling to multiple servers much more difficult.
brad
A: 

Put the images as files on your server and reference them in your CMS via regular URLs with an added parameter that equals their version (e.g. http://www.example.com/images/lolcat.jpg?20055). If your webserver is correctly configured this will ensure that clients will cache the images and always use the correct version.

Ronny Vindenes
This is actually what I'm doing currently (only from the database). The only problem is that for stored posts I will need to do preprocessing, which could be rather costly.
brad
+1  A: 

As you've said, Conditional GET is the best option, as you only need to check one request header (If-None-Match or If-Modified-Since) and send back the corresponding status code (200 or 304) and a header (ETag or Last-Modified). Then the browser will deal with the image cache.
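The check described above could be sketched roughly as follows. This is a framework-neutral Python illustration, not tied to any particular server: `conditional_response`, `image_id`, and `load_image` are hypothetical names, the ETag format is an arbitrary choice, and the last-edited value is assumed to be a timezone-aware UTC datetime. Header names are assumed to arrive in the exact casing shown.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_response(request_headers, image_id, last_edited, load_image):
    """Return (status, headers, body) for a request for one image.

    `load_image` is only called when a full 200 response is needed, so a
    304 never has to read the image bytes from the database.
    """
    last_edited = last_edited.replace(microsecond=0)  # HTTP dates are 1s resolution
    # Assumed ETag scheme: id plus edit timestamp, so it changes on every edit.
    etag = f'"{image_id}-{int(last_edited.timestamp())}"'
    headers = {
        "ETag": etag,
        "Last-Modified": format_datetime(last_edited, usegmt=True),
    }

    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates or "*" in candidates:
            return 304, headers, b""
        return 200, headers, load_image(image_id)

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable date: fall through to a full response
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_edited <= since:
                return 304, headers, b""

    return 200, headers, load_image(image_id)
```

Deriving the ETag from the stored edit time, rather than hashing the image, means the 304 path only needs the small last-edited column.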

Also, depending on your server framework, if you store your images in the database you could cache them in server memory the first time you need them (for a finite amount of time, so you don't consume all your server memory). This helps when you need to read the same image repeatedly, saving you database queries. Of course, you need to remove an image from the server cache whenever it is edited.
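As a rough illustration of that server-side cache, here is a minimal in-memory sketch in Python. `ImageCache` and `load_image` are hypothetical names; many frameworks ship their own cache that would be a better fit. This version is not thread-safe and does not bound the number of entries, so treat it only as an outline of the idea.

```python
import time


class ImageCache:
    """Keeps image bytes in memory for a limited time after first use."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # image_id -> (expires_at, image_bytes)

    def get(self, image_id, load_image):
        """Return cached bytes, calling load_image(image_id) on a miss."""
        now = self._clock()
        entry = self._entries.get(image_id)
        if entry is not None and entry[0] > now:
            return entry[1]
        data = load_image(image_id)  # e.g. the database query
        self._entries[image_id] = (now + self._ttl, data)
        return data

    def invalidate(self, image_id):
        """Drop an image from the cache; call this when it is edited."""
        self._entries.pop(image_id, None)
```

The edit handler would call `invalidate` right after saving the new image, so the next request reloads it from the database.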

EDIT: thanks for accepting the answer. In case you need it, here's a nice explanatory link: HTTP Conditional Get for RSS Hackers.

Leandro López