There seem to be two distinct ways to implement conditional requests using HTTP headers, both of which can be used for caching, range requests, concurrency control and so on:

  1. If-Unmodified-Since and If-Modified-Since, where the client sends a timestamp of the resource.
  2. If-Match and If-None-Match, where the client sends an ETag for its representation of the resource.

In both cases, the client sends a piece of information it has about the resource, which allows the server to determine whether the resource has changed since the client last saw it. The server then decides whether to execute the request depending on the conditional header supplied by the client.
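To make the comparison concrete, here is a minimal sketch of how a server might evaluate a conditional GET. The function name `status_for_get` and the plain-dict `headers` argument are assumptions for illustration, not part of any framework. It handles strong ETags only (no `W/` weak-validator handling) and, as I understand the HTTP spec, lets If-None-Match take precedence over If-Modified-Since when both are present.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def status_for_get(headers, current_etag, last_modified):
    """Return 304 if the client's cached copy is still valid, else 200.

    headers: dict of request headers (hypothetical plain dict).
    current_etag: the server's current ETag, including its quotes.
    last_modified: timezone-aware datetime of the last change.
    """
    if_none_match = headers.get("If-None-Match")
    if if_none_match is not None:
        # ETags are opaque, so the only possible check is equality.
        client_tags = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in client_tags or current_etag in client_tags:
            return 304
        return 200

    if_modified_since = headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return 200  # unparseable date: ignore the condition
        if since.tzinfo is None:
            # Treat a date without zone information as UTC.
            since = since.replace(tzinfo=timezone.utc)
        # Timestamps are ordered, so "not newer than" is a valid check.
        if last_modified <= since:
            return 304
    return 200
```

With this sketch, a request carrying `If-None-Match: "abc"` against a current tag of `"abc"` yields 304, and so does an `If-Modified-Since` date at or after the stored modification time.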

I don't understand why two separate approaches are available. Surely, ETags supersede timestamps, since the server could quite easily choose to generate ETags from timestamps.

So, my questions are:

  • In which scenarios might you favour If-Unmodified-Since/If-Modified-Since over ETags?
  • In which scenarios might you need both?
+1  A: 

Simple reason: backward-compatibility.

Max Shawabkeh
Ok. Are you saying that If-Unmodified-Since and If-Modified-Since are legacy, and therefore in a scenario where you have full control over the headers sent by the client and the headers processed by the server, you should only use ETags?
rewbs
For validation and/or conditional requests, ETag is the way to go. However, Last-Modified headers have semantic meaning outside of conditional requests and should be sent too if you have control over everything. See http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.3.4
Max Shawabkeh
+3  A: 

I once pondered the same thing, and realized that there is one difference that is quite important: Dates can be ordered, ETags can not.

This means that if we know a resource was modified a year ago and never since, we can correctly answer an If-Unmodified-Since request for any date within the last year and confirm that, yes, it has been unmodified since that date.

An ETag is only comparable for identity: either it is the same or it is not. Take the same resource as above, and suppose that during the year the docroot was moved to a new disk and filesystem, giving all files new inodes but preserving modification dates, and that someone had based the ETags on the files' inode numbers. Then we can't say that the old ETag is still okay without keeping a log of past ETags that are still valid.
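A small sketch of the difference, with made-up dates and hypothetical inode-based tag values (the helper names are mine, purely for illustration):

```python
from datetime import datetime, timezone


def unmodified_since(last_modified, client_date):
    # Dates are ordered: every client date at or after the last
    # modification satisfies If-Unmodified-Since.
    return last_modified <= client_date


def etag_matches(current_etag, client_etag):
    # ETags are opaque: equality is the only meaningful comparison.
    return current_etag == client_etag


# Resource last modified on 1 January 2009 and never since.
last_modified = datetime(2009, 1, 1, tzinfo=timezone.utc)

# Arbitrary later dates all pass, without the server keeping any history.
for month in (2, 6, 11):
    client_date = datetime(2009, month, 1, tzinfo=timezone.utc)
    assert unmodified_since(last_modified, client_date)

# Hypothetical inode-based tags: the content is unchanged, but the tag
# changed when the docroot moved, so the client's old tag is rejected.
assert not etag_matches('"inode-9001"', '"inode-1234"')
```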

So I don't see them as one obsoleting the other. They are for different situations. Either you can easily get a Last-Modified date of all the data in the page you're about to serve, or you can easily get an ETag for what you will serve.

If you have a dynamic webpage with data from lots of db lookups, it might be difficult to tell what the Last-Modified date is unless your database stores lots of modification dates. But you can always compute an MD5 checksum of the rendered page.
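Something along these lines would do for the checksum approach. This is only a sketch: `etag_for` is a name I made up, and it assumes the rendered page is already available as bytes.

```python
import hashlib


def etag_for(body: bytes) -> str:
    """Build a quoted ETag value from an MD5 checksum of the rendered body."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


page_v1 = b"<html><body>Hello</body></html>"
page_v2 = b"<html><body>Hello again</body></html>"

# Identical output always gets the same tag; changed output gets a new one.
assert etag_for(page_v1) == etag_for(page_v1)
assert etag_for(page_v1) != etag_for(page_v2)
```

The tag depends only on the bytes served, not on where or when they were produced, but the page still has to be rendered before the tag can be computed.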

When supporting these cache protocols I definitely go for only one of them, never both.

Christian
Upvoted, that's a really good point. But is it always true that ETags are only comparable for identity? Doesn't that depend on what the server uses as ETag values? For example, as I understand it, a server could choose to use raw timestamps as ETags for some resources. This way, the ETags would be ordered for those resources. So if ETags can effectively *be* timestamps, what is the point of a separate pair of headers that only work with timestamps?
rewbs
You need to add proxies into the equation. If your ETag values have semantic meaning, proxies don't know about it.
Christian
A: 

The ETag is server-specific: if your application is on a web farm, you would end up with different ETags on the different servers of the web farm. So in such a case If-Unmodified-Since/If-Modified-Since should be used. You can take a look at the more detailed explanation on the YSlow rules page.

Pavel Nikolov
That is not generally true, even though you might end up with this if e.g. you have identical files on different hard drives served by different web servers.
Stefan Tilkov
As I said, when using a web farm (multiple servers) - the ETags are different on the different servers.
Pavel Nikolov
They are only different if it has been implemented that way. YSlow mentions a default implementation in Apache that uses file inodes as part of the ETag, which is what makes it so. The content of the ETag is opaque, and a better implementation could use a checksum over the content served.
Christian
@Christian - You are right. Comparing the content checksum is the best way to find whether it has changed.
Pavel Nikolov
+1  A: 

There is one rather big difference: I can only use ETags if I have already asked the server for one in the past. Timestamps, OTOH, I can make up as I go along.
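For example, a client that has never talked to the server can still ask "only send this if it changed in the last day". A rough sketch, where the function name `if_modified_since_header` and the one-day window are just assumptions for illustration:

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def if_modified_since_header(now, max_age):
    """Build an If-Modified-Since header from a date the client picks itself."""
    cutoff = now - max_age
    # usegmt=True emits the "GMT" suffix used in HTTP dates; it requires
    # a timezone-aware datetime in UTC.
    return {"If-Modified-Since": format_datetime(cutoff, usegmt=True)}


now = datetime(2010, 1, 2, 12, 0, 0, tzinfo=timezone.utc)
headers = if_modified_since_header(now, timedelta(days=1))
assert headers == {"If-Modified-Since": "Fri, 01 Jan 2010 12:00:00 GMT"}
```

There is no equivalent for If-None-Match: the client cannot invent an ETag the server would recognise.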

Jörg W Mittag