After searching a lot, reading every tutorial I could find and asking some questions here, I've finally managed to respond correctly (at least I think so) to if-none-match and if-modified-since HTTP requests.

As a quick recap, this is what I do on every cacheable page:

session_cache_limiter('public'); //Cache on clients and proxies
session_cache_expire(180); //3 hours
header('Content-Type: ' . $documentMimeType . '; charset=' . $charset);
header('ETag: "' . $eTag . '"'); //$eTag is an MD5 of $currentLanguage + $lastModified
if ($isXML)
    header('Vary: Accept'); //$documentMimeType can be either application/xhtml+xml or text/html for XHTML (based on $_SERVER['HTTP_ACCEPT'])
header('Last-Modified: ' . $lastModified);
header('Content-Language: ' . $currentLanguage);
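
As a minimal sketch of how the $lastModified and $eTag values used above could be produced: the helper names httpDate and buildETag are hypothetical, and the modification time is assumed to be available as a Unix timestamp.

```php
<?php
// Hypothetical helper: format a Unix timestamp as an HTTP-date (always expressed in GMT).
function httpDate($timestamp)
{
    return gmdate('D, d M Y H:i:s', $timestamp) . ' GMT';
}

// Hypothetical helper: MD5 of language + Last-Modified, as described in the ETag comment above.
function buildETag($currentLanguage, $lastModified)
{
    return md5($currentLanguage . $lastModified);
}

$currentLanguage = 'en';              // assumed value for illustration
$lastModified = httpDate(784111777);  // Sun, 06 Nov 1994 08:49:37 GMT
$eTag = buildETag($currentLanguage, $lastModified);
```

Sending Last-Modified as a real HTTP-date matters because clients normally echo that exact value back in if-modified-since.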

Also, every page has its own URL in each language. For example, "index.php" will be served under the URL "/en/home" in English and "/fr/accueil" in French.

My big problem was answering "304 Not Modified" to if-none-match and if-modified-since HTTP requests only when appropriate.

The best doc I've found is: http://rithiur.anthd.com/tutorials/conditionalget.php

This is my implementation of it (this piece of code is called as early as possible on pages that can be cached):

$ifNoneMatch = array_key_exists('HTTP_IF_NONE_MATCH', $_SERVER) ? $_SERVER['HTTP_IF_NONE_MATCH'] : false;
$ifModifiedSince = array_key_exists('HTTP_IF_MODIFIED_SINCE', $_SERVER) ? $_SERVER['HTTP_IF_MODIFIED_SINCE'] : false;

if ($ifNoneMatch !== false && $ifModifiedSince !== false)
{
    //Both if-none-match and if-modified-since were received.
    //They must match the document values in order to send an HTTP 304 answer.
    if ($ifNoneMatch == $eTag && $ifModifiedSince == $lastModified)
    {
        header('HTTP/1.1 304 Not Modified');
        exit();
    }
}
else
{
    //Only one header was received; if it matches the document value, send an HTTP 304 answer.
    if (($ifNoneMatch !== false && $ifNoneMatch == $eTag) || ($ifModifiedSince !== false && $ifModifiedSince == $lastModified))
    {
        header('HTTP/1.1 304 Not Modified');
        exit();
    }
}

My question is twofold:

  • Is this the correct way to do it? That is, when both if-none-match and if-modified-since are sent, must both match to answer with a 304, and when only one of the two is sent, is matching that one enough to send a 304?
  • In the context described here, are these two snippets public-cache friendly (that is, cache friendly for both proxies and Web browsers)?

BTW, I use PHP 5.1.0+ only (I don't support versions lower than that).

Edit: Added a bounty. I expect a quality answer, so please don't answer or vote if you are only guessing!

+15  A: 
  • It's not quite correct. Please take a look at the algorithm: [flowchart image]
  • The solution is proxy-friendly. You may use Cache-Control: proxy-revalidate to force caches to obey any freshness information you give them about a resource (this directive only applies to shared, i.e. proxy, caches).
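
As a rough illustration of the second point, the directive can be appended to the Cache-Control value the page already sends. The helper name buildCacheControl is hypothetical, and the 180 minutes simply mirror the session_cache_expire() call in the question:

```php
<?php
// Hypothetical helper: build a Cache-Control value for a publicly cacheable page.
// $expireMinutes mirrors the session_cache_expire() argument from the question.
function buildCacheControl($expireMinutes)
{
    $maxAge = $expireMinutes * 60; // max-age is expressed in seconds
    return 'public, max-age=' . $maxAge . ', proxy-revalidate';
}

// By default header() replaces any earlier header with the same name.
header('Cache-Control: ' . buildCacheControl(180));
```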

Here is a function that might help:

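// Returns false (meaning a 304 can be sent) when either validator matches.
// $mtime is a Unix timestamp. $etag is compared against the raw header value,
// so it must include the surrounding quotes; lists of tags and "*" are not handled.
function isModified($mtime, $etag) {
    return !( (
        isset($_SERVER['HTTP_IF_MODIFIED_SINCE'])
        && 
        strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) >= $mtime
    ) || (
        isset($_SERVER['HTTP_IF_NONE_MATCH'])
        && 
        $_SERVER['HTTP_IF_NONE_MATCH'] == $etag
    ) ) ;
}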

I suggest that you take a look at the following article: http://www.peej.co.uk/articles/http-caching.html

Update:

[AlexV] Is it even possible to receive if-none-match AND if-modified-since at the same time?

You can definitely have both set. However:

If none of the entity tags match, then the server MAY perform the requested method as if the If-None-Match header field did not exist, but MUST also ignore any If-Modified-Since header field(s) in the request. That is, if no entity tags match, then the server MUST NOT return a 304 (Not Modified) response.

RFC2616 #14.26

Example values (W stands for 'weak'; read more in RFC2616 #13.3.3):

If-None-Match: "xyzzy", "r2d2xxxx", "c3piozzzz"
If-None-Match: W/"xyzzy", W/"r2d2xxxx", W/"c3piozzzz"
If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT
If-None-Match: *

As a special case, the value "*" matches any current entity of the resource.
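
The rule quoted above can be sketched as follows. This is only an illustration under stated assumptions: the function names parseIfNoneMatch and isNotModified are hypothetical, $etag is passed without quotes, $mtime is a Unix timestamp, and the raw header values are passed in as arguments (null when absent). The sketch lets If-None-Match decide on its own whenever it is present, which is the simpler rule later adopted by RFC 7232, and it drops the W/ prefix, which amounts to a weak comparison.

```php
<?php
// Extract the opaque tags from an If-None-Match value such as
// '"xyzzy", W/"r2d2xxxx"'. The weak prefix W/ is ignored.
function parseIfNoneMatch($headerValue)
{
    $tags = array();
    if (preg_match_all('/(?:W\/)?"([^"]*)"/', $headerValue, $matches)) {
        $tags = $matches[1];
    }
    return $tags;
}

// Returns true when a 304 Not Modified response is appropriate.
function isNotModified($etag, $mtime, $ifNoneMatch, $ifModifiedSince)
{
    if ($ifNoneMatch !== null) {
        // "*" matches any current entity of the resource.
        if (trim($ifNoneMatch) === '*') {
            return true;
        }
        // If no tag matches, If-Modified-Since is ignored and no 304 is sent.
        return in_array($etag, parseIfNoneMatch($ifNoneMatch), true);
    }
    if ($ifModifiedSince !== null) {
        $since = strtotime($ifModifiedSince);
        return $since !== false && $since >= $mtime;
    }
    return false; // unconditional request
}
```

For example, isNotModified('xyzzy', $mtime, '"abc"', 'Sat, 29 Oct 1994 19:43:31 GMT') returns false even when the date would match, because the entity tag does not.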

St.Woland
Note also what RFC2616 (HTTP1.1) spec section 14.26 says about the syntax of If-None-Match. Neither this nor the original question allowed for multiple etag values or "*". That being said, this assumption is probably right 99.9% of the time.
MZB
@Mike Bell: Yeah, I know that ETag can contain many values or *, but I don't know of any browser using this yet... I wonder if any system actually uses this. BTW, how do you parse ETag when there is more than one value in it? What does * mean in ETag?
AlexV
@St.Woland: I read the article provided and looked at your graph (nice work), and it seems to correspond to the "else" part in my 2nd snippet (am I right?). What I find odd is that when BOTH if-none-match and if-modified-since are received, only 1 of the 2 must match to reply with a 304... Is it even possible to receive if-none-match AND if-modified-since at the same time?
AlexV
@AlexV: you can definitely have both set. I updated the answer, because there was too much for a comment.
St.Woland
I would expect a caching proxy to be able to present multiple ETag values, for different combinations of headers subject to Vary.
Andrew Aylett