I have a PHP file that will return the same thing with the same $_GET parameters every time -- it's deterministic.

Unfortunately for efficiency (this file is requested very often), Apache defaults to a "200 OK" response whenever a PHP page is requested, making the user download the file again.

Is there any way to send a 304 Not Modified header if and only if the parameters are the same?

Bonus: Can I set an expiry time on it, so that if the cached page is more than, say, three days old, it sends a "200 OK" response?

A: 

Have you tried header("HTTP/1.0 304 Not Modified"); in the PHP code that is being called? If you are unfamiliar with it, note that the call has to come BEFORE the script sends any output.

http://php.net/manual/en/function.header.php

manyxcxi
Should it be called every time the script is accessed?
Col. Shrapnel
+1  A: 

In general, you return HTTP status codes using the header() function:

Header("HTTP/1.1 304 Not Modified");
exit();

However, this alone isn't enough.

The problem is that you don't know who requested the file, so you'll need a bit of browser cooperation.

You can look for an If-Modified-Since header in the incoming request, and return the appropriate status code if it's present and within the date range.

If you send a proper "Expires" header when you initially generate the page, then the browser or proxy cache may decide not to make the request at all (although more likely, they'll set the If-Modified-Since header). Without an Expires header, the browser will likely always retry the full request.

For more information, see http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html and search for "14.25"

The browser will do the mapping of GET parameters to cached copy, btw. You don't need to do any work there.
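
A minimal sketch of the If-Modified-Since check described above, using the three-day lifetime from the question's bonus. The function name conditionalGet and its parameters are hypothetical, chosen for illustration. It assumes the page is stamped with the time it was generated and that any copy younger than the lifetime is still good:

```php
<?php
// Hypothetical helper: decides the status and headers for a conditional GET.
// $ifModifiedSince: raw If-Modified-Since request header, or null if absent.
// $now:             current Unix timestamp.
// $maxAge:          how long a client copy stays valid, in seconds.
function conditionalGet(?string $ifModifiedSince, int $now, int $maxAge): array
{
    $since = $ifModifiedSince !== null ? strtotime($ifModifiedSince) : false;

    // The client's copy is recent enough: tell it to keep using that copy.
    if ($since !== false && $since <= $now && $now - $since < $maxAge) {
        return ['status' => 304, 'headers' => []];
    }

    // No copy, an unparsable date, or a copy that is too old: send a fresh page.
    return ['status' => 200, 'headers' => [
        'Last-Modified: ' . gmdate('D, d M Y H:i:s', $now) . ' GMT',
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $maxAge) . ' GMT',
    ]];
}

// Usage at the top of the script (skipped when run from the command line).
if (PHP_SAPI !== 'cli') {
    $result = conditionalGet(
        $_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? null,
        time(),
        3 * 24 * 60 * 60 // three days
    );
    foreach ($result['headers'] as $line) {
        header($line);
    }
    if ($result['status'] === 304) {
        http_response_code(304);
        exit;
    }
    // ... generate and echo the page as usual ...
}
```

Keeping the decision in a function that only returns data makes it easy to check without a web server; the header() calls stay in the few lines at the bottom.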

Jon Watte
+1  A: 

Google for the PHP conditional get.

Col. Shrapnel
+2  A: 

Without caching the page yourself (or at least its ETag) you cannot really make use of the 304. A full-fledged caching algorithm is somewhat out of scope, but the general idea is:

<?php 
function getUrlEtag($url){
    //some logic to get an etag, possibly stored in memcached / database / file etc.
    return ''; //stub: no etag stored yet
}
function setUrlEtag($url,$etag){
    //some logic to store an etag, possibly in memcached / database / file etc.
}
function getPageCache($url,$etag=''){
    //[optional]some logic to get the page from cache instead, possibly not even using etag
    return false; //stub: cache miss
}
function setPageCache($url,$content,$etag=''){
    //[optional]some logic to save the page to cache, possibly not even using etag
}
ob_start();
$etag = getUrlEtag($_SERVER['REQUEST_URI']);
if(isset($_SERVER['HTTP_IF_NONE_MATCH']) && trim($_SERVER['HTTP_IF_NONE_MATCH']) == $etag) { 
    header("HTTP/1.1 304 Not Modified"); 
    exit; 
}
if(($content=getPageCache($_SERVER['REQUEST_URI'],$etag))!==false){
    echo $content;
    exit;
}
?>
//the actual page
<?php
$content = ob_get_clean();
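//an ETag value is a quoted string; store it quoted so it compares equal to If-None-Match
$etag = '"'.md5($_SERVER['REQUEST_URI'].$content).'"';
setUrlEtag($_SERVER['REQUEST_URI'],$etag);
setPageCache($_SERVER['REQUEST_URI'],$content,$etag);
header("ETag: $etag");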
echo $content;
?>

All the usual pitfalls apply: you probably should not serve cached pages to logged-in users, caching partial content may be more desirable, you are responsible for preventing stale content in the cache (possibly using triggers in the backend or database on modifications, or by adjusting the getUrlEtag logic), and so on.

You could also play around with HTTP_IF_MODIFIED_SINCE if that's easier to control.
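
Since the question says the output depends only on the $_GET parameters, a simpler variant is possible: derive the ETag from the parameters and skip the cache lookups entirely. The sketch below is only an illustration of that idea. The helper names etagForParams and etagMatches and the $version string (to be changed by hand whenever the script's logic changes) are hypothetical, and it assumes tags contain no commas, which holds for md5 output:

```php
<?php
// Hypothetical helper: builds a quoted ETag from the request parameters alone.
function etagForParams(array $params, string $version = 'v1'): string
{
    ksort($params); // the same parameters in a different order give the same tag
    return '"' . md5($version . '?' . http_build_query($params)) . '"';
}

// Hypothetical helper: true when the If-None-Match header lists this ETag.
function etagMatches(?string $ifNoneMatch, string $etag): bool
{
    if ($ifNoneMatch === null) {
        return false;
    }
    if (trim($ifNoneMatch) === '*') {
        return true;
    }
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);
        // If-None-Match uses weak comparison, so ignore a leading W/ marker.
        if (strncmp($candidate, 'W/', 2) === 0) {
            $candidate = substr($candidate, 2);
        }
        if ($candidate === $etag) {
            return true;
        }
    }
    return false;
}

// Usage at the top of the script (skipped when run from the command line).
if (PHP_SAPI !== 'cli') {
    $etag = etagForParams($_GET);
    header('ETag: ' . $etag);
    if (etagMatches($_SERVER['HTTP_IF_NONE_MATCH'] ?? null, $etag)) {
        http_response_code(304);
        exit;
    }
    // ... generate and echo the page as usual ...
}
```

Nothing has to be stored on the server here, because the tag can be recomputed from the request itself on every hit.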

Wrikken
This seems like the general idea to me.
pyrony
Though for the OP it is not necessary to cache anything. Given his conditions, he can just compute an ETag based on the id.
Col. Shrapnel
Yeah, the actual implementation can be as simple or as elaborate as the situation needs. I tried to illustrate that, and yet I fell into the pitfall of providing an implementation; the md5() over the content shouldn't be there :P
Wrikken