If I set this for cache control on my site:

Header unset Pragma
FileETag None
Header unset ETag

# 1 YEAR
<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|swf|mp3|mp4)$">
Header set Cache-Control "public"
Header set Expires "Thu, 15 Apr 2010 20:00:00 GMT"
Header unset Last-Modified
</FilesMatch>

# 2 HOURS
<FilesMatch "\.(html|htm|xml|txt|xsl)$">
Header set Cache-Control "max-age=7200, must-revalidate"
</FilesMatch>

# CACHED FOREVER
# MOD_REWRITE TO RENAME EVERY CHANGE
<FilesMatch "\.(js|css)$">
Header set Cache-Control "public"
Header set Expires "Thu, 15 Apr 2010 20:00:00 GMT"
Header unset Last-Modified
</FilesMatch>

...then if I update any CSS, image or other file, will the user's browser still use the cached version until it expires (a year later)?

Thanks

+1  A: 

Yes, a response with an expiration date in the future will be considered as fresh until the expiration date:

The Expires entity-header field gives the date/time after which the response is considered stale. […]

The presence of an Expires header field with a date value of some time in the future on a response that otherwise would by default be non-cacheable indicates that the response is cacheable, unless indicated otherwise by a Cache-Control header field (section 14.9).

Note that an expiration date about one year in the future is the conventional way to mark a response as "never expires", and dates further out should not be sent:

To mark a response as "never expires," an origin server sends an Expires date approximately one year from the time the response is sent. HTTP/1.1 servers SHOULD NOT send Expires dates more than one year in the future.

So a cache that has stored the response will probably serve it directly, without first revalidating it with the origin server.

Now if you change a resource that is already stored in caches and still fresh, there is no way to invalidate them:

[…] although they might continue to be "fresh," they do not accurately reflect what the origin server would return for a new request on that resource.

There is no way for the HTTP protocol to guarantee that all such cache entries are marked invalid. For example, the request that caused the change at the origin server might not have gone through the proxy where a cache entry is stored.

This is why such never-expiring resources use a unique version number in the URL (e.g. style-v123.css) that is changed with each update. This is also what I recommend in this case.
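Since the question already mentions using mod_rewrite for renaming, here is a minimal sketch of how a versioned name such as style-v123.css could be mapped back to the real file. It assumes mod_rewrite is enabled, that the rule lives in the .htaccess of the directory holding the files, and that versions follow the "-v<digits>" pattern from the example above; the pattern itself is only an illustration, not something the question specifies.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# Assumed naming scheme: name-v<digits>.css or name-v<digits>.js
# style-v123.css is served from style.css; the version part is discarded
RewriteRule ^(.+)-v\d+\.(css|js)$ $1.$2 [L]
</IfModule>
```

With a rule like this only the references in the HTML need to change on each update; the file on disk keeps its name. Depending on how the directory is mapped, a RewriteBase directive may also be needed.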

By the way, declaring the response with Cache-Control as public doesn’t do anything in this case. This is only used when a response that required authorization should be cacheable:

public – Indicates that the response MAY be cached by any cache, even if it would normally be non-cacheable or cacheable only within a non-shared cache. (See also Authorization, section 14.8, for additional details.)


Gumbo
+1  A: 

Your CSS, JS and image files will effectively never be served from cache, because the Expires date you are setting is in the past, so every response is already stale when it arrives.

I assume this is a mistake and you intended a date a year in the future. This is one reason to favour max-age over Expires: max-age is relative to the time of the response, so it cannot silently slip into the past.
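As a rough sketch of what favouring max-age could look like for the static files in the question (assuming mod_headers is available, as the original config already relies on it), the fixed Expires date is replaced by a relative lifetime:

```apache
<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|swf|mp3|mp4|js|css)$">
# 31536000 seconds = 365 days, counted from the time each response is sent
Header set Cache-Control "max-age=31536000"
# Drop any absolute Expires date so a stale value cannot linger
Header unset Expires
</FilesMatch>
```

If mod_expires is enabled, a directive such as ExpiresDefault "access plus 1 year" is another way to get a lifetime that is computed per response instead of being hard-coded.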

If that was the intent, then your images will be cached for up to a year. A cache is allowed to drop an entry at any time, for example to clean out less frequently used entries and reduce the disk space the cache takes up.

There are two possible approaches to reducing the risk of staleness. One is to set a much lower expiry time and use ETags and modification dates, so that after the expiry time has passed the server can answer with a 304 if nothing has changed and needs to send only a few bytes rather than the entire entity.
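The first approach could be sketched roughly as follows. The one-hour lifetime and the list of extensions are assumptions chosen for illustration. Note that the config in the question removes the validators (FileETag None, Header unset ETag, Header unset Last-Modified), which would have to be dropped for 304 responses to work.

```apache
# Keep validators so caches can revalidate with conditional requests
FileETag MTime Size

<FilesMatch "\.(css|js|png|jpg|jpeg|gif)$">
# Fresh for one hour (illustrative value), then revalidate with the server
Header set Cache-Control "max-age=3600, must-revalidate"
# Deliberately no "Header unset ETag" or "Header unset Last-Modified" here
</FilesMatch>
```

Once the hour has passed, the browser sends If-None-Match or If-Modified-Since, and the server can reply with 304 Not Modified when the file is unchanged.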

The other is to keep the expiry at a year but change the URI whenever the resource changes. This can be useful for, e.g., a large file that is used on almost every page of your site. It requires that you update every reference to the resource when it changes (because you are essentially switching to a new resource), which can be fiddly, so it is only advised as an optimisation in a few hotspot cases.

If the server ignores the query string (e.g. the file is just served straight from disk), the browser has no way of knowing that. Hence you could use something like /scripts/bigScript.js?version=1.2.3 and then switch to /scripts/bigScript.js?version=1.2.4 when you change bigScript.js. The query string has no effect on how bigScript.js is served, but it causes the browser to fetch the file again because, for all it knows, it is a completely different resource.

Jon Hanna
Aha, so if I set expiry to a year or 'far future', and in case I have to change some of the cached files, I will just add a query string like `style.css?ver=031010` and it'll grab the new file? Is this cross-browser compatible?
Nimbuz
It's totally cross-browser compatible. Remember that for all the browser knows, that query string is used by the server - so it can't assume style.css?ver=020123 is the same as style.css?ver=031010 and has to get the file again. The server is using the same file (only the more recent version of course) and ignoring the query string, but it **could** be doing something else. It's only worth doing on heavily hit "library" files.
Jon Hanna
Great, just what I wanted to know. Thanks! Also, my static files (css, images) are on a different domain than where the HTML is hosted, so where does this htaccess go?
Nimbuz
Wherever they are. The .htaccess has to be on the server that actually sends the file in question. What I've said here applies to all web servers, though the mechanism for setting the headers differs, and the headers can also be overridden programmatically.
Jon Hanna