In Apache's mod_expires module, the Expires* directives accept two base times: access and modification.

ExpiresByType text/html "access plus 30 days"
understandably means that the cache will request fresh content after 30 days.

However,
ExpiresByType text/html "modification plus 2 hours"
doesn't make intuitive sense.



How does the browser cache know that the file has been modified unless it makes a request to the server? And if it is making a call to the server anyway, what is the use of this caching directive? It seems to me that I am not understanding some crucial part of caching. Please enlighten me.

A: 

My understanding is that modification asks the browser to base the cache time on the Last-Modified HTTP header's value. So, modification plus 2 hours would be the Last-Modified time + 2 hours.

NilObject
A: 

The server sends a header such as: "Last-Modified: Wed, 18 Feb 2009 00:00:00 GMT". The cache behaves based on either this header or the access time.

Say the content is expected to be refreshed every day; then you want it to expire at "modification plus 24 hours".

If you don't know when the content will be refreshed, then it's better to base it on the access time.
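
To make the difference between the two bases concrete, here is a minimal Python sketch. The function name `expiry_time` and the sample times are made up for illustration; this is not how mod_expires is implemented, only the arithmetic it describes.

```python
from datetime import datetime, timedelta, timezone

def expiry_time(base, modified, accessed, lifetime):
    """Return the expiration instant for a given base ('access' or 'modification')."""
    if base == "access":
        return accessed + lifetime
    if base == "modification":
        return modified + lifetime
    raise ValueError(f"unknown base: {base!r}")

# Hypothetical sample: file modified at midnight, requested at 20:00 the same day.
modified = datetime(2009, 2, 18, 0, 0, tzinfo=timezone.utc)
accessed = datetime(2009, 2, 18, 20, 0, tzinfo=timezone.utc)
day = timedelta(hours=24)

access_expiry = expiry_time("access", modified, accessed, day)
modification_expiry = expiry_time("modification", modified, accessed, day)

# Time the response stays fresh, counted from the moment of the request.
print(access_expiry - accessed)        # full 24 hours, whenever the request happens
print(modification_expiry - accessed)  # only what is left of the 24 hours since the change
```

With the access base every visitor gets the full lifetime; with the modification base all cached copies expire at the same instant, which fits content that is regenerated on a schedule.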

Andrew Vit
Hi Andrew, thanks for your answer. When and how often does the server send the Last-Modified header? Or does it happen during a browser session?
+7  A: 

An Expires* directive with "modification" as its base refers to the modification time of the file on the server. So if you set, say, "modification plus 2 hours", any browser that requests content within 2 hours after the file is modified (on the server) will cache that content until 2 hours after the file's modification time. And the browser knows when that time is because the server sends an Expires header with the proper expiration time.

Let me explain with an example: say your Apache configuration includes the line

ExpiresDefault "modification plus 2 hours"

and you have a file index.html, which the ExpiresDefault directive applies to, on the server. Suppose you upload a version of index.html at 9:53 GMT, overwriting the previously existing index.html (if there was one). So now the modification time of index.html is 9:53 GMT. If you were to run ls -l on the server (or dir on Windows), you would see it in the listing:

-rw-r--r--  1 apache apache    4096  Feb 18 09:53 index.html

Now, with every request, Apache sends the Last-Modified header with the last modification time of the file. Since you have that ExpiresDefault directive, it will also send the Expires header with a time equal to the modification time of the file (9:53) plus two hours. So here is part of what the browser sees:

Last-Modified: Wed, 18 Feb 2009 09:53:00 GMT
Expires: Wed, 18 Feb 2009 11:53:00 GMT
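
As a rough illustration of where those two values come from, the following Python sketch derives both headers from a file's modification time. The helper name `cache_headers` is hypothetical, and this only mimics the calculation described above; it is not Apache's code.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

def cache_headers(mtime, lifetime):
    """Build Last-Modified and Expires header values from a modification time."""
    expires = mtime + lifetime
    return {
        # usegmt=True renders the HTTP date form ending in "GMT"
        "Last-Modified": format_datetime(mtime, usegmt=True),
        "Expires": format_datetime(expires, usegmt=True),
    }

# index.html was uploaded at 9:53 GMT; the directive says "modification plus 2 hours".
mtime = datetime(2009, 2, 18, 9, 53, tzinfo=timezone.utc)
headers = cache_headers(mtime, timedelta(hours=2))

for name, value in headers.items():
    print(f"{name}: {value}")
```

On a real server the modification time would be read from the file itself (for example with `os.path.getmtime`) rather than hard-coded.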

If the time at which the browser makes this request is before 11:53 GMT, the browser will cache the page, because it has not yet expired. So if the user first visits the page at 11:00 GMT, and then goes to the same page again at 11:30 GMT, the browser will see that its cached version is still valid and will not (or rather, is allowed not to) make a new HTTP request.

If the user goes to the page a third time at 12:00 GMT, the browser sees that its cached version has now expired (it's after 11:53) so it discards the cache and requests a new version of the page. Of course, if you haven't changed the file in the meantime, Apache will send back the same page, with the same values for the Last-Modified and Expires headers - the modification time of the file hasn't changed. This time, though, the browser sees that the value of the Expires header is before the current time (11:53 < 12:00) so it doesn't cache the page at all.
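
The browser-side decision in this walkthrough can be sketched in a few lines of Python. The function `is_fresh` is a made-up name, and the check is deliberately simplified: it looks only at the Expires header, whereas real browsers also take other headers (such as Cache-Control) into account.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def is_fresh(expires_header, now):
    """Return True if a cached copy may still be used without contacting the server."""
    return now < parsedate_to_datetime(expires_header)

expires = "Wed, 18 Feb 2009 11:53:00 GMT"

# The three visits from the example: 11:00, 11:30 and 12:00 GMT.
visits = [
    datetime(2009, 2, 18, 11, 0, tzinfo=timezone.utc),
    datetime(2009, 2, 18, 11, 30, tzinfo=timezone.utc),
    datetime(2009, 2, 18, 12, 0, tzinfo=timezone.utc),
]
results = [is_fresh(expires, visit) for visit in visits]
print(results)  # the first two visits are served from cache, the third is not
```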

Now, let's pretend instead that you uploaded a new version of the page at 11:57. In this case, the last modification time of the file becomes 11:57, and Apache calculates the expiration time as 11:57 + 2:00 = 13:57 GMT. So now, when the browser requests a new version of the page at 12:00, it gets these two headers instead of the two listed above:

Last-Modified: Wed, 18 Feb 2009 11:57:00 GMT
Expires: Wed, 18 Feb 2009 13:57:00 GMT

And now it sees that the expiration time is greater than the current time (13:57 > 12:00) so it caches the page, and the cycle repeats...

(Note, of course, that many other things are sent along with the two headers I listed above; I just trimmed out the rest for simplicity.)

David Zaslavsky
Hi David, this makes sense; however, I am still not sure why and how the server knows what to send the browser. So if I understand correctly, the next time the browser requests the resource, the server somehow sends information to the browser about the file's modification status-- but isnt this a get
I figured this would be easiest to explain with an example, so I edited one in...
David Zaslavsky
awesome thanks for your time
A: 

So now I have a question about this. I have changed a .js file on my webpage, and I want every user who visits the page to request the new .js file and discard the old one from their cache. But I need this to happen only once per user. I mean:

When John comes to my page tomorrow, I want him to discard his cache and ask for the new .js file, but if he accesses the page again 5 or 8 hours later, I would like him NOT to request the file but to use the version in his cache.

How can I do it?

Mike