I use PHP to generate dynamic web pages. According to the tutorial linked below, XHTML documents should be served with the MIME type "application/xhtml+xml" when $_SERVER['HTTP_ACCEPT'] allows it. Since the same page can then be served with two different MIME types ("application/xhtml+xml" and "text/html"), you should set the "Vary" HTTP header to "Accept". This is supposed to help caching on proxies.

Link: http://keystonewebsites.com/articles/mime%5Ftype.php

Now I'm not sure of the implications of header('Vary: Accept'); I don't really know what 'Vary: Accept' does precisely.

The only explanation I found is:

After the Content-Type header, a Vary header is sent to (if I understand it correctly) tell intermediate caches, like proxy servers, that the content type of the document varies depending on the capabilities of the client which requests the document. http://www.456bereastreet.com/archive/200408/content%5Fnegotiation/

Can anyone give me a "real" explanation of this header (with that value)? I think I understand things like Vary: Accept-Encoding, where the cache on proxies could be keyed on the encoding of the page served, but I don't understand Vary: Accept.

Can any HTTP guru help?

Thanks!

+4  A: 
  • The Cache-Control header is the primary mechanism for an HTTP server to tell a caching proxy the "freshness" of a response (i.e., whether and how long to store the response in the cache).

  • In some situations, Cache-Control directives are insufficient. A discussion from the HTTP working group is archived here, describing a page that changes only with language. This is not the correct use case for the Vary header, but the context is valuable for our discussion. (Although I believe the Vary header would solve the problem in that case, there is a Better Way.) From that page:

Vary is strictly for those cases where it's hopeless or excessively complicated for a proxy to replicate what the server would do.

  • This page describes the header usage from the server perspective, and this one from a caching proxy perspective. The Vary header is intended to specify the set of HTTP request headers that determine the uniqueness of a request.

A contrived example:

Your HTTP server has a large landing page. You have two slightly different pages with the same URL, depending on whether the user has been there before. You distinguish between requests, and track a user's "visit count", based on cookies. But -- since your server's landing page is so large, you want intermediary proxies to cache the response if possible.

The URL, Last-Modified and Cache-Control headers are insufficient to give this insight to a caching proxy, but if you add Vary: Cookie, the cache engine will add the Cookie header to its caching decisions.
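
To make the idea concrete, here is a minimal sketch of how a cache might fold the headers named in Vary into its lookup key. The function name cache_key, the "/landing" URL and the cookie values are made up for illustration; real proxies differ in the details.

```python
def cache_key(url, vary, request_headers):
    """Build a cache key from the URL plus the request headers named in Vary.

    `vary` is the value of the response's Vary header, e.g. "Cookie".
    This is an illustrative helper, not any real proxy's API.
    """
    # Header names are case-insensitive, so normalize them to lowercase.
    headers = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    # A header that is missing from the request is recorded as None.
    return (url, tuple((n, headers.get(n)) for n in names))


first_visit = cache_key("/landing", "Cookie", {"Cookie": "visits=0"})
returning = cache_key("/landing", "Cookie", {"Cookie": "visits=5"})
print(first_visit == returning)  # False: same URL, but two separate cache entries
```

With an empty Vary value the key collapses to the URL alone, which is the situation where the cache cannot tell the two pages apart.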

Finally, for low-traffic, dynamic web sites -- I have always found the simple Cache-Control: no-cache, no-store and Pragma: no-cache sufficient.

Edit -- to more precisely answer your question: the HTTP request header 'Accept' defines the Content-Types a client can process. If you have two copies of the same content at the same URL, differing only in Content-Type, then using Vary: Accept could be appropriate.
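
As a rough sketch of the server side of that case (written in Python rather than PHP, with a made-up function name negotiate), the server picks the Content-Type from the Accept header and sends Vary: Accept with both variants. The substring check is a deliberate simplification and ignores q-values.

```python
def negotiate(accept):
    """Return response headers for a page available as XHTML or HTML.

    `accept` is the raw Accept request header, or None if it was not sent.
    """
    # Simplification: a substring check that ignores q-values.
    # Real negotiation should parse the header and honor "q=0".
    wants_xhtml = accept is not None and "application/xhtml+xml" in accept.lower()
    content_type = "application/xhtml+xml" if wants_xhtml else "text/html"
    return {
        "Content-Type": content_type,
        # Sent with both variants, so caches know Accept was consulted.
        "Vary": "Accept",
    }


print(negotiate("text/html,application/xhtml+xml;q=0.9"))
print(negotiate("text/html"))
```

The Vary header goes on the text/html response too; otherwise a cache could store that variant without knowing it depends on Accept.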

J.J.
+3  A: 

Vary: Accept simply says that the response was generated based on the Accept header in the request. A request with a different Accept header might get a different response.

(You can see that the linked PHP code looks at $HTTP_ACCEPT. That's the value of the Accept request header.)

To HTTP caches, this means that the response must be cached with extra care. It is only going to be a valid match for later requests with exactly the same Accept header.
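
A simplified sketch of that matching rule, assuming the cache remembers the request headers that produced the stored response (can_reuse is a hypothetical name, and real caches may normalize header values before comparing):

```python
def can_reuse(vary, stored_request_headers, new_request_headers):
    """Decide whether a stored response may answer a new request for the same URL."""
    # "Vary: *" means the response depends on things the cache cannot see.
    if vary.strip() == "*":
        return False
    old = {k.lower(): v for k, v in stored_request_headers.items()}
    new = {k.lower(): v for k, v in new_request_headers.items()}
    for name in vary.split(","):
        name = name.strip().lower()
        # Every header named in Vary must have the same value in both requests.
        if name and old.get(name) != new.get(name):
            return False
    return True


stored = {"Accept": "text/html,application/xhtml+xml"}
print(can_reuse("Accept", stored, {"Accept": "text/html,application/xhtml+xml"}))  # True
print(can_reuse("Accept", stored, {"Accept": "text/html"}))  # False
```

Because browsers send many different Accept strings, a comparison this strict means Vary: Accept can split one page into many cache entries.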

Now this only matters if the page is cacheable in the first place. By default, PHP pages aren't. A PHP page can mark the output as cacheable by sending certain headers (Expires, for example). But whether and how to do that is a different question.

Jason Orendorff