What concerns, processes, and questions do you take into account when deciding when and how to cache? Is it always a no-win situation?
This presupposes you are stuck with a code base that has been optimized.
I would look at each feature of your website/application and decide for each feature:
I would personally go against caching whole pages in favour of caching sections of the website/application.
What language are you using? With ASP.NET, caching can be very easy: you add an attribute to the method and the value is cached for a set amount of time.
If you want more control over the cache, you can use a popular system like memcached and expire entries by time or by event.
First off, if your code is optimized as you said, you will only see noticeable performance benefits when the site is being hammered with a lot of requests.
However, it is faster to pull resources from RAM than from the disk, so your web server will be able to handle more requests if you have a caching strategy in place.
As for knowing when you're going to need caching, consider that even low end modern web servers can handle hundreds of requests per second, so unless you expect a decent amount of traffic, caching is probably something you can just skip.
Also, if you are pulling content from your database (for example, StackOverflow probably does this) caching can be very helpful because database operations are relatively expensive and can be a huge bottleneck in high-volume situations.
As for a scenario when it's not appropriate to cache or when caching becomes difficult... If you try to cache a dynamic page that, say, displays the current date and time, you will constantly see an old date/time unless you get a little more involved with your caching strategy. So that's something to think about.
Yahoo, for example, "versions" their JavaScript: your browser downloads code-1.2.3.js, and when a new version appears they reference the new file name. By doing this they can make their JavaScript cacheable for a very long time.
As for the general answer, I think it depends on your data and on how often it changes. For example, images don't change very often, but HTML pages do. The "About us" page doesn't change too often, but the news section does.
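A rough sketch of the versioned file name idea, in Python. The `/static/` path, the helper names, and the one-year lifetime are my own assumptions for illustration, not anything Yahoo documents.

```python
def versioned_name(filename: str, version: str) -> str:
    """Insert a version string before the extension: code.js -> code-1.2.3.js."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension, so just append the version
        return f"{filename}-{version}"
    return f"{stem}-{version}.{ext}"

def script_tag(filename: str, version: str) -> str:
    # "/static/" is a hypothetical URL prefix.
    return f'<script src="/static/{versioned_name(filename, version)}"></script>'

# A far-future lifetime is safe here because a new release gets a new URL.
LONG_LIVED_HEADERS = {"Cache-Control": "public, max-age=31536000"}

print(script_tag("code.js", "1.2.3"))
# <script src="/static/code-1.2.3.js"></script>
```

The old file can stay cached forever; browsers simply stop asking for it once your pages reference the new name.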
I have been working with DotNetNuke most recently for web applications and there are a number of things that I consider each time I implement caching solutions.
You can cache by time. This is useful for data that changes quickly. You can set the expiry to 30 seconds or 1 minute. Of course, this requires some traffic: the more traffic you have, the more you can play with the expiry time, because if you only get one visit every hour, that visit will populate the cache without ever using it.
You can cache by event: when your data changes, you update the cache. This is very useful if the data needs to be accurate for the user right away.
You can cache static content that you know won't change often. If you have a top 10 of the day that refreshes every day, then you can store it all in the cache and update it once a day.
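The first two options (by time and by event) could be sketched like this in Python. `TTLCache` is a hypothetical class written for this answer, not DotNetNuke's API, and the injectable clock is only there so the expiry can be demonstrated without waiting.

```python
import time
from typing import Any, Callable, Dict, Tuple

class TTLCache:
    """Hypothetical in-memory cache with time-based expiry and manual invalidation."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]              # still fresh: serve from memory
        value = loader()                 # missing or expired: reload
        self._store[key] = (now + self.ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Cache by event: call this whenever the underlying data changes."""
        self._store.pop(key, None)

# Demo with a fake clock so the 30-second expiry is visible immediately.
fake_now = [0.0]
loads = []

def load_top10() -> int:
    loads.append(1)                      # record each real load
    return len(loads)

cache = TTLCache(ttl_seconds=30, clock=lambda: fake_now[0])
cache.get("top10", load_top10)           # loads
cache.get("top10", load_top10)           # served from the cache
fake_now[0] = 31.0
cache.get("top10", load_top10)           # expired, loads again
cache.invalidate("top10")
cache.get("top10", load_top10)           # invalidated by an "event", loads again
print(len(loads))  # 3
```

For the daily top 10, the same class with a TTL of one day (or an `invalidate` call from the job that recomputes the list) would cover the third option.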
Where available, look out for whole object memory caching. In ASP.NET, this is a built-in feature where you can just plant your business logic objects in the IIS Application and access them from there.
This means you can store everything you need to generate a page in memory (persisting writes to database) and generate a page without ANY database IO.
You still need to use the page-building logic to generate the page, but you save a lot of time in getting the data.
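As a language-neutral illustration (this is plain Python, not the ASP.NET feature itself), the idea is roughly a write-through store: read everything once, serve pages from memory, and persist each write. `FakeDatabase`, `ArticleStore`, and `render_page` are hypothetical names.

```python
from typing import Dict, Optional

class FakeDatabase:
    """Hypothetical stand-in for a real database; counts its own IO."""

    def __init__(self) -> None:
        self.rows: Dict[int, str] = {}
        self.reads = 0
        self.writes = 0

    def load_all(self) -> Dict[int, str]:
        self.reads += 1
        return dict(self.rows)

    def save(self, key: int, value: str) -> None:
        self.writes += 1
        self.rows[key] = value

class ArticleStore:
    """Holds every article in memory; writes go to the database as well."""

    def __init__(self, db: FakeDatabase) -> None:
        self._db = db
        self._articles = db.load_all()       # one bulk read at startup

    def get(self, article_id: int) -> Optional[str]:
        return self._articles.get(article_id)  # no database IO

    def save(self, article_id: int, title: str) -> None:
        self._db.save(article_id, title)     # persist the write first
        self._articles[article_id] = title   # then update the in-memory copy

def render_page(store: ArticleStore, article_id: int) -> str:
    # Page-building logic still runs on every request; only the data is cached.
    return f"<h1>{store.get(article_id)}</h1>"

db = FakeDatabase()
store = ArticleStore(db)
store.save(1, "Caching")
render_page(store, 1)
render_page(store, 1)
print(db.reads, db.writes)  # 1 1
```

This only works as written when a single process owns the data; with several web servers you would need some way to keep their in-memory copies in sync.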
Other techniques involve localised output caching, where you capture the output before sending and save it to a file. This is great for static sections (like navigation on certain pages, or text bodies), which you then include when they're requested. Most implementations purge cached objects like this when a write happens or after a certain period of time.
Then there's the least "accurate": whole page caching. It's the highest performer but it's pretty useless unless you have very simple pages.
What kind of caching? Server side caching? Client side caching?
Client side caching is a no-brainer with certain things, like static HTML, SWFs and images. Figure out how often the assets are likely to change, and set up "Expires" headers as appropriate. (2 days? 2 weeks? 2 months?)
Dynamic pages, by definition, are a little harder to cache. There have been some explorations in caching of certain chunks using JavaScript (and degrading to IFrames if JS is not available). This, however, might be a little more difficult to retrofit into an existing site.
DB and application level caching may or may not work, depending on your situation. That really depends on where your bottlenecks are. Figuring out where your application spends the most time on page rendering is probably priority 1; then you can start looking at where and how to cache.
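For what it's worth, building those headers might look something like this in Python. `caching_headers` is a hypothetical helper, and sending `Cache-Control` alongside `Expires` is my addition; how you attach the headers depends on your server or framework.

```python
import time
from email.utils import formatdate
from typing import Dict, Optional

def caching_headers(max_age_seconds: int, now: Optional[float] = None) -> Dict[str, str]:
    """Build client-side caching headers for an asset that stays fresh for max_age_seconds."""
    if now is None:
        now = time.time()
    return {
        # Relative lifetime, understood by HTTP/1.1 clients.
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # Absolute expiry date in the HTTP date format (always GMT).
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }

TWO_DAYS = 2 * 24 * 60 * 60

# A fixed "now" keeps the example reproducible.
print(caching_headers(TWO_DAYS, now=0)["Expires"])
# Sat, 03 Jan 1970 00:00:00 GMT
```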
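One cheap way to find out where the time goes is to wrap each stage of page rendering in a timer. This is only a sketch: the stage names are made up, and the `time.sleep` calls stand in for real work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import DefaultDict, Iterator

# Accumulated seconds per stage across all requests.
timings: DefaultDict[str, float] = defaultdict(float)

@contextmanager
def timed(stage: str) -> Iterator[None]:
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] += time.perf_counter() - start

def render_page() -> str:
    with timed("database"):
        time.sleep(0.05)   # stand-in for running queries
    with timed("template"):
        time.sleep(0.01)   # stand-in for building the HTML
    return "<html>...</html>"

render_page()
slowest = max(timings, key=timings.get)
print(slowest)  # expected to be "database" here, since its stand-in sleeps longest
```

Whichever stage dominates is the first candidate for caching; if the database is not the slow part, DB level caching is unlikely to buy you much.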