views: 382, answers: 5

Okay, I know I asked about this before, and the answer was basically to cache data that doesn't change often.

Well, what does one do when at least 99.9% of the data changes?

In my project, the only tables that don't get updated (or won't get updated frequently) are the member profile info (name/address and settings).

So how does one still enable some kind of caching while making sure the data being viewed is up to date when changes are applied?

+7  A: 

I'd guess it's not really 99.9% of all data that changes, but rather that changes happen across 99.9% of all data locations.

For example, if you are running a bulletin board, that means that there will be a steady stream of new posts, but old posts will remain the same, and even old threads will stay unchanged for a long, long time.

In that case, you'll need a way to invalidate old cached data, so that you can build a cache entry as soon as a thread (in this example) is viewed. If one of these threads changes (e.g. when someone adds a new post), that single cached item is deleted or marked as outdated, so the next time it is viewed it will be rebuilt. Other items that haven't changed keep being served from the cache, though.
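
A minimal sketch of that invalidate-on-write approach, assuming a Python web app (thread_cache, fetch_thread_from_db and add_post are made-up names for illustration):

    # Cache keyed by thread id; entries are built lazily on first view
    # and thrown away only when that thread changes.
    thread_cache = {}

    def fetch_thread_from_db(thread_id):
        # Stand-in for the real database query.
        return "thread %s loaded from the database" % thread_id

    def get_thread(thread_id):
        # Build the cache entry the first time a thread is viewed,
        # and reuse it on every later view.
        if thread_id not in thread_cache:
            thread_cache[thread_id] = fetch_thread_from_db(thread_id)
        return thread_cache[thread_id]

    def add_post(thread_id, post):
        # ... write the new post to the database here ...
        # Then invalidate only the one affected thread; all other cached
        # threads stay valid and keep being served from the cache.
        thread_cache.pop(thread_id, None)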

BlaM
+1  A: 

If your data changes every time it's looked at, that defeats the point of caching - you should look for other types of optimization.

If it changes every other time it's looked at, then it might still not be worth caching - don't forget that storing something in a cache incurs some overhead of its own.

Greg
A: 

Hi,

In a single-instance web-application scenario, you could manually update the cache whenever an object changes, at the same time as you update the database.

Once in a while you could flush the cache to make sure nothing is out of sync (e.g. if the database was updated by some other application).

Note that this wouldn't work well for enterprise-scale scenarios.
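
As a rough sketch (in Python, and only for the single-instance case described above; profile_cache, save_profile_to_db and FLUSH_INTERVAL are hypothetical names), the write-through update plus the occasional flush could look like this:

    import time

    FLUSH_INTERVAL = 15 * 60      # flush the whole cache every 15 minutes
    profile_cache = {}            # member_id -> profile dict
    last_flush = time.monotonic()

    def save_profile_to_db(member_id, profile):
        pass                      # stand-in for the real UPDATE statement

    def save_profile(member_id, profile):
        # Write-through: update the database and the cache in the same step,
        # so reads served from the cache never see an older version.
        save_profile_to_db(member_id, profile)
        profile_cache[member_id] = profile

    def flush_if_due():
        # Called periodically (or on each request); clears the cache so it
        # resyncs with the database in case another application changed it.
        global last_flush
        if time.monotonic() - last_flush > FLUSH_INTERVAL:
            profile_cache.clear()
            last_flush = time.monotonic()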

JacobE
+1  A: 

It depends.

If you're willing to sacrifice some freshness (i.e. things have gotten really bad, performance-wise), you might want to consider caching the data for small intervals (10 secs, 30 secs, 1 min, ...) to decrease the load on your database. The data will not be the freshest, but it may just be fresh enough.
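
A rough sketch of that kind of short-interval caching, assuming Python; TTL_SECONDS and load_front_page are made-up names standing in for whatever query is actually expensive:

    import time

    TTL_SECONDS = 30              # how stale the cached copy is allowed to get
    _cache = {"value": None, "fetched_at": 0.0}

    def load_front_page():
        return "expensive query result"   # stand-in for the real database work

    def get_front_page():
        # Serve the cached copy for up to TTL_SECONDS, then refresh it.
        now = time.monotonic()
        if _cache["value"] is None or now - _cache["fetched_at"] > TTL_SECONDS:
            _cache["value"] = load_front_page()
            _cache["fetched_at"] = now
        return _cache["value"]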

If there's hardly any load, there's no need to start fiddling with the cache. Really, there's no need to find problems where they don't exist. After all, many database engines have their own caches, not only for the data but for the execution plan as well.

Omer van Kloeten
+1  A: 

The really relevant consideration is how often that 99.9% of the info actually changes: on every access, every second, every minute, every hour? Anything above once per access might make sense to cache, depending on the number of requests per time unit you get and the caching scheme you choose.

As long as the average time of retrieval diminishes, cache. You have to measure this and decide.

Two common cache validation techniques are to check validity on each access, or to store a timestamp and an expiry with the cached item. Checking validity on every access is more expensive, while timestamp + expiry might serve stale content in some cases.
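
As an illustration only (Python, with current_version_in_db and load_from_db as hypothetical stand-ins for real queries), the two techniques could look roughly like this:

    import time

    def current_version_in_db(key):
        return 1                      # stand-in: SELECT version FROM ... WHERE id = key

    def load_from_db(key):
        return ("row %s" % key, 1)    # stand-in: returns (value, version)

    def get_validated(cache, key):
        # Check validity on every access: even a cache hit costs one extra
        # (but cheap) version lookup against the database.
        entry = cache.get(key)
        if entry is not None and entry["version"] == current_version_in_db(key):
            return entry["value"]
        value, version = load_from_db(key)
        cache[key] = {"value": value, "version": version}
        return value

    def get_with_expiry(cache, key, ttl=60):
        # Timestamp + expiry: no extra lookup, but the entry may be stale
        # for up to ttl seconds after the underlying row changes.
        entry = cache.get(key)
        if entry is not None and time.monotonic() - entry["stored_at"] < ttl:
            return entry["value"]
        value, _version = load_from_db(key)
        cache[key] = {"value": value, "stored_at": time.monotonic()}
        return value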

Vinko Vrsalovic