ansaurus

Question

Design Issue: Good or bad to make the cached entities mutable?

Answer 1

+3 A:

Look at it this way: if the entry is mutable, then it is likely that its hashcode will change when the object is mutated.

Depending on the dictionary implementation behind the cache, one of two things can happen:

the entry is 'lost' (it can no longer be found under its key)
in the worst case, the entire cache will need to be rehashed

There may be valid reasons why you want 'mutable hashcodes' but I cannot see a justification here. (I have only ever needed to do this once in the last 9 years).

It would be a lot easier just to remove and replace the entry you wish to be 'mutated'.
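
To make the 'lost' case concrete, here is a minimal C# sketch. It assumes the cache is backed by a `Dictionary<TKey, TValue>` and that the key's hashcode is derived from a mutable field. `MutablePageKey` and `LostEntryDemo` are hypothetical names invented for illustration; they are not from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type: its hash code depends on a mutable property.
public sealed class MutablePageKey
{
    public int Id { get; set; }

    public MutablePageKey(int id) { Id = id; }

    public override bool Equals(object obj)
    {
        var other = obj as MutablePageKey;
        return other != null && other.Id == Id;
    }

    public override int GetHashCode()
    {
        return Id;
    }
}

public static class LostEntryDemo
{
    // Mutating the key in place leaves the entry filed under its old hash code.
    public static bool FoundAfterInPlaceMutation()
    {
        var key = new MutablePageKey(1);
        var cache = new Dictionary<MutablePageKey, string> { { key, "page data" } };

        key.Id = 2;                      // the hash code changes here
        return cache.ContainsKey(key);   // the entry is still stored under hash 1
    }

    // Remove the entry first, then add it back under the new key.
    public static bool FoundAfterRemoveAndReplace()
    {
        var key = new MutablePageKey(1);
        var cache = new Dictionary<MutablePageKey, string> { { key, "page data" } };

        string value = cache[key];
        cache.Remove(key);
        cache.Add(new MutablePageKey(2), value);
        return cache.ContainsKey(new MutablePageKey(2));
    }
}
```

In this sketch `FoundAfterInPlaceMutation()` returns false, because the dictionary still has the entry filed under the old hashcode, while `FoundAfterRemoveAndReplace()` returns true.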

leppie 2010-09-29 07:35:04

Thanks leppie, and sorry for my poor description. My problem here is not about the hashcode; please see the "Update" for details. Thanks :)

Dylan Lin 2010-09-29 09:00:14

@Dylan Lin: Any worthwhile cache will use a hashtable of sorts internally. Having anything worse than O(1) lookups would be extremely inefficient.

leppie 2010-09-29 09:39:19

I'm assuming that @Dylan Lin means that the values will be mutable, rather than the key data, in which case the hashcode is not the problem, as it stays the same either way.

Jon Hanna 2010-09-29 10:17:01

@Jon Hanna, you're right. @leppie, I'm not going to develop a generic caching framework. What I want is to expose the full cached page tree (already retrieved from the caching framework) to the developers, so they can use it easily without having to concern themselves too much with the underlying cache API.

Dylan Lin 2010-09-30 03:40:08

Answer 2

+1 A:

First, I assume you mean the cached values may or may not be mutable, rather than the identifier they are looked up by. If you mean the identifier too, then I would be quite emphatic about it being immutable (emphatic enough to have my post flagged for obscene language).

As for mutable values, there is no one right answer here. You've hit on the primary pro and con either way, and there are multiple variants within each of the two options you describe. Cache invalidation is in general a notoriously difficult problem (as in the well-known quote from Phil Karlton, "There are only two hard problems in Computer Science: cache invalidation and naming things."*).

Some things to consider:

How often will changes be made? If changes are rare, refreshes become easy: dump the existing cache and let it rebuild.
Will the CMS be on multiple servers, or could it be in the future? If so, any invalidation information has to be shared between them.
How bad is stale data, and how soon is it bad? (Could you happily serve out-of-date values for the next hour or so, or would this conflict disastrously with fresh values?)
Does a revalidation approach make sense for you, where after a certain time a cached value is checked to be sure it is still valid, and the time-to-next-check is updated? (Alternatively, periodically dump old values and let them be retrieved from the fresh source again.)
How easy is detecting staleness in the first place? If it's hard, this can rule out some approaches.
How much does the cache actually save? Could you just get rid of it?
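
As a rough illustration of the revalidation idea in the list above, here is a minimal C# sketch. Everything in it is an assumption made for the example: the `RevalidatingCache` name, the `load` and `isValid` delegates, and the injectable clock are hypothetical, and locking is left out to keep it short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical revalidating cache; not thread-safe, for illustration only.
public sealed class RevalidatingCache<TKey, TValue>
{
    private sealed class Entry
    {
        public readonly TValue Value;
        public DateTime NextCheckUtc;

        public Entry(TValue value, DateTime nextCheckUtc)
        {
            Value = value;
            NextCheckUtc = nextCheckUtc;
        }
    }

    private readonly Dictionary<TKey, Entry> _entries = new Dictionary<TKey, Entry>();
    private readonly Func<TKey, TValue> _load;           // fetches a fresh value
    private readonly Func<TKey, TValue, bool> _isValid;  // staleness check
    private readonly TimeSpan _checkInterval;
    private readonly Func<DateTime> _utcNow;             // injectable clock

    public RevalidatingCache(Func<TKey, TValue> load,
                             Func<TKey, TValue, bool> isValid,
                             TimeSpan checkInterval,
                             Func<DateTime> utcNow)
    {
        _load = load;
        _isValid = isValid;
        _checkInterval = checkInterval;
        _utcNow = utcNow;
    }

    public TValue Get(TKey key)
    {
        DateTime now = _utcNow();
        Entry entry;
        if (_entries.TryGetValue(key, out entry))
        {
            // Still inside the trusted window: no check needed.
            if (now < entry.NextCheckUtc)
                return entry.Value;

            // Check is due: if the value is still valid, push the next check out.
            if (_isValid(key, entry.Value))
            {
                entry.NextCheckUtc = now + _checkInterval;
                return entry.Value;
            }
        }

        // Missing or stale: fetch a fresh value and cache it.
        TValue fresh = _load(key);
        _entries[key] = new Entry(fresh, now + _checkInterval);
        return fresh;
    }
}
```

How cheap `isValid` is compared with `load` decides whether this is worth doing at all; if detecting staleness costs as much as reloading, the periodic dump is simpler.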

I haven't mentioned threading issues, because the threading issues are difficult with any sort of cache unless you're single-threaded (and if it's a CMS I'm guessing it's web, and hence inherently multi-threaded). One thing I will say on the matter is that a cache failure generally isn't critical (by definition, cache failure has a fallback: get the fresh value). For this reason it can be fruitful to take an approach where, rather than blocking indefinitely on the monitor (which is what lock does internally), you use Monitor.TryEnter with a timeout, and have the cache operation fail if the timeout is hit. Using a ReaderWriterLockSlim and allowing a slightly longer timeout for writing can be a good approach. This way, if you hit a point of heavy lock contention, the cache will stop working for some threads, but those threads still get usable data. This will suck for performance on those threads, but not as much as lock contention would for all affected threads, and caches are a place where it is very easy to introduce lock contention into a web project that only hits once you've gone live.
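
A minimal C# sketch of that timeout approach, using the `ReaderWriterLockSlim` variant. The `TimeoutCache` class, its method names, and the timeout values are hypothetical and chosen only for illustration; the point is that a lock timeout is treated as a cache miss rather than as an error.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical cache wrapper: gives up on the cache rather than blocking on it.
public sealed class TimeoutCache<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> _items = new Dictionary<TKey, TValue>();
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly TimeSpan _readTimeout;
    private readonly TimeSpan _writeTimeout;  // assumed slightly longer than reads

    public TimeoutCache(TimeSpan readTimeout, TimeSpan writeTimeout)
    {
        _readTimeout = readTimeout;
        _writeTimeout = writeTimeout;
    }

    // Returns the cached value if it can be read in time; otherwise loads fresh.
    public TValue GetOrLoad(TKey key, Func<TKey, TValue> loadFresh)
    {
        if (_lock.TryEnterReadLock(_readTimeout))
        {
            try
            {
                TValue cached;
                if (_items.TryGetValue(key, out cached))
                    return cached;
            }
            finally
            {
                _lock.ExitReadLock();
            }
        }

        // Cache miss, or the read lock timed out: fall back to the fresh source.
        TValue fresh = loadFresh(key);

        if (_lock.TryEnterWriteLock(_writeTimeout))
        {
            try
            {
                _items[key] = fresh;
            }
            finally
            {
                _lock.ExitWriteLock();
            }
        }

        // If the write lock timed out, the value simply isn't cached this time;
        // the caller still gets usable data.
        return fresh;
    }
}
```

Usage would look something like `cache.GetOrLoad(pageId, id => LoadPageFromDatabase(id))`, where `LoadPageFromDatabase` stands in for whatever the fresh source is.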

*(and of course the well known variant, "there are only two hard problems in Computer Science: cache invalidation, naming things, and off-by-one errors").

Jon Hanna 2010-09-29 10:38:46

Thanks Jon, your answer is useful :) But in this post, my foremost concern is how to design the page tree API for the plugin authors. Making the tree mutable may bring us subtle bugs; making it immutable may make it hard for us to reset the properties (because there are no setters outside the PageCacheItem now).

Dylan Lin 2010-09-30 04:00:27

Yes, that's true. There isn't a magic silver-bullet solution here, and you're going to have to balance the pros and cons and make a choice, but neither will be perfect. Possibly, too, the best option now won't be the best option down the line, as features are added and usage patterns change.

Jon Hanna 2010-09-30 07:28:04
