views:

3977

answers:

5

I have some places where implementing some sort of cache might be useful. For example in cases of doing resource lookups based on custom strings, finding names of properties using reflection, or to have only one PropertyChangedEventArgs per property name.

A simple example of the last one:

public static class Cache
{
    private static Dictionary<string, PropertyChangedEventArgs> cache;
    static Cache()
    {
        cache = new Dictionary<string, PropertyChangedEventArgs>();
    }
    public static PropertyChangedEventArgs GetPropertyChangedEventArgsa(string propertyName)
    {
        if (cache.ContainsKey(propertyName))
            return cache[propertyName];

        return cache[propertyName] = new PropertyChangedEventArgs(propertyName);
    }
}

But, will this work well? For example if we had a whole load of different propertyNames, that would mean we would end up with a huge cache sitting there never being garbage collected or anything. I'm imagining if what is cached are larger values and if the application is a long-running one, this might end up as kind of a problem... or what do you think? How should a good cache be implemented? Is this one good enough for most purposes? Any examples of some nice cache implementations that are not too hard to understand or way too complex to implement?

+1  A: 

This is a nice debate to have, but depending your application, here's some tips:

You should define the max size of the cache, what to do with old items if your cache is full, have a scavenging strategy, determine a time to live of the object in the cache, does your cache can/must be persisted somewhere else that memory, in case of application abnormal termination, ...

Fabian Vilers
What is a scavenging strategy?
Svish
The scavenging algorithm is responsible of removing items from the cache to free cache storage resources.
Fabian Vilers
Ah, ok. That makes sense.
Svish
+6  A: 

This is a large problem, you need to determine the domain of the problem and apply the correct techniques. For instance, how would you describe the expiration of the objects? Do they become stale over a fixed interval of time? Do they become stale from an external event? How frequently does this happen? Additionally, how many objects do you have? Finally, how much does it cost to generate the object?

The simplest strategy would be to do straight memoization, as you have above. This assumes that objects never expire, and that there are not so many as to run your memory dry and that you think the cost to create these objects warrants the use of a cache to begin with.

The next layer might be to limit the number of objects, and use an implicit expiration policy, such as LRU (least recently used). To do this you'd typically use a doubly linked list in addition to your dictionary, and every time an objects is accessed it is moved to the front of the list. Then, if you need to add a new object, but it is over your limit of total objects, you'd remove from the back of the list.

Next, you might need to enforce explicit expiration, either based on time, or some external stimulus. This would require you to have some sort of expiration event that could be called.

As you can see there is alot of design in caching, so you need to understand your domain and engineer appropriately. You did not provide enough detail for me to discuss specifics, I felt.

P.S. Please consider using Generics when defining your class so that many types of objects can be stored, thus allowing your caching code to be reused.

Adam Luter
+1 for mentioning a lot of factors and suggesting basic solutions.
SealedSun
I kind of left out spesifics just because I hoped I would get more general advice on what to think about etc. So thank you for that :) On using generics, do you mean to swap out Cache with Cache<T> and swap out all the PropertyChangedEventArgs with T as well? Or something more clever? Not all other objects makes sense to identify by a string I would think...
Svish
Just basic generics yes, you may want to use the same pattern as Dictionary<K,V> and let the user of your cache determine the type of K and V.
Adam Luter
+7  A: 

You could wrap each of your cached items in a WeakReference. This would allow the GC to reclaim items if-and-when required, however it doesn't give you any granular control of when items will disappear from the cache, or allow you to implement explicit expiration policies etc.

(Ha! I just noticed that the example given on the MSDN page is a simple caching class.)

LukeH
Cool! What exactly does the WeakReference do in this case?
Svish
This page has a good explanation: http://msdn.microsoft.com/en-us/library/ms404247.aspx
LukeH
In practice, I find that the .NET GC is so efficient/aggressive that weak references don't hang around for long, meaning that they're only really suitable for data that's short-lived and/or inexpensive to recreate.
LukeH
Yes I didn't suggest WeakReference explicitly because you have no control on when the objects expire. Thus only good when they take up alot of memory and are cheap to make -- which seems opposite to most caches.
Adam Luter
+3  A: 

This is a common problem that has many solutions depending on your application need. It is so common that Microsoft released a whole library to address it. You should check out Microsoft Velocity before rolling up your own cache. http://msdn.microsoft.com/en-us/data/cc655792.aspx Hope this help.

mfawzymkh
+1 - This library looks like it has its roots in the Caching Application Block from the enterprise library.
Andras Zoltan
A: 

I developed a .NET Caching Library that will automatically expire items in memory and in the secondary storage provider (a file, database, etc). It will do smart client side caching or distributed caching.

http://www.kellermansoftware.com/p-38-net-caching-library.aspx

Greg Finzer
Sure its commercial, but I think it will nicely solve all of the problems mentioned above. Why shouldn't Greg post this? +1 vote.
Gravitas