views:

1148

answers:

2

I need to create an web module, with this module i need to fetch the title of some web site, after i will find that title i need to store that in thread safe caching mechanism and i need to save there the 10 lat fetched titles.

Any help please ?

+3  A: 

Writing some locking code would be fairly easy except for...

How do you want to retrieve it? Do you want to be able to enumerate (foreach) over the list in a thread-safe fashion? There are a number of different ways to do that part, each with trade-offs.

You could go with the default behavior This probably won't work well -- you'll get an exception if someone changes the list while you are enumerating it.

You could lock the collection during the whole course of the enumeration. This means that any thread attempting to add to your cache will be blocked until the foreach loop exits.

You could copy the collection internally each time you enumerate it and enumerate the copy. This means that if someone adds to your list while you are enumerating it, you won't "see" the change.

For a list of ten, I'd go with the last option. (copy internally).

You code would look something like this:


using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace Enumerator
{
    class Program
    {
        static void Main(string[] args)
        {
            MyCache<string> cache = new MyCache<string>();
            cache.Add("test");
            foreach (string item in cache)
                Console.WriteLine(item);
            Console.ReadLine();
        }
    }

    public class MyCache<T>: System.Collections.IEnumerable
    {
        private  readonly LinkedList<T> InternalCache = new LinkedList<T>();
        private  readonly object _Lock = new Object();

        public  void Add(T item)
        {
            lock (_Lock)
            {
                if (InternalCache.Count == 10)
                    InternalCache.RemoveLast();
                InternalCache.AddFirst(item);
            }
        }

        #region IEnumerable Members

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            // copy the internal cache to an array.  We'll really be enumerating that array
            // our enumeration won't block subsequent calls to Add, but this "foreach" instance won't see Adds either
            lock (_Lock)
            {
                T[] enumeration = new T[InternalCache.Count];
                InternalCache.CopyTo(enumeration, 0);
                return enumeration.GetEnumerator();
            }
        }

        #endregion

    }

}

<B>EDIT 1:</B> After sharing some comments with Rob Levine (below), I thought I'd throw a couple other alternatives out there.

This version allows you to iterate the collection lock-free. However, the Add() method is a little more expensive, as it must copy the list (moved the expense off of the Enumerate, and onto the add).


    public class Cache2<T>: IEnumerable<T>
    {
        // changes occur to this list, and it is copied to ModifyableList
        private LinkedList<T> ModifyableList = new LinkedList<T>();

        // This list is the one that is iterated by GetEnumerator
        private volatile LinkedList<T> EnumeratedList = new LinkedList<T>();

        private readonly object LockObj = new object();

        public void Add(T item)
        {
            // on an add, we swap out the list that is being enumerated
            lock (LockObj)
            {
                if (this.ModifyableList.Count == 10)
                    this.ModifyableList.RemoveLast();

                this.ModifyableList.AddFirst(item);
                this.EnumeratedList = this.ModifyableList;
                // the copy needs to happen within the lock, so that threaded calls to Add() remain consistent
                this.ModifyableList = new LinkedList<T>(this.ModifyableList);
            }

        }

        #region IEnumerable<T> Members

        IEnumerator<T> IEnumerable<T>.GetEnumerator()
        {
            IEnumerable<T> enumerable = this.EnumeratedList;
            return enumerable.GetEnumerator();
        }

        #endregion

        #region IEnumerable Members

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            System.Collections.IEnumerable enumerable = this.EnumeratedList;
            return enumerable.GetEnumerator();
        }

        #endregion
    }

<B>Edit 2:</B> In the last example, we had a really inexpensive iteration, with the trade-off being a more expensive call to Add(). Next, I thought about using a ReaderWriterLockSlim (this is a .Net 3.5 object -- the old ReaderWriterLock offered pretty poor performance)

With this model, the Add() method is less expensive than the previous model (although Add still has to take an exclusive lock). With this model, we don't have to create copies of the list. When we enumerate the list, we enter a readlock, which does not block other readers, but does block/is blocked by writers (calls to Add). As to which model is better -- it probably depends upon how you are using the cache. I would recommend Testing and measuring.


    public class Cache3<T> : IEnumerable<T>
    {
        private LinkedList<T> InternalCache = new LinkedList<T>();
        private readonly System.Threading.ReaderWriterLockSlim LockObj = new System.Threading.ReaderWriterLockSlim();

        public void Add(T item)
        {
            this.LockObj.EnterWriteLock();
            try
            {
                if(this.InternalCache.Count == 10)
                    this.InternalCache.RemoveLast();

                this.InternalCache.AddFirst(item);
            }
            finally
            {
                this.LockObj.ExitWriteLock();
            }
        }

        #region IEnumerable<T> Members

        IEnumerator<T> IEnumerable<T>.GetEnumerator()
        {
            this.LockObj.EnterReadLock();
            try
            {
                foreach(T item in this.InternalCache)
                    yield return item;
            }
            finally
            {
                this.LockObj.ExitReadLock();
            }
        }

        #endregion

        #region IEnumerable Members

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            this.LockObj.EnterReadLock();
            try
            {
                foreach (T item in this.InternalCache)
                    yield return item;
            }
            finally
            {
                this.LockObj.ExitReadLock();
            }
        }

        #endregion
    }


JMarsch
The trouble with this implementation is that it takes a lock on every call to GetEnumerator, and so is certainly not optimal in an ASP.NET environment. A better solution would be to change it so there is no lock on GetEnumerator(); have Add work on one copy of the LinkedList, and GetEnumerator add to another (copy). When Add completes, it should update GetEnumerator's linked list. This way, add always works on a seperate copy, and GetEnumerator does not require a lock.
Rob Levine
@Rob Levine: The implementation could get very interesting -- that last part: "When add completes, it should update GetEnumerator's linked list". If some one is enumerating at the time that Add completes, and you're not locking, you've corrupted state.Now, do note that while the implementation above does lock on get enum (with a likely lock convoy, no arguement), it copies the list, so you don't hold the lock for the duration of the for-each.
JMarsch
@Rob Levine: I was thinking about this some more. If, instead of modifying hte list that GetEnumerator reads, we swap it out alltogether, we can optimize for reads, and move the expense to the Add() method. Maybe this is what you meant in your orginal comment. If we do that, I think that we still need to lock in the Add method, and we move the copy operation to add. So enumerate gets cheaper, and add picks up the expense. I'll post code in another answer.
JMarsch
@Rob Levine: I was thinking about this even more (slow day, I dunno). My original code was written to keep the Adds cheaper -- If you are iterating the most recently visited links, site-wide, that list will change a lot. If we move the expense to the Add() (see my code in this answer -- I decided to keep it all in one answer), then the adds are expensive. That optimizes for fewer adds. But if we have fewer adds, then aren't we really better off using a ReaderWriterLock (particularly, the more efficient ReaderWriterLockSlim)?
JMarsch
A: 

Hi,

you might want to read up on this technique. Read-copy-update (RCU).

ChulioMartinez