[EDIT]
The new Reactive Framework solves the problem outlined below, using the System.Linq.EnumerableEx.MemoizeAll()
extension method.
Internally, MemoizeAll()
uses a System.Linq.EnumerableEx.MemoizeAllEnumerable<T>
(found in the System.Interactive assembly), which is similar to my ThreadSafeCachedEnumerable<T>
(sorta).
Here's an awfully contrived example that prints the contents of an Enumerable (numbers 1-10) very slowly, then quickly prints the contents a second time (because it cached the values):
// Create an Enumerable<int> containing numbers 1-10, using Thread.Sleep() to simulate work
var slowEnum = EnumerableEx.Generate(1, currentNum => (currentNum <= 10), currentNum => currentNum, previousNum => { Thread.Sleep(250); return previousNum + 1; });
// This decorates the slow enumerable with one that will cache each value.
var cachedEnum = slowEnum.MemoizeAll();
// Print the numbers
foreach (var num in cachedEnum.Repeat(2))
{
Console.WriteLine(num);
}
[/EDIT]
Hello multi-threading gurus,
I created the ThreadSafeCachedEnumerable class intending to increase performance where long running queries where being reused. The idea was to get an enumerator from an IEnumerable and add items to a cache on each call to MoveNext(). The following is my current implementation:
/// <summary>
/// Wraps an IEnumerable<T> and provides a thread-safe means of caching the values."/>
/// </summary>
/// <typeparam name="T"></typeparam>
class ThreadSafeCachedEnumerable<T> : IEnumerable<T>
{
// An enumerator from the original IEnumerable<T>
private IEnumerator<T> enumerator;
// The items we have already cached (from this.enumerator)
private IList<T> cachedItems = new List<T>();
public ThreadSafeCachedEnumerable(IEnumerable<T> enumerable)
{
this.enumerator = enumerable.GetEnumerator();
}
#region IEnumerable<T> Members
public IEnumerator<T> GetEnumerator()
{
// The index into the sequence
int currentIndex = 0;
// We will break with yield break
while (true)
{
// The currentIndex will never be decremented,
// so we can check without locking first
if (currentIndex < this.cachedItems.Count)
{
var current = this.cachedItems[currentIndex];
currentIndex += 1;
yield return current;
}
else
{
// If !(currentIndex < this.cachedItems.Count),
// we need to synchronize access to this.enumerator
lock (enumerator)
{
// See if we have more cached items ...
if (currentIndex < this.cachedItems.Count)
{
var current = this.cachedItems[currentIndex];
currentIndex += 1;
yield return current;
}
else
{
// ... otherwise, we'll need to get the next item from this.enumerator.MoveNext()
if (this.enumerator.MoveNext())
{
// capture the current item and cache it, then increment the currentIndex
var current = this.enumerator.Current;
this.cachedItems.Add(current);
currentIndex += 1;
yield return current;
}
else
{
// We reached the end of the enumerator - we're done
yield break;
}
}
}
}
}
}
#endregion
#region IEnumerable Members
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return this.GetEnumerator();
}
#endregion
}
I simply "lock (this.enumerator)" when the no more items appear to be in the cache, just in case another thread is just about to add another item (I assume that calling MoveNext() on this.enumerator from two threads is a bad idea).
The performance is great when retrieving previously cached items, but it starts to suffer when getting many items for the first time (due to the constant locking). Any suggestions for increasing the performance?
Thanks!