There is no built-in "Linqy" way (you could group, but it would be pretty inefficient), but that doesn't mean you can't make your own way:
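public static IEnumerable<T> TakeDistinctByKey<T, TKey>(
    this IEnumerable<T> source,
    Func<T, TKey> keyFunc,
    int count)
{
    // This is an iterator, so these checks run when enumeration starts.
    if (source == null)
        throw new ArgumentNullException("source");
    if (keyFunc == null)
        throw new ArgumentNullException("keyFunc");
    if (count <= 0)
        yield break;

    // Operators == and != are unavailable for an unconstrained TKey,
    // so keys are compared through the default equality comparer.
    EqualityComparer<TKey> comparer = EqualityComparer<TKey>.Default;
    int distinctCount = 0;
    TKey lastKey = default(TKey);
    foreach (T item in source)
    {
        TKey key = keyFunc(item);
        // The first item, or any change of key, starts a new run of equal keys.
        if (distinctCount == 0 || !comparer.Equals(key, lastKey))
        {
            // Stop before yielding anything from key number count + 1.
            if (distinctCount == count)
                yield break;
            distinctCount++;
            lastKey = key;
        }
        yield return item;
    }
}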
Then you can invoke it with this:
var items = cache.TakeDistinctByKey(rec => rec.Id, 20);
If you have composite keys or anything like that, you could easily extend the method above to take an IEqualityComparer<TKey> as an argument.
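A minimal sketch of what such an overload might look like. The class name ComparerSketch and the Iterate helper are made up for illustration, and falling back to the default comparer when null is passed is an assumption, not a requirement:

```csharp
using System;
using System.Collections.Generic;

public static class ComparerSketch
{
    // Hypothetical overload: same scan over runs of equal keys,
    // but key equality is delegated to the supplied comparer.
    public static IEnumerable<T> TakeDistinctByKey<T, TKey>(
        this IEnumerable<T> source,
        Func<T, TKey> keyFunc,
        int count,
        IEqualityComparer<TKey> comparer)
    {
        // Validating here (outside the iterator) makes bad arguments fail immediately.
        if (source == null) throw new ArgumentNullException("source");
        if (keyFunc == null) throw new ArgumentNullException("keyFunc");
        // Assumption: a null comparer means "use the default one".
        return Iterate(source, keyFunc, count, comparer ?? EqualityComparer<TKey>.Default);
    }

    private static IEnumerable<T> Iterate<T, TKey>(
        IEnumerable<T> source,
        Func<T, TKey> keyFunc,
        int count,
        IEqualityComparer<TKey> comparer)
    {
        int distinctCount = 0;
        TKey lastKey = default(TKey);
        foreach (T item in source)
        {
            TKey key = keyFunc(item);
            // The first item, or a key the comparer considers different, starts a new run.
            if (distinctCount == 0 || !comparer.Equals(key, lastKey))
            {
                // Also covers count <= 0: nothing is yielded in that case.
                if (distinctCount >= count) yield break;
                distinctCount++;
                lastKey = key;
            }
            yield return item;
        }
    }
}
```

With a string key you could then pass something like StringComparer.OrdinalIgnoreCase as the last argument (rec.Name here is a hypothetical property): cache.TakeDistinctByKey(rec => rec.Name, 20, StringComparer.OrdinalIgnoreCase). The input still has to be ordered so that keys the comparer treats as equal sit next to each other.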
Also note that this depends on the elements being sorted by key, or at least on all elements with the same key being adjacent. If they aren't, you could either change the algorithm above to use a HashSet<TKey> instead of a straight count and last-item comparison, or invoke it with this instead:
var items = cache.OrderBy(rec => rec.Id).TakeDistinctByKey(rec => rec.Id, 20);
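For the HashSet<TKey> alternative, here is a rough sketch under a few assumptions: the method name TakeDistinctByKeyUnordered and the class name are invented for this example, and "first N keys" is taken to mean the first N distinct keys in encounter order, which is not necessarily the same set the OrderBy version picks.

```csharp
using System;
using System.Collections.Generic;

public static class UnorderedSketch
{
    // Hypothetical variant for input that is not grouped by key.
    public static IEnumerable<T> TakeDistinctByKeyUnordered<T, TKey>(
        this IEnumerable<T> source,
        Func<T, TKey> keyFunc,
        int count)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (keyFunc == null) throw new ArgumentNullException("keyFunc");
        return Iterate(source, keyFunc, count);
    }

    private static IEnumerable<T> Iterate<T, TKey>(
        IEnumerable<T> source,
        Func<T, TKey> keyFunc,
        int count)
    {
        if (count <= 0) yield break;

        // Holds the accepted keys; it never grows beyond count entries.
        var seen = new HashSet<TKey>();
        foreach (T item in source)
        {
            TKey key = keyFunc(item);
            bool accepted = seen.Contains(key);
            // Admit a new key only while there is still room for one.
            if (!accepted && seen.Count < count)
            {
                seen.Add(key);
                accepted = true;
            }
            if (accepted)
                yield return item;
        }
    }
}
```

The trade-off is that this version cannot stop early: an item for an already accepted key may show up anywhere later in the sequence, so the whole source is enumerated. Extra memory stays proportional to count rather than to the size of the cache.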
Edit - I'd also like to point out that in SQL I would either use a ROW_NUMBER query or a recursive CTE, depending on the performance requirement - a distinct+join is not the most efficient method. If your cache is in sorted order (or if you can change it to be in sorted order) then the method above will be by far the cheapest in terms of both memory and execution time.