Timwi/Darin's suggestion will perform this in a single pass over the original collection, but it will create multiple buffers for the groupings. LINQ isn't really very good at doing this kind of counting, and a problem like this was my original motiviation for writing Push LINQ. You might like to read my blog post on it for more details about why LINQ isn't terribly efficient here.
Push LINQ and the rather more impressive implementation of the same idea - Reactive Extensions - can handle this more efficiently.
Of course, if you don't really care too much about the extra efficiency, go with the GroupBy
answer :)
EDIT: I hadn't noticed that your strings were ordered. That means you can be much more efficient, because you know that once you've seen string x and then string y, if x and y are different, you'll never see x again. There's nothing in LINQ to make this particularly easier, but you can do it yourself quite easily:
public static IDictionary<string, int> CountEntries(IEnumerable<string> strings)
{
var dictionary = new Dictionary<string, int>();
using (var iterator = strings.GetEnumerator())
{
if (!iterator.MoveNext())
{
// No entries
return dictionary;
}
string current = iterator.Current;
int currentCount = 1;
while (iterator.MoveNext())
{
string next = iterator.Current;
if (next == current)
{
currentCount++;
}
else
{
dictionary[current] = currentCount;
current = next;
currentCount = 1;
}
}
// Write out the trailing result
dictionary[current] = currentCount;
}
return dictionary;
}
This is O(n), with no dictionary lookups involved other than when writing the values. An alternative implementation would use foreach
and a current
value starting off at null... but that ends up being pretty icky in a couple of other ways. (I've tried it :) When I need special-case handling for the first value, I generally go with the above pattern.
Actually you could do this with LINQ using Aggregate
, but it would be pretty nasty.