I have a .csv file (words.csv) containing 5000 words separated by commas. Most of the words are repeated.

Can I use LINQ to do the following:

A. Group identical words together and show how many times each one occurs.

So if apple appears 5 times and banana 3 times, it should display as:

apple - 5
banana - 3 and so on

B. Create another text file with duplicates removed.

+1  A: 

LINQ has a Distinct operator that you could use (in C# it is an extension method rather than a query keyword).

http://www.shawson.co.uk/codeblog/linq-distinct/

Shiraz Bhaiji
+3  A: 

Sure, here's the LINQ syntax in C#:

from word in words
group word by word into occurrences
select new
{
    Word = occurrences.Key,
    Count = occurrences.Count()
}

Or in "pure" C# method calls:

words.GroupBy(w => w)
     .Select(o => new 
                  { 
                     Word = o.Key,
                     Count = o.Count()
                  });

And to create a distinct list of words you just use the Distinct operator:

words.Distinct();
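
Putting both parts together, here is a minimal end-to-end sketch. It assumes words.csv is plain comma-separated text with no quoted fields; the names WordReport, ParseWords, CountLines and Run, and the output file name distinct.csv, are made up for illustration and are not from the question.

```csharp
using System;
using System.IO;
using System.Linq;

public static class WordReport
{
    // Splits comma-separated text (possibly spanning several lines)
    // into trimmed, non-empty words. Assumes no quoted fields.
    public static string[] ParseWords(string csvText)
    {
        return csvText
            .Split(new[] { ',', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => w.Trim())
            .Where(w => w.Length > 0)
            .ToArray();
    }

    // Part A: one "word - count" line per distinct word, most frequent first.
    public static string[] CountLines(string[] words)
    {
        return words
            .GroupBy(w => w)
            .OrderByDescending(g => g.Count())
            .Select(g => g.Key + " - " + g.Count())
            .ToArray();
    }

    // Reads the input file, prints the counts, and writes the
    // de-duplicated words (Part B) to the output file.
    public static void Run(string inputPath, string outputPath)
    {
        string[] words = ParseWords(File.ReadAllText(inputPath));

        foreach (string line in CountLines(words))
        {
            Console.WriteLine(line);
        }

        File.WriteAllText(outputPath, string.Join(",", words.Distinct().ToArray()));
    }
}
```

Calling WordReport.Run("words.csv", "distinct.csv") would print the counts and write the de-duplicated file. Note that the comparison is case-sensitive by default; if "Apple" and "apple" should count as the same word, GroupBy and Distinct both accept a comparer such as StringComparer.OrdinalIgnoreCase.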
Drew Marsh
+1 Just what I was looking for...
jasonco