I have a List of type string in a .NET 3.5 project. The list has thousands of strings in it, but for the sake of brevity we're going to say that it just has 5 strings in it.

List<string> lstStr = new List<string>() {
            "Apple", "Banana", "Coconut", "Coconut", "Orange"};

Assume that the list is sorted (as you can tell above). What I need is a LINQ query that will remove all strings that are not duplicates. So the result would leave me with a list that only contains the two "Coconut" strings.

Is this possible to do with a LINQ query? If it is not then I'll have to resort to some complex for loops, which I can do, but I didn't want to unless I had to.

Thanks in advance for any help that I receive on this!

+4  A: 

var dupes = lstStr.Where(x => lstStr.Sum(y => y==x ? 1 : 0) > 1);

OR

var dupes = lstStr.Where((x,i) => (   (i > 0 && x==lstStr[i-1]) 
                                   || (i < lstStr.Count-1 && x==lstStr[i+1])));

Note that the first one enumerates the list for every element which takes O(n²) time (but doesn't assume a sorted list). The second one is O(n) (and assumes a sorted list).

Mark Cidade
A: 
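var temp = new List<string>();

// Visit each distinct value once so a duplicated value is not added repeatedly.
foreach (var item in lstStr.Distinct())
{
    var stuff = (from m in lstStr
                 where m == item
                 select m);
    if (stuff.Count() > 1)
    {
        // Concat returns an IEnumerable<string>, which cannot be assigned back to a List<string>.
        temp.AddRange(stuff);
    }
}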
Scott M.
+1  A: 

This should work, and is O(N) rather than the O(N^2) of the other answers. (Note that this relies on the list being sorted, so that really is a requirement.)

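// Extension methods must be static and declared inside a static class.
public static IEnumerable<T> OnlyDups<T>(this IEnumerable<T> coll) 
   where T: IComparable<T>
{
     using (IEnumerator<T> iter = coll.GetEnumerator())
     {
         if (!iter.MoveNext())
             yield break;

         T last = iter.Current;
         bool more = iter.MoveNext();
         while (more)
         {
             if (iter.Current.CompareTo(last) == 0)
             {
                  // Start of a run of equal items: emit the first one, then the rest.
                  yield return last;
                  do 
                  {
                       yield return iter.Current;
                       more = iter.MoveNext();
                  }
                  while (more && iter.Current.CompareTo(last) == 0);

                  // Do not read Current once the sequence is exhausted.
                  if (!more)
                      break;
             }
             last = iter.Current;
             more = iter.MoveNext();
         }
     }
}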

Use it like this:

IEnumerable<string> onlyDups = lstStr.OnlyDups();

or

List<string> onlyDups = lstStr.OnlyDups().ToList();
James Curran
This doesn't use linq?
McKay
@McKay: Yes, but the OP stated that it can be assumed that the list is sorted.
James Curran
@McKay (revised question): technically no, but it does maintain a LINQ-style interface, and can be used as part of a larger LINQ statement.
James Curran
+2  A: 

Here is code for finding the duplicate values in an array (shown with ints; the same query works on strings):

int[] listOfItems = new[] { 4, 2, 3, 1, 6, 4, 3 };
var duplicates = listOfItems
    .GroupBy(i => i)
    .Where(g => g.Count() > 1)
    .Select(g => g.Key);
foreach (var d in duplicates)
    Console.WriteLine(d);
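
The query above yields each duplicated value once. If every duplicated element should be kept, as the question asks (both "Coconut" strings), one possible variation is to flatten the groups with SelectMany instead of selecting the key. This is only a sketch: the class name DuplicateExtensions and the method name KeepDuplicates are made up for illustration, and it does not assume a sorted list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class; the names are illustrative and not from the thread.
public static class DuplicateExtensions
{
    // Returns every element whose value occurs more than once in the source.
    // GroupBy keeps groups in order of first appearance and preserves the
    // order of elements within each group.
    public static IEnumerable<T> KeepDuplicates<T>(this IEnumerable<T> source)
    {
        return source
            .GroupBy(x => x)             // bucket equal values together
            .Where(g => g.Count() > 1)   // keep buckets with two or more items
            .SelectMany(g => g);         // flatten the buckets back into one sequence
    }
}
```

Calling lstStr.KeepDuplicates().ToList() on the list from the question should leave only the two "Coconut" entries.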
Pranay Rana