+1  Q: 

Keeping Duplicates

I have an IEnumerable containing objects that have a groupnumber property. I want to get a list of all objects whose groupnumber appears more than once, e.g.

obj1: groupnumber=1 KEEP
obj2: groupnumber=2 DELETE
obj3: groupnumber=1 KEEP

I can use the following to get a list of all the duplicated groupnumbers

   var duplicates = from c in sorted 
                    group c by c.groupnumber into g 
                    where g.Count() > 1 
                    select new { groupnumber = g.Key, recs = g.Count() };

but I can't figure out how to get a list with all the single-instance objects removed.

Cheers

+1  A: 

Here's the simplest option (I think):

sorted.GroupBy( c => c.groupnumber )
      .Where( g => g.Count() > 1 )
      .SelectMany( g => g );
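To try this against the example in the question, here is a minimal self-contained sketch. The `Item` class, its `Name` property, and the `KeepDuplicates` helper are made up for illustration; only `groupnumber` and `sorted` come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type; only groupnumber is taken from the question.
class Item
{
    public string Name { get; set; }
    public int groupnumber { get; set; }
}

static class Program
{
    // Keeps only the items whose groupnumber occurs more than once.
    public static List<Item> KeepDuplicates(IEnumerable<Item> sorted)
    {
        return sorted.GroupBy(c => c.groupnumber)   // one group per groupnumber
                     .Where(g => g.Count() > 1)     // drop single-item groups
                     .SelectMany(g => g)            // flatten groups back into items
                     .ToList();
    }

    static void Main()
    {
        var sorted = new List<Item>
        {
            new Item { Name = "obj1", groupnumber = 1 },
            new Item { Name = "obj2", groupnumber = 2 },
            new Item { Name = "obj3", groupnumber = 1 },
        };

        // Prints obj1 and obj3; obj2 is dropped because its groupnumber is unique.
        foreach (var item in KeepDuplicates(sorted))
            Console.WriteLine(item.Name + ": groupnumber=" + item.groupnumber);
    }
}
```

Note that the result comes back grouped by groupnumber, so if the input is not already sorted the items may not be in their original order.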

Alternatively, try the following:

var duplicates = from c in sorted 
                 group c by c.groupnumber into g 
                 where g.Count() > 1
                 select g.Key;

// convert the duplicate keys to a lookup object for efficiency
var dupLookup = duplicates.ToLookup( k => k );

// keep only the items whose groupnumber is one of the duplicate group keys
var excludeNonDups = sorted.Where( c => dupLookup.Contains( c.groupnumber ) );
LBushkin
+1  A: 

Alright, I had to read your question a few times. My understanding is that you want to "select all the objs where there are more than one obj in the collection with the same groupnumber"... so filter out the ones with unique groupnumbers.

If that's the case, you're almost there! Use SelectMany to collapse the groups into a single collection.

var duplicates = (from c in sorted
    group c by c.groupnumber into g
    where g.Count() > 1
    select g).SelectMany(grp => grp);
Sapph
Thanks, I just found the following, which works and is close to yours: var dupe = from dr in sorted group dr by dr.groupnumber into grouped from dr in grouped.Skip(1) select dr;
gary proudfoot
I think that one only appears to work: `grouped.Skip(1)` drops the first item of every group, so the single-item groups disappear, but so does one object from each duplicate group. You should definitely keep the `where` clause you had previously. :)
Sapph
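For anyone comparing the two queries in these comments, here is a small sketch that runs both on the three objects from the question. The `SkipDemo` class, its method names, and the tuple shape are assumptions for illustration; the query bodies follow the ones posted above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SkipDemo
{
    // The query from the comment: Skip(1) runs once per group,
    // so it drops the first element of every group.
    public static List<string> SkipFirstPerGroup(
        IEnumerable<(string Name, int groupnumber)> sorted)
    {
        return (from dr in sorted
                group dr by dr.groupnumber into grouped
                from dr in grouped.Skip(1)
                select dr.Name).ToList();
    }

    // The where-based query: keeps every member of each group
    // that contains more than one item.
    public static List<string> KeepWholeDuplicateGroups(
        IEnumerable<(string Name, int groupnumber)> sorted)
    {
        return (from c in sorted
                group c by c.groupnumber into g
                where g.Count() > 1
                from item in g
                select item.Name).ToList();
    }

    static void Main()
    {
        var sorted = new List<(string Name, int groupnumber)>
        {
            ("obj1", 1), ("obj2", 2), ("obj3", 1),
        };

        Console.WriteLine(string.Join(", ", SkipFirstPerGroup(sorted)));        // obj3
        Console.WriteLine(string.Join(", ", KeepWholeDuplicateGroups(sorted))); // obj1, obj3
    }
}
```

The `Skip(1)` version is useful when the goal is to find the extra copies to delete; the `where` version matches the KEEP/DELETE example in the question.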
A: 

Add in a call to Distinct() if you only want one of each of the duplicates:

var duplicates = (from c in sorted 
                  group c by c.groupnumber into g 
                  where g.Count() > 1 
                  select new { groupnumber = g.Key, recs = g.Count() }).Distinct();
Frode N. Rosand