Let me start by saying I've read these questions: 1 & 2. I understand that I can write the code to find duplicates in my List, but my problem is that I want to update the original list, not just query and print the duplicates.

I know I can't update the collection the query returns, as it's not a view; it's an IEnumerable<T> of an anonymous type.

I want to be able to find duplicates in my list, and mark a property I've created called State which is used later in the application.

Has anyone run into this problem, and can you point me in the right direction?

P.S. The approach I'm using at the moment is a bubble-sort-style nested loop that goes through the list item by item and compares the key fields. Obviously this isn't the fastest method.

EDIT:

In order to consider an item in the list a "duplicate", there are three fields which must match. We'll call them Field1, Field2, and Field3.

I have an overridden Equals() method on the base class which compares these fields.

The only time I skip an object in my MarkDuplicates() method is if the object's state is UNKNOWN or ERROR; otherwise, I test it.

Let me know if you need more details.

Thanks again!

A: 

Your objects have some sort of state property. You're presumably finding duplicates based on another property or set of properties. Why not:

List<object> keys = new List<object>();

foreach (MyObject obj in myList)
{
    // A key that has been seen before means this item is a duplicate.
    if (keys.Contains(obj.keyProperty))
        obj.state = "something indicating a duplicate here";
    else
        keys.Add(obj.keyProperty);
}
Chris
This, except that if you have a lot of objects you should use a HashSet for "keys" instead of a List.
mquander
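For what it's worth, here is a minimal sketch of the HashSet variant suggested in the comment above, applied to the three-field key from the question. The Item class, the string types for the fields and State, and the "DUPLICATE" value are all assumptions for illustration; only Field1, Field2, Field3, State, MarkDuplicates and the UNKNOWN/ERROR rule come from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type: the member names come from the question,
// the string types are an assumption.
public class Item
{
    public string Field1 { get; set; }
    public string Field2 { get; set; }
    public string Field3 { get; set; }
    public string State { get; set; }
}

public static class DuplicateMarker
{
    public static void MarkDuplicates(List<Item> items)
    {
        // System.Tuple compares by value, so it can serve as a composite key.
        var seen = new HashSet<Tuple<string, string, string>>();

        foreach (var item in items)
        {
            // The question says UNKNOWN and ERROR items are skipped.
            if (item.State == "UNKNOWN" || item.State == "ERROR")
                continue;

            var key = Tuple.Create(item.Field1, item.Field2, item.Field3);

            // HashSet<T>.Add returns false when the key is already present.
            if (!seen.Add(key))
                item.State = "DUPLICATE"; // assumed marker value
        }
    }
}
```

Because the loop mutates the objects held by the original list, no query result has to be written back. Each lookup is a hash lookup rather than a scan of the list, so the whole pass is a single walk over the items.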
+3  A: 

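I think the easiest way is to start by writing an extension method which finds duplicates in a list of objects. Since your objects override .Equals(), they can be compared in most common collections. Note that HashSet<T> also relies on GetHashCode(), so it needs to be overridden consistently with Equals().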

public static IEnumerable<T> FindDuplicates<T>(this IEnumerable<T> enumerable) {
  var hashset = new HashSet<T>();
  foreach ( var cur in enumerable ) { 
    if ( !hashset.Add(cur) ) {
      yield return cur;
    }
  }
}

Now it should be pretty easy to update your collection for duplicates. For instance:

List<SomeType> list = GetTheList();
list
  .FindDuplicates()
  .ToList()
  .ForEach(x => x.State = "DUPLICATE");

If you already have a ForEach extension method defined in your code, you can avoid the .ToList() call.
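
To tie this back to the question's three-field rule, here is a rough sketch of how the pieces might fit together. The Record class, its string-typed members, and the "DUPLICATE" value are assumptions made up for illustration; the question only names Field1, Field2, Field3, State, MarkDuplicates() and the UNKNOWN/ERROR exclusion.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: equality is based on the three fields only.
public class Record
{
    public string Field1 { get; set; }
    public string Field2 { get; set; }
    public string Field3 { get; set; }
    public string State { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as Record;
        return other != null
            && Field1 == other.Field1
            && Field2 == other.Field2
            && Field3 == other.Field3;
    }

    // Must agree with Equals, otherwise HashSet<T> will not see duplicates.
    public override int GetHashCode()
    {
        return Tuple.Create(Field1, Field2, Field3).GetHashCode();
    }
}

public static class DuplicateExtensions
{
    // Same idea as the extension method above: yield every item
    // that the set has already seen.
    public static IEnumerable<T> FindDuplicates<T>(this IEnumerable<T> source)
    {
        var seen = new HashSet<T>();
        foreach (var cur in source)
        {
            if (!seen.Add(cur))
                yield return cur;
        }
    }

    public static void MarkDuplicates(this List<Record> list)
    {
        // Skip UNKNOWN and ERROR items, as described in the question.
        var duplicates = list
            .Where(r => r.State != "UNKNOWN" && r.State != "ERROR")
            .FindDuplicates()
            .ToList(); // materialize before mutating

        foreach (var r in duplicates)
            r.State = "DUPLICATE"; // assumed marker value
    }
}
```

With this in place, calling list.MarkDuplicates() leaves the first occurrence of each key untouched and marks the later ones. State is deliberately left out of Equals() and GetHashCode(), so changing it never alters an object's hash.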

JaredPar
@JaredPar: Thanks very much for your help.
Chris
A: 
IEnumerable<T> oldList;
IEnumerable<T> list;

foreach (var n in oldList.Intersect(list))
   n.State = "Duplicate";

Edit: I misread the question. This code is for two lists. My bad.

Chad Grant