ansaurus

Question

Find object data duplicates in List of objects

Answer 1

A:

If you implement IComparable<Person> like so:

public int CompareTo(Person person)
{
    return this.SSN.CompareTo(person.SSN);
}

then a comparison like the following will work (note that == would still compare references, so call CompareTo instead):

for (int i = 0; i < people.Count; i++)
{
    // Start at i + 1 so each pair is checked once and an item is never compared with itself.
    for (int j = i + 1; j < people.Count; j++)
    {
        if (people[i].CompareTo(people[j]) == 0)
        {
            // duplicate
        }
    }
}

SnOrfus 2009-03-06 17:11:43

Answer 2

+6 A:

This gets you the duplicated SSNs:

var duplicatedSSN =
    from p in persons
    group p by p.SSN into g
    where g.Count() > 1
    select g.Key;

The list of entries with a duplicated SSN would then be:

var duplicated = persons.FindAll(p => duplicatedSSN.Contains(p.SSN));

And then just iterate over the duplicates and remove them:

duplicated.ForEach(dup => persons.Remove(dup));

gcores 2009-03-06 17:13:49

Your solution was close. The line `duplicated = persons.FindAll(duplicatedSSN.Contains(p => p.SSN);` did not work. See my answer to see what I corrected to get to the answer.

Chris Conway 2009-03-06 18:38:50

Answer 3

A:

Traverse the list and keep a table (a Dictionary here) of SSN/count pairs. Then enumerate the table and remove the items whose SSN has a count greater than 1.

Dictionary<string, int> ssnTable = new Dictionary<string, int>();

foreach (Person person in persons)
{
    // TryGetValue leaves count at 0 when the SSN has not been seen yet,
    // which avoids using an exception for the "key not found" case.
    int count;
    ssnTable.TryGetValue(person.SSN, out count);
    ssnTable[person.SSN] = count + 1;
}

// traverse ssnTable here and remove items where value of entry (item count) > 1

mjmarsh 2009-03-06 17:14:56
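
A minimal sketch of the removal pass that the answer above leaves as a comment. It is not part of the original answer: the Person class (with a string SSN and a Name added purely for illustration), the SsnCounting class, and the RemoveDuplicates method are all hypothetical names. As in the other answers in this thread, every entry whose SSN occurs more than once is removed, including the first occurrence.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the Person type discussed in the thread.
public class Person
{
    public string SSN { get; set; }
    public string Name { get; set; }
}

public static class SsnCounting
{
    // Removes every person whose SSN occurs more than once
    // and returns the removed entries.
    public static List<Person> RemoveDuplicates(List<Person> persons)
    {
        // First pass: count the occurrences of each SSN.
        var ssnTable = new Dictionary<string, int>();
        foreach (Person person in persons)
        {
            int count;
            ssnTable.TryGetValue(person.SSN, out count); // count stays 0 for a new SSN
            ssnTable[person.SSN] = count + 1;
        }

        // Second pass: collect the entries whose SSN count is above 1, then drop them.
        List<Person> duplicates = persons.FindAll(p => ssnTable[p.SSN] > 1);
        persons.RemoveAll(p => ssnTable[p.SSN] > 1);
        return duplicates;
    }
}
```

The two passes keep the work linear in the size of the list, whereas calling persons.Remove once per duplicate rescans the list each time.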

Answer 4

A:

List<Person> actualPersons = persons.Distinct().ToList();
List<Person> duplicatePersons = persons.Except(actualPersons).ToList();

Graeme Bradbury 2009-03-06 17:41:07

This did not work since Distinct looks at all of the data. I just want to compare SSN and look for dupes on that one field.

Chris Conway 2009-03-06 18:31:53
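
One way to make the Distinct approach compare on SSN alone is to pass it an IEqualityComparer<Person>. The sketch below is an assumption-laden illustration, not something from the thread: SsnComparer, SsnDistinct, FindRepeats, and the Person class with its Name property are hypothetical. It also swaps Except for a reference check, because Except works on set semantics and would not return the repeated entries. It relies on Distinct yielding the first occurrence of each SSN, which is how LINQ to Objects behaves in practice.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Person type discussed in the thread.
public class Person
{
    public string SSN { get; set; }
    public string Name { get; set; }
}

// Hypothetical comparer: two Person objects are equal when their SSNs match.
public sealed class SsnComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SSN == y.SSN;
    }

    public int GetHashCode(Person p)
    {
        return p == null || p.SSN == null ? 0 : p.SSN.GetHashCode();
    }
}

public static class SsnDistinct
{
    // Returns the entries that Distinct dropped, i.e. the repeat
    // occurrences of an SSN that already appeared earlier in the list.
    public static List<Person> FindRepeats(List<Person> persons)
    {
        List<Person> actualPersons = persons.Distinct(new SsnComparer()).ToList();

        // Person does not override Equals, so this set compares by reference.
        var kept = new HashSet<Person>(actualPersons);
        return persons.Where(p => !kept.Contains(p)).ToList();
    }
}
```

Unlike the grouping answers, this keeps one entry per SSN and reports only the later repeats.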

Answer 5

A:

Thanks to gcores for getting me started down a correct path. Here's what I ended up doing:

var duplicatedSSN =
    from p in persons
    group p by p.SSN into g
    where g.Count() > 1
    select g.Key;

var duplicates = new List<Person>();

foreach (var dupeSSN in duplicatedSSN)
{
    foreach (var person in persons.FindAll(p => p.SSN == dupeSSN))
        duplicates.Add(person);
}

duplicates.ForEach(dup => persons.Remove(dup));

Chris Conway 2009-03-06 18:35:56

Sorry, the line was wrong. It should have said `duplicated = persons.FindAll(p => duplicatedSSN.Contains(p.SSN));`. I've edited the answer.

gcores 2009-03-06 19:57:32
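
For comparison, the group-then-collect steps used in this thread can probably be folded into a single query with method syntax. This is only a sketch under assumptions: the Person class (with an illustrative Name property), SsnGrouping, and FindDuplicates are hypothetical names that do not appear in the thread.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Person type discussed in the thread.
public class Person
{
    public string SSN { get; set; }
    public string Name { get; set; }
}

public static class SsnGrouping
{
    // Returns every person whose SSN is shared with at least one other person.
    public static List<Person> FindDuplicates(List<Person> persons)
    {
        return persons
            .GroupBy(p => p.SSN)        // one group per SSN
            .Where(g => g.Count() > 1)  // keep only SSNs that occur more than once
            .SelectMany(g => g)         // flatten the groups back into Person objects
            .ToList();
    }
}
```

Because ToList materializes the result first, it is safe to follow this with FindDuplicates(persons).ForEach(dup => persons.Remove(dup)); without modifying the list during enumeration.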
