I'm trying to merge 2 lists using "Union" so I get rid of duplicates. Following is the sample code:

public class SomeDetail
{
    public string SomeValue1 { get; set; }
    public string SomeValue2  { get; set; }
    public string SomeDate { get; set; }
}

public class SomeDetailComparer : IEqualityComparer<SomeDetail>
{
    bool IEqualityComparer<SomeDetail>.Equals(SomeDetail x, SomeDetail y)
    {
        // Check whether the compared objects reference the same data.        
        if (Object.ReferenceEquals(x, y))
            return true;
        // Check whether any of the compared objects is null.        
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;
        return x.SomeValue1 == y.SomeValue1 && x.SomeValue2 == y.SomeValue2;
    }
    int IEqualityComparer<SomeDetail>.GetHashCode(SomeDetail obj)
    {
        return obj.SomeValue1.GetHashCode();
    }
}

List<SomeDetail> tempList1 = new List<SomeDetail>();
List<SomeDetail> tempList2 = new List<SomeDetail>();

List<SomeDetail> detailList = tempList1.Union(tempList2, new SomeDetailComparer()).ToList();

Now the question is: can I use Union and still get the record with the latest date (using the SomeDate property)? The record itself can be in either tempList1 or tempList2.

Thanks in advance

+1  A: 

You'd have to be able to tell Union how to pick which one of the duplicates to use. I don't know of a way to do that other than writing your own Union.
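A minimal sketch of what such a hand-written union could look like. The names `LatestMerge`, `UnionKeepLatest` and `ParseDate` are hypothetical, and the sketch assumes that SomeDate holds a string that `DateTime.Parse` accepts and that value tuples (C# 7 or later) are available:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class SomeDetail
{
    public string SomeValue1 { get; set; }
    public string SomeValue2 { get; set; }
    public string SomeDate { get; set; }
}

public static class LatestMerge
{
    // Hypothetical helper. Assumption: SomeDate is a string DateTime.Parse can read.
    private static DateTime ParseDate(SomeDetail d) =>
        DateTime.Parse(d.SomeDate, CultureInfo.InvariantCulture);

    // Hypothetical hand-written "union": one record per (SomeValue1, SomeValue2) key,
    // keeping whichever record carries the latest SomeDate.
    public static List<SomeDetail> UnionKeepLatest(
        IEnumerable<SomeDetail> first, IEnumerable<SomeDetail> second)
    {
        var latest = new Dictionary<(string, string), SomeDetail>();
        foreach (var item in first.Concat(second))
        {
            var key = (item.SomeValue1, item.SomeValue2);
            // Store the item if the key is new or if it is newer than the stored one.
            if (!latest.TryGetValue(key, out var existing)
                || ParseDate(item) > ParseDate(existing))
            {
                latest[key] = item;
            }
        }
        return latest.Values.ToList();
    }
}
```

It would be called as `LatestMerge.UnionKeepLatest(tempList1, tempList2)`. On equal dates the record seen first is kept, which is an arbitrary choice in this sketch.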

John Weldon
+4  A: 

The operation that is really suited to this purpose is a full outer join. The Enumerable class has an implementation of inner join, which you can use to find the duplicates and select whichever you prefer.

var duplicates = Enumerable.Join(tempList1, tempList2, keySelector, keySelector,
    // SomeDate is a string, so convert it before comparing.
    (item1, item2) => (Convert.ToDateTime(item1.SomeDate) > Convert.ToDateTime(item2.SomeDate)) ? item1 : item2)
    .ToList();

keySelector is simply a function (could be a lambda expression) that extracts a key from an object of type SomeDetail. Now, to implement the full outer join, try something like this:

// The anonymous type gives value-based equality on both key properties.
Func<SomeDetail, object> keyComparer = item => new { Value1 = item.SomeValue1,
    Value2 = item.SomeValue2 };
var detailList = Enumerable.Union(tempList1.Except(tempList2, equalityComparer),
    tempList2.Except(tempList1, equalityComparer)).Union(
    Enumerable.Join(tempList1, tempList2, keyComparer, keyComparer,
    // SomeDate is a string, so convert it before comparing.
    (item1, item2) => (Convert.ToDateTime(item1.SomeDate) > Convert.ToDateTime(item2.SomeDate)) ? item1 : item2))
    .ToList();

equalityComparer should be an object that implements IEqualityComparer<SomeDetail> and effectively uses the keyComparer function for testing equality.
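As a rough illustration of such a comparer, here is a sketch of a generic, key-based implementation. The class name `KeyEqualityComparer` is hypothetical and not part of the framework; it simply delegates equality and hashing to the extracted key:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: treats two items as equal when their extracted keys are equal.
public sealed class KeyEqualityComparer<T, TKey> : IEqualityComparer<T>
{
    private readonly Func<T, TKey> _keySelector;
    private readonly IEqualityComparer<TKey> _keyComparer = EqualityComparer<TKey>.Default;

    public KeyEqualityComparer(Func<T, TKey> keySelector)
    {
        _keySelector = keySelector ?? throw new ArgumentNullException(nameof(keySelector));
    }

    public bool Equals(T x, T y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return _keyComparer.Equals(_keySelector(x), _keySelector(y));
    }

    // The hash comes from the same key as Equals, so equal items hash equally.
    public int GetHashCode(T obj)
    {
        if (obj == null) return 0;
        var key = _keySelector(obj);
        return key == null ? 0 : _keyComparer.GetHashCode(key);
    }
}
```

Assuming value tuples are available, it could be created as `new KeyEqualityComparer<SomeDetail, (string, string)>(d => (d.SomeValue1, d.SomeValue2))` and passed to Except as equalityComparer. Deriving both Equals and GetHashCode from the same key keeps the two consistent, which the set-based LINQ operators rely on.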

Let me know if that does the job for you.

Noldorin
I used a unique value in the SomeDetail class for the selector, but it isn't returning any records. Any help, please?
var detailList = Enumerable.Join(tempList1, tempList2, item1 => item1.UniqueKey, item2 => item2.UniqueKey, (item1, item2) => (item1.SomeDate > item2.SomeDate) ? item1 : item2).ToList();
Ganesha
@Ganesha: Have you verified that there *are* at least some items with identical UniqueKey values?
Noldorin
Oh! The lists could have totally different values. If there is a matching value, then the date has to be taken into account to decide which one is selected. If there is no matching value, the record still has to be copied over (just like in the case of Union).
Ganesha
The new code with the key comparer does not return any records? Any help please?
Ganesha
What are your SomeValue1 and SomeValue2 properties? We're going to need a few more specifics if we're to help.
Noldorin
Func<UsageDetail, string> keyComparer = (UsageDetail item) => item.SiteUrl;
siteUsageDetails = Enumerable.Join(siteUsageDetails, tempUsageDetails, keyComparer, keyComparer, (item1, item2) => (Convert.ToDateTime(item1.LastActivityDate) > Convert.ToDateTime(item2.LastActivityDate)) ? item1 : item2).ToList();
This is exactly what I am trying to accomplish.
Ganesha
@Ganesha: Sorry, it seems I was giving you code for the wrong type of join! You actually want a *full outer join*. See my updated answer and check if that does the job.
Noldorin
I like this idea, actually, but this one does not return; it probably goes into an infinite loop.
Ganesha
Sorry, my problem. The problem was with the equality comparer. Now it seems to be working. I will have to take a closer look. Thanks
Ganesha
Ok, good to know you're having some success now...
Noldorin
Kind of getting close. When I merge those lists, the behavior is kind of random, i.e. I see different dates each time I run.
Ganesha
@Ganesha: Are you sure you're not just seeing the same set, but reordered each time you run the code?
Noldorin
I'm actually seeing different output every time; it is not consistent.
Ganesha
@Ganesha: Not sure what to suggest now, really. Have you verified that your data sources (tempList1 and tempList2) are identical on each run?
Noldorin
Yes, the lists have the same data. The merged results are perfect too, i.e. they are unique. It's just that the dates vary every time I run. I need the record with the latest date to show up in the final merged list.
Ganesha
Works now!!!! Thanks a lot!!!!
Ganesha
@Ganesha: Heh, that was a bit of a battle! Glad it's working now though. :)
Noldorin
+1  A: 

You cannot do this with the standard Union method, but you can create a Union extension method for List<SomeDetail> with this special handling. That method will be used instead of the standard one because its signature is a better match.
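A minimal sketch of such an extension method, under a few assumptions: the class name `SomeDetailListExtensions` is hypothetical, SomeDate is taken to be a string that `DateTime.Parse` accepts, and duplicates are identified by SomeValue1 and SomeValue2 as in the question's comparer. Because the first parameter is exactly List<SomeDetail>, overload resolution should prefer it over Enumerable.Union when it is called on a List<SomeDetail> variable:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class SomeDetail
{
    public string SomeValue1 { get; set; }
    public string SomeValue2 { get; set; }
    public string SomeDate { get; set; }
}

public static class SomeDetailListExtensions
{
    // Hypothetical overload: chosen over Enumerable.Union for a List<SomeDetail>
    // receiver because no conversion of the first argument is needed.
    public static IEnumerable<SomeDetail> Union(
        this List<SomeDetail> first, IEnumerable<SomeDetail> second)
    {
        return first.Concat(second)
            // Group records that share both key values.
            .GroupBy(d => new { d.SomeValue1, d.SomeValue2 })
            // From each group keep the record with the latest date.
            .Select(g => g.OrderByDescending(
                d => DateTime.Parse(d.SomeDate, CultureInfo.InvariantCulture)).First());
    }
}
```

With this in scope, `tempList1.Union(tempList2).ToList()` would return one record per key. Note that the overload only applies when the receiver is statically typed as List<SomeDetail>; a variable typed as IEnumerable<SomeDetail> still binds to the standard method.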

Daniel Brückner
+1  A: 

Why not just use HashSet<T>?

List<SomeDetail> tempList1 = new List<SomeDetail>();
List<SomeDetail> tempList2 = new List<SomeDetail>();

HashSet<SomeDetail> hs = new HashSet<SomeDetail>(new SomeDetailComparer());

hs.UnionWith(tempList1);
hs.UnionWith(tempList2);

List<SomeDetail> detailList = hs.ToList();
Chris Doggett
This would work if I only needed unique records. I have an additional requirement: on merge, I need the record that has the latest date between the 2 lists.
Ganesha
A: 

Merge generic lists

    public static List<T> MergeListCollections<T>(List<T> firstList, List<T> secondList)
    {
        List<T> merged = new List<T>(firstList);
        merged.AddRange(secondList);
        return merged;
    }
JeremySpouken