views:

1757

answers:

5

Imagine the following type:

public struct Account
{
    public int Id;
    public double Amount;
}

What is the best algorithm to synchronize two IList<Account> in C# 2.0 ? (No linq) ?

The first list (L1) is the reference list, the second (L2) is the one to synchronize according to the first:

  • All accounts in L2 that are no longer present in L1 must be deleted from L2
  • All accounts in L2 that still exist in L1 must be updated (amount attribute)
  • All accounts that are in L1 but not yet in L2 must be added to L2

The Id identifies accounts. It's no too hard to find a naive and working algorithm, but I would like to know if there is a smart solution to handle this scenario without ruining readability and perfs.

EDIT :

  • Account type doesn't matter, is could be a class, has properties, equality members, etc.
  • L1 and L2 are not sorted
  • L2 items could not be replaced by L1 items, they must be updated (field by field, property by property)
+3  A: 

For a start I'd get rid of the mutable struct. Mutable value types are a fundamentally bad thing. (As are public fields, IMO.)

It's probably worth building a Dictionary so you can easily compare the contents of the two lists. Once you've got that easy way of checking for presence/absence, the rest should be straightforward.

To be honest though, it sounds like you basically want L2 to be a complete copy of L1... clear L2 and just call AddRange? Or is the point that you also want to take other actions while you're changing L2?

Jon Skeet
I agree about mutable structs, and public fields. But Account type is not really important here - it could be a class, has encapsulation and equality members, etc. I'm focused on synchronizing two lists, by taking care of not replacing instances of L2 items by L1 items. L2 items have to be updated.
Romain Verdier
A: 

In addition to Jon Skeet's comment make your Account struct a class and override the Equals() and GetHashCode() method to get nice equality checking.

VVS
A: 

L2 = L1.clone()?

... but I would guess you forgot to mention something.

+2  A: 

If your two lists are sorted, then you can simply walk through them in tandem. This is an O(m+n) operation. The following code could help:

class Program
{
    static void Main()
    {
        List<string> left = new List<string> { "Alice", "Charles", "Derek" };
        List<string> right = new List<string> { "Bob", "Charles", "Ernie" };

        EnumerableExtensions.CompareSortedCollections(left, right, StringComparer.CurrentCultureIgnoreCase,
            s => Console.WriteLine("Left: " + s), s => Console.WriteLine("Right: " + s), (x,y) => Console.WriteLine("Both: " + x + y));
    }
}

static class EnumerableExtensions
{
    public static void CompareSortedCollections<T>(IEnumerable<T> source, IEnumerable<T> destination, IComparer<T> comparer, Action<T> onLeftOnly, Action<T> onRightOnly, Action<T, T> onBoth)
    {
        EnumerableIterator<T> sourceIterator = new EnumerableIterator<T>(source);
        EnumerableIterator<T> destinationIterator = new EnumerableIterator<T>(destination);

        while (sourceIterator.HasCurrent && destinationIterator.HasCurrent)
        {
            // While LHS < RHS, the items in LHS aren't in RHS
            while (sourceIterator.HasCurrent && (comparer.Compare(sourceIterator.Current, destinationIterator.Current) < 0))
            {
                onLeftOnly(sourceIterator.Current);
                sourceIterator.MoveNext();
            }

            // While RHS < LHS, the items in RHS aren't in LHS
            while (sourceIterator.HasCurrent && destinationIterator.HasCurrent && (comparer.Compare(sourceIterator.Current, destinationIterator.Current) > 0))
            {
                onRightOnly(destinationIterator.Current);
                destinationIterator.MoveNext();
            }

            // While LHS==RHS, the items are in both
            while (sourceIterator.HasCurrent && destinationIterator.HasCurrent && (comparer.Compare(sourceIterator.Current, destinationIterator.Current) == 0))
            {
                onBoth(sourceIterator.Current, destinationIterator.Current);
                sourceIterator.MoveNext();
                destinationIterator.MoveNext();
            }
        }

        // Mop up.
        while (sourceIterator.HasCurrent)
        {
            onLeftOnly(sourceIterator.Current);
            sourceIterator.MoveNext();
        }

        while (destinationIterator.HasCurrent)
        {
            onRightOnly(destinationIterator.Current);
            destinationIterator.MoveNext();
        }
    }
}

internal class EnumerableIterator<T>
{
    private readonly IEnumerator<T> _enumerator;

    public EnumerableIterator(IEnumerable<T> enumerable)
    {
        _enumerator = enumerable.GetEnumerator();
        MoveNext();
    }

    public bool HasCurrent { get; private set; }

    public T Current
    {
        get { return _enumerator.Current; }
    }

    public void MoveNext()
    {
        HasCurrent = _enumerator.MoveNext();
    }
}

You'll have to be careful about modifying the collections while iterating over them, though.

If they're not sorted, then comparing every element in one with every element in the other is O(mn), which gets painful really quickly.

If you can bear to copy the key values from each collection into a Dictionary or similar (i.e. a collection with acceptable performance when asked "is X present?"), then you could come up with something reasonable.

Roger Lipscombe
Well, you need to come up with something better than O(n*log(n) + m*log(m)), otherwise I'll just sort the lists and use your first approach :)
Tetha
A: 

I know this is an old post but you should check out AutoMapper. It will do exactly what you want in a very flexible and configurable way.

AutoMapper

rboarman