views:

129

answers:

3

We have two lists, let's say students and their scores. I want to compare these two lists and find the delta between the new list and the old list, then find the least intrusive way to Insert or Update into the new list any changes. What is the best algorithm to approach this? Want to focus on minimal amount of changes to new list and performance.

Example code:

List<ListItem> existingList = new List<ListItem>();
List<ListItem> newList = new List<ListItem>();

public TopLists()
{
  InitTwoLists();
}

private void InitTwoLists()
{
  existingList.Add(new ListItem { Name = "Shane", Score = 100 });
  existingList.Add(new ListItem { Name = "Mark", Score = 95 });
  existingList.Add(new ListItem { Name = "Shane", Score = 94 });
  existingList.Add(new ListItem { Name = "Steve", Score = 90 });
  existingList.Add(new ListItem { Name = "Brian", Score = 85 });
  existingList.Add(new ListItem { Name = "Craig", Score = 85 });
  existingList.Add(new ListItem { Name = "John", Score = 82 });
  existingList.Add(new ListItem { Name = "Steve", Score = 81 });
  existingList.Add(new ListItem { Name = "Philip", Score = 79 });
  existingList.Add(new ListItem { Name = "Peter", Score = 70 });

  newList.Add(new ListItem { Name = "Shane", Score = 100 });
  newList.Add(new ListItem { Name = "Steve", Score = 96 });  // This is change
  newList.Add(new ListItem { Name = "Mark", Score = 95 });
  newList.Add(new ListItem { Name = "Shane", Score = 94 });
  newList.Add(new ListItem { Name = "Brian", Score = 85 });
  newList.Add(new ListItem { Name = "Craig", Score = 85 });
  newList.Add(new ListItem { Name = "John", Score = 82 });
  newList.Add(new ListItem { Name = "Steve", Score = 81 });
  newList.Add(new ListItem { Name = "Philip", Score = 79 });
  newList.Add(new ListItem { Name = "Peter", Score = 70 });
}
}

public void CompareLists()
{
  // How would I find the deltas and update the new list with any changes from old?
}
}

public class ListItem
{
  public string Name { get; set; }
  public int Score { get; set; }
}

** EDIT: Desired Output *

The desired output is to actually change the newList with the deltas. For example in this scenario:

newList.Add(new ListItem { Name = "Shane", Score = 100 });
  newList.Add(new ListItem { Name = "Steve", Score = 96 });  // This is change
  newList.Add(new ListItem { Name = "Mark", Score = 95 });
  newList.Add(new ListItem { Name = "Shane", Score = 94 });
  newList.Add(new ListItem { Name = "Brian", Score = 85 });
  newList.Add(new ListItem { Name = "Craig", Score = 85 });
  newList.Add(new ListItem { Name = "John", Score = 82 });
  newList.Add(new ListItem { Name = "Steve", Score = 81 });
  newList.Add(new ListItem { Name = "Roger", Score = 80 });  // Roger is a new entry
  newList.Add(new ListItem { Name = "Phillip", Score = 79 });  // Philip moved down one

// Peter drops off this list with his score of 70, since I only want the top 10.

So the changes would be:

Update record 2 for "Steve", the score has changed Insert new record "Roger" at position 9 Drop record for "Peter" off of the top 10.

+2  A: 

Can you use Linq:

List<ListItem> list1 = GetYourList1();
List<ListItem> list2 = GetYourList2();
var diff = list1.Except(list2);

Your example specific:

var diff = newList.Except(existingList);

Not sure if it is the most efficient but it's concise :)

Kelsey
However, the OP is asking for an **algorithm**, not an existing solution.
Oded
You may have to try this out, it should work in theory but in practice it will depend on how ListItem implements the equality comparer. Might be better to create a Student class and override Equals.
Dave Swersky
How about items that are in list2 but not list1? I think this way only detects newly added elements. In addition, you also need to implement an appropriate implementation for the `IEquatable<T>` interface.
0xA3
@Oded: The title says "pattern/algorithm." I think LINQ qualifies as a pattern.
Dave Swersky
@Dave Swersky - LINQ is a set of query language extensions, not a pattern.
Oded
I tried this statement. What should the value of diff be? I tried it and diff contained all 10 items, when really I expected it to only return the one changed value. I was expected to only see one record returned: Name = "Steve", Score = 96 since his score changed from 81 in the existing list to 96.
Shane
@Shane: Did you implement `IEquatable<T> on your `ListItem` class? This is required by `Except` in order to compare the actual properties of a `ListItem`. The implementation would simply be: `return object.Equals(this.Name, other.Name) `
0xA3
No, I didn't add that. Let me try to add IEquatable<T>. Thanks.
Shane
@Shane sorry, answered and didn't monitor the question. Your objects need to implement `IEquatable` for the items being compared. Also based on the other comments I didn't realize you wanted to reinvent the wheel with your own algorithm. Is this the case or did you just want a way of getting the task done?
Kelsey
@Kelsey Ultimately, I want to get the task done. I just figured there was a nice existing algo out there to 1) find the differences (if any) and 2) Update the existing list in the least intrusive way. The most crude approach is just to delete all records from existing list and replace with new list. I just want to be more elegant and only do changes where needed. The final implementation for this will be me comparing this new list to an existing SharePoint custom list and I want to make the least amount of changes possible. i.e. Insert a new record at position 6, update the score for item
Shane
I ended up using this LINQ query to find all the deltas, then took action based on that list. First updated any existing records, then inserted any new records not in existing list, Then trimmed everything that was more than 10 in the list. Thanks everyone for such great responses!
Shane
+1  A: 

If you're looking for a general, language-agnostic solution, then you're looking for some kind of data synchronization of ordered lists. The basic algorithm would be:

i1 = 0
i2 = 0
while (i1 < list1.size && i2 < list2.size)
{
  if (list1[i1] < list2[i2])
  {
    //list1[i1] is not in list2
    i1++
  }
  else if (list1[i1] > list2[i2])
  {
    //list2[i2] is not in list1
    i2++
  }
  else
  {
    //item is in both lists
    i1++
    i2++
  }
}
if (i1 < list1.size)
   //all remaining list1 items are not in list2
if (i2 < list2.size)
   //all remaining list2 items are not in list1
Bob
+1  A: 

This should be able to do the trick if you don't have the same name twice in your list. In your example, you have 2x Steve, but you need a way to distinct between them.

public static List<ListItem> CompareLists(List<ListItem> existingList, List<ListItem> newList)
{
    List<ListItem> mergedList = new List<ListItem>();
    mergedList.AddRange(newList);
    mergedList.AddRange(existingList.Except(newList, new ListItemComparer()));
    return mergedList.OrderByDescending(x => x.Score).Take(10).ToList();
}

public class ListItemComparer : IEqualityComparer<ListItem>
{
    public bool Equals(ListItem x, ListItem y)
    {
        return x.Name == y.Name;
    }

    public int GetHashCode(ListItem obj)
    {
        return obj.Name.GetHashCode();
    }
}

You can call it like this:

newList = CompareLists(existingList, newList);
Tom Vervoort