tags:

views:

167

answers:

4

Hello all.

I'm wondering if my method below of comparing an array of strings (or any simple type really) has any performance repercussions.

bool AreValuesEqual(List<string> oldFieldValue, List<string> newFieldValue)
    {
        if (oldFieldValue.Count != newFieldValue.Count)
            return false;

        var list1 = oldFieldValue;
        list1.AddRange(newFieldValue);
        var list2 = list1.Distinct();
        return list2.Count() == newFieldIds.Count;
    }

I don't know how intensive Distinct() will be for this, but I think it shouldn't be too much compared to another loop.

Edit - Sorry, should have provided more background info. Couple things:

-There will be no duplicates in the parameter arrays.

-I'm not really concerned about order, I just want to know if the values in one array is the same as the other array. If the other array has a different value, return false.

+5  A: 

I don't think there's a significant performance issue with your code. However, what bothers me a little is that you're modifying list1 as a side-effect of doing the comparison.

Will the following work better?

if(list1.Count == list2.Count)
{
    var list3 = list1.Intersect(list2);
    return list3.Count == list1.Count();
}

A similar question was posted at http://stackoverflow.com/questions/675699/c-compare-two-lists-for-differences.

David Andres
Nice. I'll give this a try.
johnofcross
A: 

I'm not sure if this is any faster but you might give the code here a try: http://blog.slaven.net.au/archives/2008/03/16/comparing-two-arrays-or-ienumerables-in-c/

It looks like it's gone through several revisions. And several people have collaborated on it which is usually a good sign.

Steve Wortham
I did see this before I posted. I considered using it, but I just thought maybe above is a quick and dirty version. :)
johnofcross
+1  A: 

Your function mutates the list that is passed to it as oldFieldValue (by calling AddRange on it - note that list1 is another reference to the same list!). You need to make a real copy there, e.g. via .ToList().

On the whole, from your method, it seems that you're trying to define "set equality" - i.e. treat lists as equal if they contain the same elements, disregarding duplicates and order. If so, a far simpler way to do this would be to use Enumerable.Except():

if (!oldFieldValue.Except(newFieldValue).Any())
{
     // no difference
}

On the other hand, if you're just using Distinct() there for the sake of it, and you do want to take ordering and duplicates into account (or you can always guarantee that input sequences are ordered and have no dupes), then Enumerable.SequenceEqual() is the best choice here.

Pavel Minaev
+1 on the mutation call out. Thanks!
johnofcross
A: 

I think the distinct count is not doing a meaningful comparision. Imagine the following input:

oldFieldValue = {"A","HAPPY","HAPPY","WORLD"};
newFieldValue = {"A","HAPPY","HAPPY","WORLD"};

Are these equal?

If order is important, then a simple for loop through both lists will determine equality.

If order is unimportant, sort the new array (the old array will naturally always be sorted), and then do a for loop through the list.

Will
Those are not equal because the first check I do in the method is to see if the array are the same size. Above would return false before the I begin to compare the array.
johnofcross
Fixed. Are they still not equal in your code?
Will