Let's say we have a generic list of Class1, typically holding ~100 objects for a given session, and I would like to check whether the list contains a particular object. .NET 2.0 allows me to do this:

Dim objResult As Class1 = objList.Find(AddressOf MatchesSearch) ' MatchesSearch is a Predicate(Of Class1)

How does this approach compare to a traditional For loop from a performance perspective?

How would this vary as the length of the list grows or shrinks?

+6  A: 

It's exactly the same as looping - that's what it does internally.

If your list is sorted and you're looking for a particular value, you could potentially use BinarySearch instead - but if you think about it, if you don't know anything about the predicate or the order of the data, the search has to look at each item until it finds a match. As it happens, Find is specified to return the first matching item in the list... so it just scans through in the obvious order.
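As a small illustration of the BinarySearch alternative mentioned above (a sketch with made-up values; BinarySearch requires the list to be sorted already, and returns either the index of the match or the bitwise complement of the insertion point):

```csharp
using System;
using System.Collections.Generic;

class BinarySearchDemo
{
    static void Main()
    {
        // Must already be sorted for BinarySearch to work: O(log n) vs Find's O(n).
        var values = new List<int> { 1, 3, 5, 7, 9 };

        int found = values.BinarySearch(5);
        Console.WriteLine(found);       // 2 (index of the match)

        int missing = values.BinarySearch(4);
        Console.WriteLine(~missing);    // 2 (complement of a negative result gives the insertion point)
    }
}
```

This only helps when you can express the lookup as an ordering comparison; an arbitrary predicate still forces a linear scan.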

This will be linear with the size of the list - i.e. on average, searching a list which is 10 times as big will take 10 times as long. Of course, it depends on whether and where a match is found; if the first item matches in every case, it'll finish in the same time no matter how big the list is :)

To be honest, with only 100 objects it sounds like it's very unlikely to be a bottleneck.

Jon Skeet
+3  A: 

You can easily see how .NET's List(Of T) implements the Find method using Reflector:

Public Function Find(ByVal match As Predicate(Of T)) As T
    If (match Is Nothing) Then
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.match)
    End If
    Dim i As Integer
    For i = 0 To Me._size - 1
        If match.Invoke(Me._items(i)) Then
            Return Me._items(i)
        End If
    Next i
    Return CType(Nothing, T)
End Function

The only difference between the two implementations is that Find invokes the match delegate for each element instead of having that logic inlined in the loop.

Interestingly, this simple performance test:

// Assumes: class Person { public int ID { get; set; } }
var persons = new List<Person>();
for (int i = 0; i < 100; i++)
{
    persons.Add(new Person { ID = i });
}

GC.Collect();
var sw = Stopwatch.StartNew();
for (int i = 0; i < 10000000; i++)
{
    persons.Find(person => person.ID == i % 100);
}
sw.Stop();
Console.WriteLine(sw.Elapsed);

GC.Collect();
sw = Stopwatch.StartNew();
for (int i = 0; i < 10000000; i++)
{
    for (int j = 0; j < 100; j++)
    {
        if (persons[j].ID == i % 100)
        {
            break;
        }
    }
}
sw.Stop();
Console.WriteLine(sw.Elapsed);

Shows that:
Total time required to query the list using Find: 05.7990078 seconds.
Total time required to query the list using the loop: 06.3551074 seconds.

These results seem consistent across several executions.

Edit - found the explanation for Find's advantage:
Find is faster because it accesses the underlying array directly on each iteration. The loop goes through the List indexer, which verifies the index on every access:

Public Default Property Item(ByVal index As Integer) As T
    Get
        If (index >= Me._size) Then
            ThrowHelper.ThrowArgumentOutOfRangeException
        End If
        Return Me._items(index) ' _items is the underlying array.
    End Get
End Property
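To illustrate the point, here is a sketch of a variant of the manual loop (my own, not from the benchmark above) that sidesteps the indexer check by copying the list to an array once and indexing the raw array; the copy only pays off for repeated scans:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class ArrayScanDemo
{
    // Same simple shape as the Person used in the benchmark above.
    class Person { public int ID { get; set; } }

    static void Main()
    {
        var persons = new List<Person>();
        for (int i = 0; i < 100; i++)
        {
            persons.Add(new Person { ID = i });
        }

        // One-time copy; subsequent scans hit the raw array, with no
        // per-access size check by the List indexer.
        Person[] items = persons.ToArray();

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < 10000000; i++)
        {
            for (int j = 0; j < items.Length; j++)
            {
                if (items[j].ID == i % 100)
                {
                    break;
                }
            }
        }
        sw.Stop();
        Console.WriteLine(sw.Elapsed);
    }
}
```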
Elisha
Well well well... So the Predicate does work faster. I wonder why. Btw, is there a reason for those `GC.Collect()`s? I mean, I never use it, and as far as I know it should be used only when we're sure we have significant unmanaged objects in memory. I know this is out of the scope of this question.
Srikanth Venugopalan
@Srikanth, I wonder why too; I expected the loop to be faster, and I'm still trying to find out why. The calls to `GC.Collect()` are there to reduce the chance that the garbage collector runs while we measure the duration. In regular code it's not required :)
Elisha
@Srikanth, edited my answer with an explanation of Find's advantage
Elisha
+1 for discovering the advantage of Find and the reason for it.
Srikanth Venugopalan