views:

653

answers:

4

List1 in the following example is a SortedList(Of MyClass) and contains 251 members.

The first two codeblocks execute in 15.5 seconds.

1:

 For cnt As Integer = 1 To 1000000
        For Each TempDE In List1
            Dim F As String = TempDE.Key
            TempDE.Value.x1 = 444
        Next
    Next

2:

    For cnt As Integer = 1 To 1000000
        For Each TempDE As KeyValuePair(Of String, phatob) In List2
            Dim F As String = TempDE.Key
            TempDE.Value.x1 = 444
        Next
    Next

This one executes in 5.6 seconds:

    For cnt As Integer = 0 To 999999
        For cnt2 As Integer = 0 To 250
            Dim F As String = List1.Keys(cnt2)
            List1.Values(cnt2).x1 = 444
        Next

    Next

Why are the first two codeblocks so much slower?

Thanks.

Edit: modified line 1 to read 251 members

+3  A: 

I tried to look around for some documentation on exactly how For Each behaves, but I couldn't find it.

My theory is that using the For Each statements copies the object in the list to another spot in memory and then copies it back into the list when each iteration of the loop ends.

Another possibility is that it calls the constructor at the start of every iteration and then deconstructs and calls the constructor again to reset for the next iteration.

I'm not sure on either of these theories, but the major difference between 3 and (1 or 2) is the lack of For Each.

EDIT: Found some documentation at MSDN.

Here's an excerpt:

When execution of the For Each...Next loop begins, Visual Basic verifies that group refers to a valid collection object. If not, it throws an exception. Otherwise, it calls the MoveNext method and the Current property of the enumerator object to return the first element. If MoveNext indicates that there is no next element, that is, if the collection is empty, then the For Each loop terminates and control passes to the statement following the Next statement. Otherwise, Visual Basic sets element to the first element and runs the statement block.

So overall it sounds like For Each is more "managed" and does a lot of overhead to make sure everything matches up. As a result, it's slower.

Zwergner
+3  A: 

I would think the compiler can better optimize block 3 because of the fixed loop range. In blocks one and 2 the compiler won't know what the upper limit of the loop is until it evaluates the List thus making it slower.

Ruel Waite
Definitely a valid point. I think it contributes, but maybe not the entire problem.
Zwergner
+2  A: 

My random guess: List1 contains ~750 elements (and not just 250). Your third case is faster, because it does not iterate over each element which List1 has.

I'm now leaning towards this as the correct answer, as my own tests show `For Each` being about the same speed as regular iteration. To further support this claim, you are iterating over 251 elements (0 to 250) without error, so the list does not contain the number of elements you think it does.
Zwergner
+3  A: 

The SortedList extends a Collection by implementing IComparer to provide the sort functionality. Internally, it implements 2 arrays to store the elements of the list - one array for the key and one for the values. The .NET Array is optimised for fast in-order and fast random access.

My suspicion why the first 2 are slow is because the foreach statement in a SortedList is a wrapper around the Enumerator. Calling foreach will query for the enumerator, call MoveNext and Current. In addition, going though the generic list can potentially involve boxing and unboxing as you traverse the list, and can create performance overhead that you would not normally get by accessing by Index.

Ash M
This is a better version of where I was trying to get with my answer :)
Zwergner