List<int> list = ...

for (int i = 0; i < list.Count; ++i)
{
    ...
}

So does the compiler know that list.Count does not have to be called on each iteration?

+20  A: 

Are you sure about that?

List<int> list = new List<int> { 0 };

for (int i = 0; i < list.Count; ++i)
{
    if (i < 100)
    {
        list.Add(i + 1);
    }
}

If the compiler cached the Count property above, the contents of list would be 0 and 1. If it did not, the contents would be the integers from 0 to 100.

Now, that might seem like a contrived example to you, but what about this one?

List<int> list = new List<int>();

int i = 0;
while (list.Count <= 100)
{
    list.Add(i++);
}

It may seem as if these two code snippets are completely different, but that's only because of the way we tend to think about for loops versus while loops. In either case, the value of a variable is checked on every iteration. And in either case, that value very well could change.
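
To make that equivalence concrete, here's a minimal sketch of the first example rewritten as a while loop; the behaviour is identical, including the read of Count before every iteration:

List<int> list = new List<int> { 0 };

int i = 0;
while (i < list.Count)      // Count is re-read before every iteration
{
    if (i < 100)
    {
        list.Add(i + 1);    // changes Count, and the loop condition sees it
    }
    ++i;
}
// list now holds 0 through 100, exactly like the for version above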

Typically it's not safe to assume the compiler optimizes something when the "optimized" and "non-optimized" versions of the same code would actually behave differently.

Dan Tao
This makes me miss the good old days of C++ `const`.
Steven Sudit
+1 for the nice example :)
Reed Copsey
IIRC, the first example would not compile; you can't modify a list while iterating over it.
Neil N
@Neil, that would be enumerating rather than accessing by index?
Phil
@Neil: That's the mistake the OP made, thinking of a `for` loop as the same as iterating. It might always be *used* that way in practice, but that's not really what it *is* by definition. It's essentially a `while` loop. You're probably thinking of `foreach`, which *is* by definition an enumeration. That will throw an exception because `List<T>.Enumerator.MoveNext` throws an `InvalidOperationException` if the list has been modified between calls. Anyway, the `for` loop above does indeed compile; try it out yourself.
Dan Tao
The example definitely works and compiles with no warnings.
AlfredBr
I was probably thinking of foreach.
Neil N
@Neil: Yeah, totally understandable. By the way, though, even that scenario (an exception being thrown if a `List<T>` is modified from within a `foreach`) is specific to the `List<T>.Enumerator` type; it isn't an inherent feature of all collections. (Probably just about all the collections in the BCL, maybe, but not *all* collections.) An `IEnumerator<T>` implementation has to be specifically coded to throw an exception on `MoveNext`; if it isn't, there's nothing inherently illegal about modifying a collection while enumerating over it. And either way, the code would definitely *compile*.
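
A minimal sketch of that List<T> scenario, for reference:

var numbers = new List<int> { 1, 2, 3 };

foreach (int n in numbers)
{
    numbers.Add(n);   // List<T>.Enumerator.MoveNext throws InvalidOperationException
}                     // on the next iteration because the list was modified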
Dan Tao
@Karmastan: What makes you so sure Vitaliy's `Count` is constant? He supplied no code inside the loop! That was actually my point: the `List<T>.Count` property *may* change; it may not. The examples provided in the article cited by CRMay all involve arrays, which are *guaranteed* to be of constant length. Yes, I suppose it's possible that with a considerable amount of wizardry the compiler could detect whether a `List<T>` were modified from within a `for` loop, but I haven't seen that documented. The main point I wanted to make to the OP was that this "optimization" is not always appropriate.
Dan Tao
+8  A: 

The C# compiler does not do any optimizations like this. The JIT compiler, however, I believe optimizes this for arrays (which are not resizable), but not for lists.

A List's Count property can change within the loop, so caching it would be an incorrect optimization.
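
For comparison, a rough sketch of the array case: an array's Length can never change, so the JIT is free to treat the bound as constant (and typically drops the redundant bounds check too), whereas List<int>.Count carries no such guarantee.

int[] values = new int[100];

// values.Length is fixed for the lifetime of the array, so the JIT can
// read it once; nothing comparable can be assumed about List<int>.Count.
for (int i = 0; i < values.Length; i++)
{
    values[i] = i;
}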

Reed Copsey
A: 

No, it doesn't, because the condition is evaluated on each iteration. It can be more complex than just a comparison with Count; any boolean expression is allowed:

for (int i = 0; new Random().NextDouble() < .5d; i++)
    Console.WriteLine(i);

http://msdn.microsoft.com/en-us/library/aa664753(VS.71).aspx

STO
A: 

It depends on the particular implementation of Count; I've never noticed any performance issues with using the Count property on a List, so I assume it's fine.

In this case you can save yourself some typing with a foreach.

List<int> list = new List<int> { 0 };
foreach (int item in list)
{
    // ...
}
Phil
Unless you use the index `i` somewhere in the loop body. Something as simple as adding one to every element can't be done with `foreach`.
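For example, something like this quick sketch needs the index and has no foreach equivalent, because the iteration variable is read-only:

List<int> list = new List<int> { 1, 2, 3 };

for (int i = 0; i < list.Count; i++)
{
    list[i] += 1;   // 'item += 1' inside a foreach would not even compile
}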
Ben Voigt
The OP doesn't mention adding something to every element?
Phil
+1  A: 

For all the other commenters who say that the 'Count' property could change in a loop body: JIT optimizations let you take advantage of the actual code that's running, not the worst-case of what might happen. In general, the Count could change. But it doesn't in all code.

So in the poster's example (which might not contain any Count-changing code), is it unreasonable to expect the JIT to detect that the code in the loop doesn't change whatever internal variable List uses to hold its length? If it detects that list.Count is constant, wouldn't it lift that variable access out of the loop body?

I don't know if the JIT does this or not. But I am not so quick to brush this problem off as trivially "never."
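
In other words, the hypothetical optimization would amount to something like this (a sketch of the transformation, not a claim that the JIT actually performs it):

// What lifting the access out of the loop would effectively look like:
// Count is read once, because the loop body provably never resizes the list.
int count = list.Count;
for (int i = 0; i < count; i++)
{
    Console.WriteLine(list[i]);   // no calls here that could change Count
}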

Karmastan
This could be arbitrarily difficult, as you may have a reference to the list on some other thread, which is then modified while you're in the loop on the current thread. Granted, this would lead to a whole host of other problems, but the core issue is that the list is mutable and therefore you can't guarantee the state of Count from one iteration to another. This is true even if you ignore threading (should the jitter examine every method call recursively to check for side effects?)
Dan Bryant
Yes, it can be difficult to determine. The point of my post is that *sometimes*, it can be easy. And when it's easy, it might get optimized. As for threading, if the List isn't protected by a lock when you do the traversal, even the unoptimized version is buggy.
Karmastan
No one said "never." The OP asked: "does the compiler know the list.Count does not have to be called each iteration?" to which I responded, "Are you sure?" It seemed to me the OP was taking for granted that the `Count` property *wouldn't* change; my intent was to challenge that assumption.
Dan Tao
+3  A: 

If you take a look at the IL generated for Dan Tao's example, you'll see a line like this at the condition of the loop:

callvirt instance int32 [mscorlib]System.Collections.Generic.List`1<int32>::get_Count()

This is undeniable proof that Count (i.e., get_Count()) is called on every iteration of the loop.
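
For reference, this is the kind of loop whose condition compiles to that instruction (a sketch; any for loop over a List<T>'s Count produces the same pattern):

List<int> list = new List<int> { 1, 2, 3 };

// The i < list.Count test below is what compiles to the callvirt to
// get_Count() shown above; that call then runs on every iteration.
for (int i = 0; i < list.Count; ++i)
{
    Console.WriteLine(list[i]);
}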

AlfredBr
My x86 assembly skills are extremely rusty, but an investigation of the JIT output shows a 'call FFFFFFFFF6BA91F0', which might indicate a call to get_Count().
AlfredBr
No, the JIT compiler turns that into a direct CPU register load. It is called "inlining". Small property getters are always inlined in the Release build. You have to take a look at the optimized machine code to see it.
Hans Passant
Right, the IL isn't the final word on what actually happens.
Ben Voigt
+3  A: 

It's worth noting, as nobody else has mentioned it, that there is no way of knowing, just from looking at a loop like this, what the "Count" property will actually do, or what side effects it may have.

Consider the following cases:

  • A third-party implementation of a property called "Count" could execute any code it wished to; it could return a random number for all we know. With List we can be a bit more confident about how it will operate, but how is the JIT to tell these implementations apart?

  • Any method call within the loop could potentially alter the return value of Count (not just a straight "Add" directly on the collection; a user method that is called in the loop might also party on the collection). A sketch of this case follows below.

  • Any other thread that happens to be executing concurrently could also change the Count value.

The JIT just can't "know" that Count is constant.
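
To illustrate the second bullet, here is a minimal sketch (the method names are invented purely for illustration) in which the loop body never touches the list directly, yet Count still changes:

static void Process(List<int> list, int value)
{
    if (list.Count < 10)
    {
        list.Add(value);   // mutates the very collection the caller is looping over
    }
}

static void Main()
{
    var list = new List<int> { 1, 2, 3 };

    for (int i = 0; i < list.Count; i++)
    {
        Process(list, list[i]);   // looks harmless, but Count grows underneath the loop
    }
}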

However, the JIT compiler can make the code run much more efficiently by inlining the implementation of the Count property (as long as it is a trivial implementation). In your example it may well be inlined down to a simple test of a variable value, avoiding the overhead of a function call on each iteration, and thus making the final code nice and fast. (Note: I don't know if the JIT will do this, just that it could. I don't really care - see the last sentence of my answer to find out why)

But even with inlining, the value may still be changed between iterations of the loop, so it would still need to be read from RAM for each comparison. If you were to copy Count into a local variable and the JIT could determine by looking at the code in the loop that the local variable will remain constant for the loop's lifetime, then it may be able to further optimise it (e.g. by holding the constant value in a register rather than having to read it from RAM on each iteration). So if you (as a programmer) know that Count will be constant for the lifetime of the loop, you may be able to help the JIT by caching Count in a local variable. This gives the JIT the best chance of optimising the loop. (But there are no guarantees that the JIT will actually apply this optimisation, so it may make no difference to the execution times to manually "optimise" this way. You also risk things going wrong if your assumption (that Count is constant) is incorrect. Or your code may break if another programmer edits the contents of the loop so that Count is no longer constant, and he doesn't spot your cleverness)
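
A sketch of that manual caching, with all the caveats above about whether it actually helps and about what happens if Count stops being constant:

// Caching Count in a local, as described above. This only stays correct
// if nothing inside the loop (or on another thread) resizes the list.
int count = list.Count;
for (int i = 0; i < count; ++i)
{
    Console.WriteLine(list[i]);
}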

So the moral of the story is: the JIT can make a pretty good stab at optimising this case by inlining. Even if it doesn't do this now, it may do so in a future version of the runtime. You might not gain any advantage by manually "optimising" the code, and you risk changing its behaviour and thus breaking it, or at least making future maintenance of your code more risky, or possibly losing out on future JIT enhancements. So the best approach is to just write it the way you have, and optimise it when your profiler tells you that the loop is your performance bottleneck.

Hence, IMHO it's interesting to consider/understand cases like this, but ultimately you don't actually need to know. A little bit of knowledge can be a dangerous thing. Just let the JIT do its thing, and then profile the result to see if it needs improving.

Jason Williams