tags:

views:

278

answers:

8

Is the limit in the following loop (12332*324234) calculated once or every time the loop runs?

for(int i=0; i<12332*324234;i++)
{
    //Do something!
}
+20  A: 

For this it it calculated once, or more likely 0 times.

The compiler will optimize the multiplication away for you.

However this is not always the case if you have something like.

for(int i=0; i<someFunction();i++)
{
    //Do something!
}

Because the compiler is not always able to see what someFunction will return. So even if someFunction does return a constant value every time, if the compiler doesn't know that, it cannot optimize it.

EDIT: As MainMa said in a comment, you are in this situation you can eliminate the cost by doing something like this:

int limit = someFunction();
for(int i=0; i<limit ;i++)
{
    //Do something!
}

IF you are certain that the value of someFunction() will not change during the loop.

KLee1
True. For example, doing a query to a database in `someFunction()` will be really a bad idea. You can easily avoid this by creating an integer before the loop, assign a value to it, and then use this integer in a loop.
MainMa
+2  A: 

There's two ways to interpret your question:

  • Is the multiplication of 12332*324234 evaluated on each loop
  • Is the expression in the loop condition evaluated on each loop

The answer to those two, different, questions are:

  • No, it is in fact evaluated at compile-time, since it involves two constants
  • Yes, if necessary, they are

In other words:

for (int i = 0; i < someString.Length; i++)

If the evaluation of someString.Length is costly, it will incur a penalty on each loop iteration.

Lasse V. Karlsen
I doubt that String.Length works like strlen. However, 'someString' could change at each iteration loop.
Luca
+7  A: 

Actually that won't compile because it will overflow but if you make it a smaller number and open up Reflector you will find something like this.

for (int i = 0; i < 0x3cf7b0; i++)
{

}
ChaosPandion
Good point about the overflow
KLee1
A: 

As @Chaos noted, it won't compile. But if you use a representable expression (like 100*100), the result will probably be hard-coded. On Mono, the CIL includes:

IL_0007:  ldloc.0
IL_0008:  ldc.i4.1
IL_0009:  add
IL_000a:  stloc.0
IL_000b:  ldloc.0
IL_000c:  ldc.i4 10000
IL_0011:  blt IL_0007

As you can see, 100 * 100 is hard-coded as 10000. However, in general it will be evaluated each time, and if a method or property is called, this probably can't be optimized away.

Matthew Flaschen
A: 

It looks to be calculated every time. Disassembly from VS2008.

0000003b  nop              
            for (Int64 i = 0; i < (Int64)12332 * (Int64)324234; i++)
0000003c  mov         qword ptr [rsp+20h],0 
00000045  jmp         000000000000005E 
            {
00000047  nop              
                bool h = false;
00000048  mov         byte ptr [rsp+28h],0 
            }
0000004d  nop              
            for (Int64 i = 0; i < (Int64)12332 * (Int64)324234; i++)
0000004e  mov         rax,qword ptr [rsp+20h] 
00000053  add         rax,1 
00000059  mov         qword ptr [rsp+20h],rax 
0000005e  xor         ecx,ecx 
00000060  mov         eax,0EE538FB8h 
00000065  cmp         qword ptr [rsp+20h],rax 
0000006a  setl        cl   
0000006d  mov         dword ptr [rsp+2Ch],ecx 
00000071  movzx       eax,byte ptr [rsp+2Ch] 
00000076  mov         byte ptr [rsp+29h],al 
0000007a  movzx       eax,byte ptr [rsp+29h] 
0000007f  test        eax,eax 
00000081  jne         0000000000000047 
Zach Dunton
There is no multiply command in the disassembly.
Vincent McNabb
+3  A: 

This is one of the most commonly misunderstood behavior of loops in C#.

Here's what you need to know:

Loop bounds calculations, if non-constant and involving a variable, property-access, function call, or delegate invocation will re-compute the value of the bounds before each iteration of the loop.

So, for example:

for( int i = 0; i < 1234*1234; i++ ) { ... }

In this case, the expression 1234*1234 is a compile time constant, and as a result will not be recomputed on each iteration. In fact, it is calculated at compile time and replaced with a constant.

However, in this case:

int k = 10;
for( int i = 0; i < k; i++ ) { k -= 1; ... }

The value of k has to be examined on each iteration. After all it can change .. in in this example does. Fortunately, since k is simply a local variable, the cost of accessing it is very low - and in many cases it will either be retained in the local CPU cache or perhaps even maintained in a register (depending on how the JIT processes and emits the machine code).

In the case of something like the following:

IEnumerable<int> sequence = ...;
for( int i = 0; i < sequence.Count(); i++ ) { ... }

The cost of computing the sequence.Count() can be quite expensive. And since it's evaluated on each iteration of the loop it can add up quickly.

The compiler cannot optimize away the calls to methods or properties that occur within the loop bounds expression because they may also change with each iteration. Imagine if the loop above was written as:

IEnumerable<int> sequence = ...;
for( int i = 0; i < sequence.Count(); i++ ) {
    sequence = sequence.Concat( anotherItem );
}

Clearly sequence is changing on each iteration ... and therefore the Count() is likely to be different on each iteration. The compiler does not attempt to perform some static analysis to determine whether the loop bounds expression could be constant ... that would be extremely complicated, if not impossible. Instead, it assumes that if the expression is not a constant it must be evaluated on each iteration.

Now, in most cases, the cost of computing the bounds constraint for a loop is going to be relatively inexpensive, so you don't have to worry about it. But you do need to understand how the compiler treats loop bounds like this. Also, as a developer you need to be careful about using properties or methods that have side-effects as part of a bounds expression - after all, these side effects will occur on each iteration of the loop.

LBushkin
A: 

Yes, comparison value is calculated every loop cycle.

If you absolutely have to use for-loop, which is rarely really necessary anymore, afaik there is only one "good" loop template:

for (int i = first(), last = last(); i != last; ++i)
{
// body
}

Notice the prefix increment too.

Grozz
This is misguided for many reasons. It's not a question of being "necessary." For loops are never necessary, but there's sometimes the cleanest way to express something. Not all for conditions use a method at all. In fact, the one in the question does not. Of those that do, sometimes the method *must* be called each time, and in other cases, the performance difference is negligible. Finally, pre and post-increment makes absolutely no practical difference here.
Matthew Flaschen
I'm pretty sure that developer who uses this template knows when the method has to be called each time and will adjust it.Anyway, it's a good style to use that "for pattern" which was introduced in C++ by Scott Meyers and is still very useful.
Grozz
`++i` and `i++` makes no difference because the compiler will optimize them to be semantically identical.
KLee1
+1  A: 

First of all the for loop in the question will not compile. But lets say it was

            for (int  i = 0; i < 20; i++)
        {
            Console.WriteLine(i);
            i++;

        }

VS

            for (int  i = 0; i < 10*2; i++)
        {
            Console.WriteLine(i);
            i++;

        }

the IL code is exactly same .

   .method private hidebysig static void Main(string[] args) cil managed
{
    .entrypoint
    .maxstack 2
    .locals init (
        [0] int32 i,
        [1] bool CS$4$0000)
    L_0000: nop 
    L_0001: ldc.i4.0 
    L_0002: stloc.0 
    L_0003: br.s L_0016
    L_0005: nop 
    L_0006: ldloc.0 
    L_0007: call void [mscorlib]System.Console::WriteLine(int32)
    L_000c: nop 
    L_000d: ldloc.0 
    L_000e: ldc.i4.1 
    L_000f: add 
    L_0010: stloc.0 
    L_0011: nop 
    L_0012: ldloc.0 
    L_0013: ldc.i4.1 
    L_0014: add 
    L_0015: stloc.0 
    L_0016: ldloc.0 
    L_0017: ldc.i4.s 20
    L_0019: clt 
    L_001b: stloc.1 
    L_001c: ldloc.1 
    L_001d: brtrue.s L_0005
    L_001f: call int32 [mscorlib]System.Console::Read()
    L_0024: pop 
    L_0025: ret 
}

Now even if I replace it with a function in the loop like

class Program
{
    static void Main(string[] args)
    {
        for (int  i = 0; i < Foo(); i++)
        {
            Console.WriteLine(i);
            i++;

        }
        Console.Read();
    }

    private static int Foo()
    {
        return 20;
    }

I get this IL code

    .method private hidebysig static void Main(string[] args) cil managed
{
    .entrypoint
    .maxstack 2
    .locals init (
        [0] int32 i,
        [1] bool CS$4$0000)
    L_0000: nop 
    L_0001: ldc.i4.0 
    L_0002: stloc.0 
    L_0003: br.s L_0016
    L_0005: nop 
    L_0006: ldloc.0 
    L_0007: call void [mscorlib]System.Console::WriteLine(int32)
    L_000c: nop 
    L_000d: ldloc.0 
    L_000e: ldc.i4.1 
    L_000f: add 
    L_0010: stloc.0 
    L_0011: nop 
    L_0012: ldloc.0 
    L_0013: ldc.i4.1 
    L_0014: add 
    L_0015: stloc.0 
    L_0016: ldloc.0 
    L_0017: call int32 TestBedForums.Program::Foo()
    L_001c: clt 
    L_001e: stloc.1 
    L_001f: ldloc.1 
    L_0020: brtrue.s L_0005
    L_0022: call int32 [mscorlib]System.Console::Read()
    L_0027: pop 
    L_0028: ret 
}

which looks same to me.

So to me it looks like there is no difference in FOR LOOP with finite limit , another with calculate limit , and last with limit coming from a function.

So As long as the code copiles , you know what you are doing in a mammoth loop like this and you have enough memory in process i think it will work and produce same perf. (in C#)

Ashwani Roy