Hi,

I have always wondered whether, in general, declaring a throw-away variable before a loop, as opposed to repeatedly inside the loop, makes any (performance) difference. A (quite pointless) example in Java:

a) declaration before loop:

double intermediateResult;
for(int i=0;i<1000;i++){
    intermediateResult = i;
    System.out.println(intermediateResult);
}

b) declaration (repeatedly) inside loop:

for(int i=0;i<1000;i++){
    double intermediateResult = i;
    System.out.println(intermediateResult);
}

Which one is better, a or b?

I suspect that repeated variable declaration (example b) creates more overhead in theory, but that compilers are smart enough that it doesn't matter. Example b has the advantage of being more compact and limiting the scope of the variable to where it is used. Still, I tend to code according to example a...

Edit: I am especially interested in the Java case.

+1  A: 

I think it depends on the compiler and is hard to give a general answer.

SquidScareMe
+5  A: 

Language dependent. IIRC, C# optimises this so there is no difference, but JS (for example) will do the whole memory allocation shebang each time.

annakata
A: 

Even if I know my compiler is smart enough, I wouldn't like to rely on it, and I will use the a) variant.

The b) variant makes sense to me only if you desperately need to make intermediateResult unavailable after the loop body. But I can't imagine such a desperate situation anyway...

EDIT: Jon Skeet made a very good point, showing that variable declaration inside a loop can make an actual semantic difference.

Abgan
A: 

I suspect a few compilers could optimize both to be the same code, but certainly not all. So I'd say you're better off with the former. The only reason for the latter is if you want to ensure that the declared variable is used only within your loop.

Stew S
+1  A: 

As a general rule, I declare my variables in the inner-most possible scope. So, if you're not using intermediateResult outside of the loop, then I'd go with B.

Christopher
+14  A: 

It depends on the language and the exact use. For instance, in C# 1 it made no difference. In C# 2, if the local variable is captured by an anonymous method (or a lambda expression in C# 3), it can make a very significant difference.

Example:

using System;
using System.Collections.Generic;

class Test
{
    static void Main()
    {
        List<Action> actions = new List<Action>();

        int outer;
        for (int i=0; i < 10; i++)
        {
            outer = i;
            int inner = i;
            actions.Add(() => Console.WriteLine("Inner={0}, Outer={1}", inner, outer));
        }

        foreach (Action action in actions)
        {
            action();
        }
    }
}

Output:

Inner=0, Outer=9
Inner=1, Outer=9
Inner=2, Outer=9
Inner=3, Outer=9
Inner=4, Outer=9
Inner=5, Outer=9
Inner=6, Outer=9
Inner=7, Outer=9
Inner=8, Outer=9
Inner=9, Outer=9

The difference is that all of the actions capture the same outer variable, but each has its own separate inner variable.
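For readers on the Java side of the question, here is a rough analogue, as a sketch only. Java lambdas may capture only effectively final locals, so a variable declared before the loop and reassigned inside it cannot be captured directly, while a per-iteration variable can. The class name `CaptureDemo` and the one-element array standing in for the shared `outer` variable are illustrative assumptions, not part of the C# example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class CaptureDemo {
    // Builds one supplier per iteration and returns what each reports when run later.
    static List<String> run() {
        List<Supplier<String>> actions = new ArrayList<>();

        // A one-element array plays the role of the shared "outer" variable:
        // the array reference is effectively final, its content is not.
        int[] outer = new int[1];
        for (int i = 0; i < 3; i++) {
            outer[0] = i;
            int inner = i; // a fresh, effectively final variable per iteration
            actions.add(() -> "Inner=" + inner + ", Outer=" + outer[0]);
        }

        List<String> results = new ArrayList<>();
        for (Supplier<String> action : actions) {
            results.add(action.get());
        }
        return results;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Each supplier keeps its own `inner`, but all of them read the final content of the shared array, which mirrors the C# output.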

Jon Skeet
+2  A: 

In my opinion, b is the better structure. In a, the last value of intermediateResult sticks around after your loop is finished.

Edit: This doesn't make a lot of difference with value types, but reference types can be somewhat weighty. Personally, I like variables to be dereferenced as soon as possible for cleanup, and b does that for you.

R. Bemrose
+21  A: 

Which is better, a or b?

From a performance perspective, you'd have to measure it. (And in my opinion, if you can measure a difference, the compiler isn't very good).

From a maintenance perspective, b is better. Declare and initialize variables in the same place, in the narrowest scope possible. Don't leave a gaping hole between the declaration and the initialization, and don't pollute namespaces you don't need to.
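A minimal sketch of the maintenance argument, with hypothetical names (`ScopeDemo`, `declaredOutside`, `declaredInside`). When the variable is declared before the loop, a value left over from an earlier iteration can silently carry into a later one. Declared inside the loop, it has to be initialized afresh on every pass.

```java
import java.util.ArrayList;
import java.util.List;

class ScopeDemo {
    // Variant a: one variable for the whole loop; an empty input reuses the old value.
    static List<Integer> declaredOutside(String[] inputs) {
        List<Integer> lengths = new ArrayList<>();
        int value = 0;
        for (String input : inputs) {
            if (!input.isEmpty()) {
                value = input.length();
            }
            lengths.add(value); // may be the previous iteration's value
        }
        return lengths;
    }

    // Variant b: the variable is declared and initialized per iteration.
    static List<Integer> declaredInside(String[] inputs) {
        List<Integer> lengths = new ArrayList<>();
        for (String input : inputs) {
            int value = 0;
            if (!input.isEmpty()) {
                value = input.length();
            }
            lengths.add(value); // always belongs to this iteration
        }
        return lengths;
    }

    public static void main(String[] args) {
        String[] inputs = {"ab", "", "xyz", ""};
        System.out.println(declaredOutside(inputs)); // [2, 2, 3, 3]
        System.out.println(declaredInside(inputs));  // [2, 0, 3, 0]
    }
}
```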

Daniel Earwicker
+3  A: 

I would always use A (rather than relying on the compiler) and might also rewrite to:

// Java allows only one declared type in a for initializer, so both variables are doubles here.
for(double i=0, intermediateResult=0; i<1000; i++){
    intermediateResult = i;
    System.out.println(intermediateResult);
}

This still restricts intermediateResult to the loop's scope, but doesn't redeclare during each iteration.

Triptych
Do you conceptually want the variable to live for the duration of the loop instead of separately per iteration? I rarely do. Write code which reveals your intention as clearly as possible, unless you've got a very, very good reason to do otherwise.
Jon Skeet
Ah, nice compromise, I never thought of this! (IMO, the code does become a bit less visually 'clear', though.)
Rabarberski
@Jon - I have no idea what the OP is actually doing with the intermediate value. Just thought it was an option worth considering.
Triptych
+2  A: 

This is a gotcha in VB.NET. VB won't reinitialize the variable in this example:

For i as Integer = 1 to 100
  Dim j as Integer
  Console.WriteLine(j)
  j = i
Next

' output: 0 1 2 3 4...

This will print 0 the first time (VB variables have default values when declared!) but the value assigned in the previous iteration each time after that.

If you add a = 0, though, you get what you might expect:

For i as Integer = 1 to 100
  Dim j as Integer = 0
  Console.WriteLine(j)
  j = i
Next

'output: 0 0 0 0 0...
Michael Haren
I've been using VB.NET for years and hadn't come across this!!
ChrisA
Yes, it's unpleasant to figure this out in practice.
Michael Haren
Here is a reference about this from Paul Vick: http://www.panopticoncentral.net/archive/2006/03/28/11552.aspx
ferventcoder
+6  A: 

Well, I ran your A and B examples 20 times each, looping 100 million times (JVM 1.5.0).

A: average execution time: .074 sec

B: average execution time : .067 sec

To my surprise, B was slightly faster. As fast as computers are now, it's hard to say whether you could measure this accurately. I would code it the A way as well, but I would say it doesn't really matter.
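The benchmark code itself was not posted, so the following is only a sketch of how such a comparison might be set up, using `System.nanoTime` and a hypothetical class name `LoopTiming`. Both loops accumulate into a sum (instead of printing) so that the JIT cannot discard the work entirely. Naive timings like this are noisy, and a harness such as JMH is the better choice for real conclusions.

```java
class LoopTiming {
    // Variant a: the temporary is declared once, before the loop.
    static double declaredBefore(int n) {
        double sum = 0;
        double intermediateResult;
        for (int i = 0; i < n; i++) {
            intermediateResult = i;
            sum += intermediateResult;
        }
        return sum;
    }

    // Variant b: the temporary is declared inside the loop body.
    static double declaredInside(int n) {
        double sum = 0;
        for (int i = 0; i < n; i++) {
            double intermediateResult = i;
            sum += intermediateResult;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 100_000_000;

        // Warm-up runs so both methods are likely JIT-compiled before timing.
        declaredBefore(n);
        declaredInside(n);

        long t0 = System.nanoTime();
        double a = declaredBefore(n);
        long t1 = System.nanoTime();
        double b = declaredInside(n);
        long t2 = System.nanoTime();

        System.out.printf("a: %.3f s (sum=%.0f)%n", (t1 - t0) / 1e9, a);
        System.out.printf("b: %.3f s (sum=%.0f)%n", (t2 - t1) / 1e9, b);
    }
}
```

Running `javap -c LoopTiming` on the compiled class is one way to compare the bytecode that javac emits for the two methods.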

Mark Robinson
You beat me to it; I was just about to post my profiling results. I got more or less the same, and yes, surprisingly B is faster. I really would have bet on A.
Mark Davidson
OK, cool. Yeah, I only looked at execution time; as R. Bemrose pointed out, in A the variable sticks around after the loop has completed. Did your profiling results tell you anything about memory usage?
Mark Robinson
Not much of a surprise: when the variable is local to the loop, it does not need to be preserved after each iteration, so it can stay in a register.
Arkadiy
+1 for **actually testing it**, not just an opinion/theory the OP could have made up himself.
MGOwen
+2  A: 

A co-worker prefers the first form, saying it is an optimization because it re-uses a declaration.

I prefer the second one (and try to persuade my co-worker! ;-)), having read that:

  • It reduces scope of variables to where they are needed, which is a good thing.
  • Java optimizes enough to make no significant difference in performance. IIRC, perhaps the second form is even faster.

Anyway, it falls into the category of premature optimization that relies on the quality of the compiler and/or JVM.

PhiLho
+5  A: 

The following is what I wrote and compiled in .NET:

double r0;
for (int i = 0; i < 1000; i++) {
    r0 = i*i;
    Console.WriteLine(r0);
}

for (int j = 0; j < 1000; j++) {
    double r1 = j*j;
    Console.WriteLine(r1);
}

This is what I get from Reflector when the IL is rendered back into code:

for (int i = 0; i < 0x3e8; i++)
{
    double r0 = i * i;
    Console.WriteLine(r0);
}
for (int j = 0; j < 0x3e8; j++)
{
    double r1 = j * j;
    Console.WriteLine(r1);
}

So both look exactly the same after compilation. In managed languages, code is compiled to IL/bytecode and converted to machine code at execution time. At the machine level the double may not even be created on the stack; it may just live in a register, since the code shows it is only a temporary for the WriteLine call. There is a whole set of optimization rules just for loops, so the average developer shouldn't worry about this, especially in managed languages.

There are cases where you can optimize managed code, e.g. if you have to concatenate a large number of strings, using string a; a += anotherstring[i] versus using StringBuilder. There is a very big difference in performance between the two. There are a lot of such cases where the compiler cannot optimize your code because it cannot figure out what is intended in the bigger scope, but it can pretty much optimize the basic things for you.
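The string concatenation case mentioned above, sketched in Java rather than .NET, since Java has a `StringBuilder` class as well. `ConcatDemo` and its method names are illustrative. Repeated `+=` builds a new string on every iteration, copying everything accumulated so far, while `StringBuilder` appends into a growable buffer.

```java
class ConcatDemo {
    // Each += allocates a new String containing the old content plus the new part.
    static String withPlus(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    // StringBuilder appends into one internal buffer and builds the String once.
    static String withBuilder(String[] parts) {
        StringBuilder builder = new StringBuilder();
        for (String part : parts) {
            builder.append(part);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        // Both approaches produce the same text; only the cost differs.
        System.out.println(withPlus(parts).equals(withBuilder(parts)));
    }
}
```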

affan
A: 

I'm about a year late to the discussion, and as I've read in most threads, there isn't a definite answer. But I've always thought that if you declare your variables inside of your loop then you're wasting memory. If you have something like this:

for(;;) {
  Object o = new Object();
}

Then not only does the object need to be created for each iteration, but there needs to be a new reference allocated for each object. It seems that if the garbage collector is slow then you'll have a bunch of dangling references that need to be cleaned up.

However, if you have this:

Object o;
for(;;) {
  o = new Object();
}

Then you're only creating a single reference and assigning a new object to it each time. Sure, it might take a bit longer for it to go out of scope, but then there's only one dangling reference to deal with.

What am I missing?

R. Carr
A: 

I appreciate the conversation, however, it appears you all seem to be off-track which is leading to your misconceptions and surprise as to method B being faster. The reason B is faster is because the compiler doesn't have to be "smart" to optimize the code, as the local loop variable is readily converted into register mathematics. You'll find even better results using a native register sized value (float/int). While compilers have come a long way, it is important that we as programmers stay on top of our game and don't force compilers to work hard to interpret what we meant. If you only need the variable within the loop, declare it local to the loop. Also, only use double-precision (i.e. double your OS register size) when it is actually needed. Sloppy coding is the bane of optimization.

D. Mergens
A: 

A) is a safer bet than B). Imagine you are initializing a structure in the loop rather than an 'int' or 'float'; then what?

For example:

typedef struct loop_example {
    JXTZ hi; // where JXTZ could be another type, say from a closed-source lib you include in the Makefile
} loop_example_struct;

// then....

int j = 0; // declare here or face a C99 error if declared in the loop (depends on compiler settings)

for (;; j++) {
    loop_example_struct loop_object; // guess the result in memory heap?
}

You are certainly bound to face problems with memory leaks! Hence I believe 'A' is the safer bet, while 'B' is vulnerable to memory accumulation, especially when working with closed-source libraries. You can check using the 'Valgrind' tool on Linux, specifically its 'Memcheck' sub-tool.

virgoptrex