ansaurus

Question

Is there any overhead to declaring a variable within a loop? (C++)

Answer 1

+5 A:

Most modern compilers will optimize this for you. That being said I would use your first example as I find it more readable.

Andrew Hare 2009-06-11 19:01:58

I don't really count it as an optimization. Since they are local variables, the stack space is just allocated at the beginning of the function. There's no real "creation" involved to harm performance (unless a constructor is being called, which is completely another story).

Mehrdad Afshari 2009-06-11 19:06:33

You are right, "optimization" is the wrong word but I am at a loss for a better one.

Andrew Hare 2009-06-11 19:11:16

problem is that such an optimizer will use live range analysis, and both variables are rather dead.

MSalters 2009-06-12 15:17:57

Answer 2

+4 A:

These days it is better to declare it inside the loop unless it is a constant as the compiler will be able to better optimize the code (reducing variable scope).

Joshua 2009-06-11 19:02:28

I doubt it will affect optimization -- if the compiler performs any sort of data flow analysis, it can figure out that it's not being modified outside the loop, so it should produce the same optimized code in both cases.

Adam Rosenfield 2009-06-11 19:07:00

It won't figure it out if you have two different loops using the same temp variable name though.

Joshua 2009-06-11 19:26:50

Answer 3

A:

The only way to be sure is to time them. But the difference, if there is one, will be microscopic, so you will need a mighty big timing loop.

More to the point, the first one is better style because it initializes the variable var, while the other one leaves it uninitialized. This and the guideline that one should define variables as near to their point of use as possible, means that the first form should normally be preferred.

anon 2009-06-11 19:03:45

Wow Neil. Working on a cell phone w/o spell checking?

crashmstr 2009-06-11 19:06:17

lol Neil you've been making typo-s galore lately :P

GMan 2009-06-11 19:06:37

To be fair, British English spells a number of words with -ise which would be spelled -ize in American English.

Adam Rosenfield 2009-06-11 19:08:30

it was more than that, plus I probably wouldn't have noticed if it was.

crashmstr 2009-06-11 19:10:00

Thanks for the corrections, but in future, please don't change my English spelling to US spelling.

anon 2009-06-11 19:10:42

Wow. Are we debritifying now? :-)

Nosredna 2009-06-11 20:27:54

Hey hey hey, my spell checker didn't have the English spellings so in my defense... well I don't have a defense. :|

GMan 2009-06-11 20:37:03

Do you have a defence? :-)

Nosredna 2009-06-12 00:51:32

"The only way to be sure is to time them." -1 untrue. Sorry, but another post proved this wrong by comparing the generated machine language and finding it essentially identical. I don't have any problem with your answer in general, but isn't being wrong what the -1 is for?

Bill K 2009-06-12 00:58:25

Examining emitted code is certainly useful, and in a simple case like this may be sufficient. However, in more complex cases issues such as locality of reference rear their head, and these can only be tested by timing the execution.

anon 2009-06-12 07:03:02

Answer 4

+29 A:

Stack space for local variables is usually allocated in function scope. So no stack pointer adjustment happens inside the loop, just assigning 4 to var. Therefore these two snippets have the same overhead.

laalto 2009-06-11 19:03:52

+1 This is the most correct answer so far...

Zifre 2009-06-11 19:09:12

I wish those guys who teach at our college at least knew this basic thing. Once he laughed at me for declaring a variable inside a loop, and I was wondering what was wrong until he cited performance as the reason not to do so, and I was like "WTF!?".

Mehrdad Afshari 2009-06-11 19:35:57

Are you sure you should be talking about stack space right away? A variable like this could also be in a register.

toto 2009-06-25 11:46:39

Answer 5

+19 A:

For primitive types and POD types, it makes no difference. The compiler will allocate the stack space for the variable at the beginning of the function and deallocate it when the function returns in both cases.

For non-POD class types that have non-trivial constructors, it WILL make a difference -- in that case, putting the variable outside the loop will only call the constructor and destructor once and the assignment operator each iteration, whereas putting it inside the loop will call the constructor and destructor for every iteration of the loop. Depending on what the class' constructor, destructor, and assignment operator do, this may or may not be desirable.

Adam Rosenfield 2009-06-11 19:05:10

This doesn't seem true. If you use a nonPOD type and assign it in the loop over and over again as in his example, that assignment to a new variable is calling the constructor over and over again, too.

Brian 2009-06-11 19:16:47

Correct idea, wrong reason. Variable outside the loop: constructed once, destroyed once, but the assignment operator is applied every iteration. Variable inside the loop: constructor/destructor applied every iteration, but zero assignment operations.

Martin York 2009-06-11 19:34:59

This is the best answer but these comments are confusing. There's a big difference between calling a constructor and an assignment operator.

Andrew Grant 2009-06-11 19:50:02

It *is* true if the loop body does the assignment anyway, not just for initialization. And if there's just a body-independent/constant initialization, the optimizer can hoist it.

peterchen 2009-06-11 19:59:42

@Andrew Grant: Why? The assignment operator is usually defined as copy construct into tmp followed by swap (to be exception safe) followed by destroy tmp. Thus the assignment operator is not that different from the construction/destroy cycle above. See http://stackoverflow.com/questions/255612/c-dynamically-allocating-an-array-of-objects/255744#255744 for an example of a typical assignment operator.

Martin York 2009-06-12 08:32:41

Answer 6

A:

That's not true: there is overhead, but it's negligible overhead.

Even though they will probably end up at the same place on the stack, it still assigns it. It will assign a memory location on the stack for that int and then free it at the closing }. Not free in the heap sense, but in the sense that it will move sp (the stack pointer) by 1. And in your case, considering it has only one local variable, it will simply equate fp (the frame pointer) and sp.

Short answer would be: DON'T CARE, EITHER WAY WORKS ALMOST THE SAME.

But try reading more on how the stack is organized. My undergrad school had pretty good lectures on that. If you want to read more, check here: http://www.cs.utk.edu/~plank/plank/classes/cs360/360/notes/Assembler1/lecture.html

grobartn 2009-06-11 19:24:44

Again, -1 untrue. Read the post that looked at the assembly.

Bill K 2009-06-12 00:59:02

Nope, you are wrong. Look at the assembler code generated with that code.

grobartn 2009-06-12 13:10:04

Answer 7

+10 A:

They are both the same, and here's how you can find out, by looking at what the compiler does (even without optimisation set to high):

Look at what the compiler (gcc 4.0) does to your simple examples:

1.c:

int main() { int var; int i = 0; while (i < 100) { var = 4; } }

gcc -S 1.c

1.s:

_main:
    pushl %ebp
    movl %esp, %ebp
    subl $24, %esp
    movl $0, -16(%ebp)
    jmp L2
L3:
    movl $4, -12(%ebp)
L2:
    cmpl $99, -16(%ebp)
    jle L3
    leave
    ret

2.c

int main() { int i = 0; while (i < 100) { int var = 4; } }

gcc -S 2.c

2.s:

_main:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp
        movl    $0, -16(%ebp)
        jmp     L2
L3:
        movl    $4, -12(%ebp)
L2:
        cmpl    $99, -16(%ebp)
        jle     L3
        leave
        ret

From these, you can see three things: firstly, the code is the same in both.

Secondly, the storage for var is allocated outside the loop:

         subl    $24, %esp

And finally the only thing in the loop is the assignment and condition check:

L3:
        movl    $4, -12(%ebp)
L2:
        cmpl    $99, -16(%ebp)
        jle     L3

Which is about as efficient as you can be without removing the loop entirely.

Alex Brown 2009-06-11 19:55:47

"Which is about as efficient as you can be without removing the loop entirely" Not quite. Partially unrolling the loop (doing it say 4 times per pass) would speed it up dramatically. There are probably many other ways to optimize... although most modern compilers would probably realize that there's no point in looping at all. If 'i' was used later, it'd simply set 'i' = 100.

darron 2009-09-27 22:18:06

That's assuming the code is changed to increment 'i' at all... as is, it's just a forever loop.

darron 2009-09-27 22:20:13

Answer 8

+2 A:

Both loops have the same efficiency. They will both take an infinite amount of time :) It may be a good idea to increment i inside the loops.

Larry Watanabe 2009-06-11 20:27:15

Yes, stack overflow is always the most correct answer!

Even Mien 2009-06-11 20:45:37

Ah yes, I forgot to address space efficiency - that's ok - 2 ints for both. It just seems odd to me that programmers are missing the forest for the trees -- all these suggestions about some code that doesn't terminate.

Larry Watanabe 2009-06-11 21:08:17

It's OK if they don't terminate. Neither one of them is called. :-)

Nosredna 2009-06-12 00:53:50

Answer 9

+1 A:

For a built-in type there will likely be no difference between the 2 styles (probably right down to the generated code).

However, if the variable is a class with a non-trivial constructor/destructor there could well be a major difference in runtime cost. I'd generally scope the variable to inside the loop (to keep the scope as small as possible), but if that turns out to have a perf impact I'd look to moving the class variable outside the loop's scope. However, doing that needs some additional analysis as the semantics of the code path may change, so this can only be done if the semantics permit it.

An RAII class might need this behavior. For example, a class that manages file access lifetime might need to be created and destroyed on each loop iteration to manage the file access properly.

Suppose you have a LockMgr class that acquires a critical section when it's constructed and releases it when destroyed:

while (i< 100) {
    LockMgr lock( myCriticalSection); // acquires a critical section at start of
                                      //    each loop iteration

    // do stuff...

}   // critical section is released at end of each loop iteration

is quite different from:

LockMgr lock( myCriticalSection);
while (i< 100) {

    // do stuff...

}

Michael Burr 2009-06-12 00:41:58

Answer 10

A:

This is not an answer, but the site does not allow me to add a comment yet, so I am posting it this way.

@laalto, @Alex: Do you mean that all variables inside a function are allocated space at the start of the function, and does that imply that all variables used inside the function actually occupy memory until the end of the function? E.g., in the following function:

void MyFunction(void) {
 int a;
 ...
 {
  int b;
  ...
 }
 ...
 {
   int c[50];
   ...
 }
 ...
 int d;
 ...
}

1) Do you mean that a, b, c and d are all allocated space at the start of the function and the memory is not deallocated until the end of the function?

2) Why can't it allocate only when it's needed and deallocate when it's out of scope?

Thanks

2009-06-12 07:21:02

I don't think it's a hard and fast rule, but yes to 1. One reason it does it at the start is that allocating space for 4 variables on the stack is no more work than allocating 1. The stack pointer is simply advanced by the sum of the sizes of the variables required. Deallocation is similarly cheap, or even free. Note that the storage used only in the inner block scope (b, c) can be shared, i.e. c can reuse the storage b no longer uses. The cost and timing of constructors and destructors are associated with the actual declarations and scope of the variables.

Alex Brown 2009-06-12 10:42:38

Actually, with modern optimizers, you can't say a lot about this. The compiler will look at the places where a, b, c and d are actually needed. If the last use of a precedes the first use of d, the compiler can make them share memory. And if there's an early return (e.g. checking if an argument is NULL), the optimizer might even delay allocating memory until after that check.

MSalters 2009-06-12 15:15:05

Answer 11

A:

With only two variables, the compiler will likely assign a register to each. These registers are there anyway, so this doesn't take time. There are 2 register write instructions and one register read instruction in either case.

MSalters 2009-06-12 15:20:57
