If we have the following two snippets of C++ code that do the same task:

int a, b = somenumber;
while (b > 0)
{
    a = b % 3;
    b /= 3;
}

or

int b = somenumber;
while (b > 0)
{
    int a = b % 3;
    b /= 3;
}

I don't know much about computer architecture or C++ design, but I think the first snippet is faster because it declares the integer a once at the beginning and just uses it in the while loop, whereas in the second snippet the integer a is declared every time the while loop starts over. Can someone help me with this? Am I correct, and if so (or if not), why?

A: 

The first one SHOULD be faster; however, the compiler is usually smart enough to optimise this itself, so it probably won't matter.

For purity's sake, though, the answer is the first one.

EDIT: It is faster because it requires only one allocation as opposed to N (N being the number of iterations the while loop will perform).

Chris
AFAIK this is wrong. All local variables are stack-allocated only once per function call.
Michael Borgwardt
I am not sure that is correct. The local `a` isn't used in the second example, so the compiler may safely get rid of it. And in any case I would expect it to reuse the same stack space for each iteration.
Brian Rasmussen
There is no allocation in the loop and "initialization" (assignment) is performed in both cases (or in neither one, due to compiler being smart).
Michael Krelin - hacker
I had presumed he had meant to use "a" within the loop.
Chris
The declaration will only affect the lexical scoping of the variable, not the size of the frame allocated for the function.
Dan Andreatta
@Michael: "stack allocated" is one way to implement local variables. However, there are other ways, such as registers. Those are faster but scarce. As a result, compilers think very hard about the allocation of variables to stack slots or registers, and can change their mind within a single function. But for such a simple case (two int vars) you can safely assume there is NO stack allocation whatsoever.
MSalters
MSalters, I don't want to claim it's absolutely impossible, but I'd expect the stack frame to be allocated when entering the function, not when entering just any scope, especially a loop.
Michael Krelin - hacker
+1  A: 

No, it can't be "declared" in the loop on every iteration, as a declaration is handled at compile time. I'd say they're equal, but the second one might have been faster if the type of the variable were something more complicated, with a constructor and destructor.
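To make the constructor/destructor point concrete, here is a minimal sketch. Tracked is a hypothetical type invented for illustration (it is not from the question); it only counts how often its constructor and destructor run. Which placement ends up faster for a real class depends on what its constructor, destructor and assignment actually cost, so treat this as a way to see what executes, not as a performance claim.

```cpp
// Hypothetical type that counts how often it is constructed and destroyed.
struct Tracked {
    static int constructed;
    static int destroyed;
    int value = 0;
    Tracked() { ++constructed; }
    ~Tracked() { ++destroyed; }
};
int Tracked::constructed = 0;
int Tracked::destroyed = 0;

// Declared before the loop: one constructor call and one destructor call.
int outside_loop(int b) {
    Tracked t;
    while (b > 0) {
        t.value = b % 3;
        b /= 3;
    }
    return t.value;
}

// Declared in the loop body: constructed and destroyed on every iteration.
int inside_loop(int b) {
    int last = 0;
    while (b > 0) {
        Tracked t;
        t.value = b % 3;
        last = t.value;
        b /= 3;
    }
    return last;
}
```

Calling each function and then reading Tracked::constructed and Tracked::destroyed shows one construction for the first form and one per iteration for the second. For a plain int there is no constructor to run, which is why the question's two snippets come out the same.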

Michael Krelin - hacker
+9  A: 

The int declaration is information for the compiler and does not translate to an instruction that has to be coded. So it makes no difference. Declaring the int inside the loop will not slow the loop down. Why not try compiling both and get the compiler to output assembly code so you can see for yourself?

invariant
+5  A: 

Seriously, does it really matter? This is the type of micro-optimization you should try to avoid. Write the code that is more readable, which IMHO is the second loop. The compiler is good enough to optimize this type of thing, and I would leave it to do that.

Naveen
I don't mind the down vote, but a reason would be good :-)
Naveen
Isn't it obvious? Have you even tried to answer the question? ;-)
Michael Krelin - hacker
His point is that it's "faster" to not bother with this micro-optimization than to actually try to optimize it: so definitely the first proposed code is faster, because the OP thinks that is the fastest.
Pindatjuh
@Michael: In my opinion yes. If it is not obvious from above, my answer is "It doesn't matter"
Naveen
Well, I think it does answer the question. And I agree it's much better to declare variables as locally as possible for readability's sake.
Matthieu M.
Naveen, Q: "am i correct or what and why ?", A: "it doesn't matter". This is exactly what I'd say qualifies for *not* answering the question.
Michael Krelin - hacker
While I believe everyone is right in that this is most likely premature optimization, one can't really know that for sure without knowing more about where this code is running and, perhaps more critically, a profile of where time is spent in the code. If this is happening hundreds of billions of times in a loop and not much else is going on, then perhaps any difference here (although if there's any at all, one does need a new compiler) is significant. Premature optimization sucks, but the whole point is that we can't know where to optimize until we know where we spend our time.
Daniel Papasian
+2  A: 

There is no "faster" in the C++ standard, except for performance guarantees in the standard library. An optimizing compiler would likely just eliminate a, since it's not used. Alternately, it could allocate all the memory the function needed for all local variables at once, and then it wouldn't make any difference either.

The only legitimate question about low-level language constructs like this is whether your particular implementation runs them faster or slower, and the best way to find it out is to time it yourself. You'll find that a whole lot of these things simply don't matter, and if you examine the generated code you'll often find that compilers do the same thing with different ways of writing code.
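As a rough sketch of what "time it yourself" could look like: the function names below (digit_sum_outer, digit_sum_inner, time_ms) are made up for this example, and the loop bodies sum the base-3 digits so that a is actually used and cannot simply be dropped. The numbers you get depend entirely on your compiler, flags and machine, and an optimizer may still turn both variants into the same code.

```cpp
#include <chrono>

// Variable declared before the loop.
int digit_sum_outer(int b) {
    int a, sum = 0;
    while (b > 0) {
        a = b % 3;
        sum += a;
        b /= 3;
    }
    return sum;
}

// Variable declared inside the loop body.
int digit_sum_inner(int b) {
    int sum = 0;
    while (b > 0) {
        int a = b % 3;
        sum += a;
        b /= 3;
    }
    return sum;
}

// Calls f on 1..n and returns the elapsed time in milliseconds.
// Results are accumulated into sink so the calls have an observable effect.
template <typename F>
double time_ms(F f, int n, long long& sink) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 1; i <= n; ++i) {
        sink += f(i);
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A call such as time_ms(digit_sum_outer, 10000000, sink) for each variant, repeated a few times, gives numbers to compare; a single run is too noisy to draw conclusions from.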

Usually, looking for micro-optimizations is a bad idea, but if you're trying to set up a general style it may be worth it (using ++i rather than i++, for example). However, if you're setting up a style for any purpose other than readability, you should have good reasons for doing it. In this case, that means testing for performance.

David Thornley
What I would have said, but much more eloquent than I could ever be.
Martin York
+1  A: 

Theoretically the first option might be faster. In practice I'd expect a and b to be put into registers in such a way that the generated assembly comes out identical (which you can verify in the compiled binary). If you're executing the loop enough times that you think there may be a difference, the only way to know is to measure. If your profiler can't tell a difference, code it in the way that makes the code the clearest to future maintainers.

In general (as already mentioned) these types of optimizations won't provide any sort of meaningful improvement in program performance. You should instead look for algorithmic and design optimizations.

Mark B
A: 

Generally speaking, it is better to pre-allocate variables you want to use, as your second example is re-allocating the variable every time. Still, the compiler should generate nearly equivalent assembly code in the above situations, since the example is so trivial (only using an integer). If you were constructing an object every time in the loop, there could be a significant penalty.

That said, it is important to realize that in your first example a will be accessible outside the while loop, while in your second example it will not be (b is visible after the loop in both). This is a functional difference that needs to be considered when writing loops in this manner.
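A small sketch of that scoping difference, using made-up function names (last_remainder, loop_only): only the first form lets you read a after the loop has finished.

```cpp
// First form: a outlives the loop, so the last remainder can be returned.
int last_remainder(int b) {
    int a = 0;  // initialised so the result is defined even when b <= 0
    while (b > 0) {
        a = b % 3;
        b /= 3;
    }
    return a;   // a is still in scope here
}

// Second form: a exists only inside the loop body.
int loop_only(int b) {
    while (b > 0) {
        int a = b % 3;
        (void)a;  // silence the unused-variable warning
        b /= 3;
    }
    // return a;  // would not compile: a is out of scope here
    return b;
}
```

If the value is needed after the loop, the declaration has to go outside; if not, keeping it inside limits where the name can be used by accident.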

windfinder
There's no reallocation. Also, it is important to allocate variables nearest to usage.
Pavel Radzivilovsky
+14  A: 

There should be no difference, but to be extra empirical (anal?) I tested this with g++, creating a function for each of the code snippets. Both with and without optimizations it generated identical code no matter where the int a declaration is.

#include <iostream>

int variant_a(int b)
{
        int a;
        while(b > 0)
        {
                a = b % 3;
                b /= 3;
        }
        return b;
}

int variant_b(int b)
{
        while(b > 0)
        {
                int a = b % 3;
                b /= 3;
        }
        return b;
}

int main()
{
        std::cout << variant_a(42) << std::endl;
        std::cout << variant_b(42) << std::endl;
}

This is the unoptimized loop:

_Z9variant_ai:
.LFB952:
        pushl   %ebp
.LCFI0:
        movl    %esp, %ebp
.LCFI1:
        subl    $24, %esp
.LCFI2:
        jmp     .L2
.L3:
        movl    8(%ebp), %eax
        movl    %eax, -20(%ebp)
        movl    $1431655766, -24(%ebp)
        movl    -24(%ebp), %eax
        imull   -20(%ebp)
        movl    %edx, %ecx
        movl    -20(%ebp), %eax
        sarl    $31, %eax
        subl    %eax, %ecx
        movl    %ecx, %eax
        addl    %eax, %eax
        addl    %ecx, %eax
        movl    -20(%ebp), %edx
        subl    %eax, %edx
        movl    %edx, %eax
        movl    %eax, -4(%ebp)
        movl    8(%ebp), %eax
        movl    %eax, -20(%ebp)
        movl    $1431655766, -24(%ebp)
        movl    -24(%ebp), %eax
        imull   -20(%ebp)
        movl    %edx, %ecx
        movl    -20(%ebp), %eax
        sarl    $31, %eax
        movl    %ecx, %edx
        subl    %eax, %edx
        movl    %edx, %eax
        movl    %eax, 8(%ebp)
.L2:
        cmpl    $0, 8(%ebp)
        jg      .L3
        movl    8(%ebp), %eax
        leave
        ret

and the optimized one:

_Z9variant_ai:
.LFB968:
        pushl   %ebp
.LCFI0:
        movl    %esp, %ebp
.LCFI1:
        pushl   %ebx
.LCFI2:
        movl    8(%ebp), %ebx
        testl   %ebx, %ebx
        jle     .L2
        movl    $1431655766, %ecx
        .p2align 4,,7
        .p2align 3
.L5:
        movl    %ebx, %eax
        imull   %ecx
        movl    %ebx, %eax
        sarl    $31, %eax
        movl    %edx, %ebx
        subl    %eax, %ebx
        jne     .L5
.L2:
        movl    %ebx, %eax
        popl    %ebx
        popl    %ebp
        ret
calmh
At least someone actually backed up the claim with data. Not that I really doubted, but I prefer facts to suppositions.
Matthieu M.
Errr... I hope you've edited this post after I press the button "Add Comment". (Hint; variant_a and variant_b are identical?)
Pindatjuh
Your two functions a and b are the same, shouldn't they be different?
invariant
@Pindatjuh, @invariant. Whoops. Naturally they had the same difference as in the question; my error in pasting to SO.
calmh
A: 

I don't believe there would be any difference in practice. There are no memory allocations involved, because the memory for automatic variables is allocated or set aside at compile time.

Theoretically I think the second could easily be faster as well: the compiler has more information about where and how the variables are used (e.g. maybe you reuse the same variable for something completely unrelated later).

You might start to worry about such things when you are dealing with types that are expensive to construct. E.g., should I declare a std::vector in the inner loop, or should I declare it before the loop and clear() it at the beginning of the loop body (reusing the allocated memory)?
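The two std::vector options could look roughly like this. The function names (fresh_each_time, reused) and the fill loop are invented for illustration. Both compute the same result; the difference is that the second keeps the vector's buffer across iterations, since clear() removes the elements but leaves the capacity unchanged. Whether that is measurably faster in a given program is something to profile rather than assume.

```cpp
#include <cstddef>
#include <vector>

// A fresh vector per iteration: its buffer is released at the end of each
// pass and has to be allocated again on the next one.
std::size_t fresh_each_time(int rounds, int n) {
    std::size_t total = 0;
    for (int r = 0; r < rounds; ++r) {
        std::vector<int> v;
        for (int i = 0; i < n; ++i) {
            v.push_back(i);
        }
        total += v.size();
    }
    return total;
}

// One vector declared before the loop and cleared at the top of the body,
// so later iterations can reuse the memory allocated by earlier ones.
std::size_t reused(int rounds, int n) {
    std::vector<int> v;
    std::size_t total = 0;
    for (int r = 0; r < rounds; ++r) {
        v.clear();
        for (int i = 0; i < n; ++i) {
            v.push_back(i);
        }
        total += v.size();
    }
    return total;
}
```

The price of the second form is a wider scope for v, so it is worth doing only where the allocations show up in a profile.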

UncleBens