What's the difference, from the interpreter's POV, between the following programs:

#!/usr/bin/perl -w

use strict;

for (1..10000000) {
    my $jimmy = $_**2;
}

and

#!/usr/bin/perl -w

use strict;

my $jimmy;
for (1..10000000) {
    $jimmy = $_**2;
}

"time" reports for the first program:

real    0m1.519s
user    0m1.513s
sys     0m0.004s

and for the second:

real    0m1.023s
user    0m1.012s
sys     0m0.002s
+2  A: 

The first loop executes the variable declaration on every iteration of the loop, which costs extra processing time.

Granted, it's not much, but this stuff can add up over time, and it is technically slower since more instructions are executed per iteration.
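If you want to reproduce the measurement without shelling out to "time", here is a rough sketch using the core Benchmark module. The labels my_inside and my_outside, the loop size $n, and the repeat count are my own choices for illustration, not anything from the question; the numbers you get will depend on your Perl build and machine.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Assumption: a smaller loop than the question's 10_000_000 so the run stays short.
my $n = 200_000;

# Run each variant 20 times; 'none' suppresses timethese's own report.
my $results = timethese(20, {
    my_inside => sub {
        for (1 .. $n) { my $jimmy = $_ ** 2; }
    },
    my_outside => sub {
        my $jimmy;
        for (1 .. $n) { $jimmy = $_ ** 2; }
    },
}, 'none');

# Print a table comparing the rates of the two variants.
cmpthese($results);
```

Treat the table as a hint rather than a verdict: both loops are tiny, so noise from the rest of the system can easily be a large part of the difference.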

Robert Greiner
The tradeoff is that you get more isolation (if you need it) from declaring it inside the loop. And it kind of goes with the principle that variables should be declared in the smallest scope that makes sense. Of course, if you're going to be calling something 10 million times, slavish devotion to the standard is not required.
Axeman
I guess I'm confused because I thought the interpreter assigned memory for each variable before executing?
flies
I guess what I'm asking is, what work does the compiler do to "make the variable declaration"?
flies
@flies What does any language do? Make room for it somewhere in memory and record the name of the variable in some way somewhere. The assignment is common to both approaches in your examples. It should be obvious that declaring the variable multiple times leads to some difference in execution time.
George Marian
@George it is only "obvious" if you have some inkling as to what Perl is doing. A possible, maybe not sensible, implementation would be to make $jimmy statically bound to a memory location. In which case `my $jimmy for ... { } ` would perform the same as `for... { my $jimmy } `
justintime
@justintime Are you actually declaring a new variable if it's bound to one location in memory?
George Marian
+1  A: 

Well, one, there's the issue that you're declaring a new variable with each iteration.

Two, there is the bigger issue of scoping.

Try adding this line after the for in each of those, and see what happens:

print $jimmy;

And, try this as well:

my $jimmy;
for (1..10000000) {
    my $jimmy = $_**2;
}
print $jimmy;
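
Here is a small variation that makes the shadowing visible without tripping the "uninitialized value" warning. The 'outer' placeholder value and the @inner array are my additions for illustration only; the loop is shortened to three iterations.

```perl
use strict;
use warnings;

my $jimmy = 'outer';        # placeholder value so the final print has something to show
my @inner;                  # hypothetical helper, only here to record the inner values

for (1 .. 3) {
    my $jimmy = $_ ** 2;    # a new lexical that shadows the outer $jimmy inside this block
    push @inner, $jimmy;
}

print "inner: @inner\n";    # inner: 1 4 9
print "outer: $jimmy\n";    # outer: outer
```

The outer $jimmy is never touched by the loop, because the my inside the block introduces a separate variable with the same name.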

A bit more detail:

A my declares the listed variables to be local (lexically) to the enclosing block, file, or eval. If more than one value is listed, the list must be placed in parentheses.

http://perldoc.perl.org/functions/my.html

You'll likely find this to be a useful read as well:

http://perldoc.perl.org/perlsub.html#Private-Variables-via-my%28%29

George Marian
+8  A: 

The my declaration in Perl has two primary effects; a compile-time one (wherein it allocates a slot on the containing sub's scratchpad, and makes sure that all references to that name within the proper scope are resolved to that particular scratchpad slot), and a runtime one (wherein it resets the value of that pad slot to undef, or to some particular value if you wrote my $var = foo).

The compile-time portion of course has zero amortized runtime cost, but the runtime portion is run once each time execution passes the my declaration. As others have pointed out, your two examples have different performance because they have different semantics in general -- one clears the variable every time through the loop, and the other doesn't.
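
A minimal sketch of the runtime effect described above: the variable is assigned at the bottom of each pass, yet it is undef again at the top of the next one. The names $count and @seen are made up for this illustration.

```perl
use strict;
use warnings;

my @seen;   # hypothetical helper that records what each iteration observed

for my $i (1 .. 3) {
    my $count;                                        # runtime part of my: a fresh, undef variable
    push @seen, defined $count ? 'defined' : 'undef';
    $count = $i;                                      # this value does not survive to the next pass
}

print "@seen\n";   # undef undef undef
```

Moving the my $count line above the loop would change the result to "undef defined defined", which is the semantic difference between the two programs in the question.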

hobbs
+3  A: 

Since the example programs you have given do not really do anything it is hard to give you a specific reason why one type of declaration would be better than the other. As many other posters have pointed out, declaring the variable in the loop creates a new variable each time. In your examples that creation is redundant, but consider the following examples using closures.

my @closures;
my $jimmy;

for (1 .. 10) {
    $jimmy = $_** 2;
    push @closures, sub {print "$jimmy\n"};
}

and this one:

my @closures;

for (1 .. 10) {
    my $jimmy = $_** 2;
    push @closures, sub {print "$jimmy\n"};
}

In each case the code builds up a series of code references, but in the first example, since all the code refs refer to the same $jimmy, each one will print 100 when called. In the second example each code ref will print a different number (1, 4, 9, 16, 25, ...).

So in this case the time difference does not really matter since the two blocks of code do very different things.
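
To see the difference without reading ten lines of output per variant, here is a sketch that has the closures return the value instead of printing it and then calls them all. The array names @shared and @fresh are mine, chosen for illustration.

```perl
use strict;
use warnings;

my (@shared, @fresh);

# Variant 1: one $jimmy declared outside the loop, captured by every closure.
my $jimmy;
for (1 .. 10) {
    $jimmy = $_ ** 2;
    push @shared, sub { $jimmy };
}

# Variant 2: a new $jimmy per iteration, so each closure captures its own.
for (1 .. 10) {
    my $jimmy = $_ ** 2;
    push @fresh, sub { $jimmy };
}

print join(' ', map { $_->() } @shared), "\n";   # 100 100 100 100 100 100 100 100 100 100
print join(' ', map { $_->() } @fresh),  "\n";   # 1 4 9 16 25 36 49 64 81 100
```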

Eric Strom
+1  A: 
  1. Declaring my outside the loop causes the declaration to occur once. During the declaration, Perl reserves memory for that variable.

  2. Declaring my inside the loop causes the declaration to occur on each iteration of the loop.

my is Perl's way of declaring a local variable; local was already used for something else and does not mean what it would mean in C. When you declare the variable inside the loop, it is scoped to the loop block, which is entered and exited on each iteration. Not only is the variable declared each time, but it may also be cleaned up (dereferenced and/or set to undef) at the end of the block (though this varies between Perl versions).
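
For anyone unsure how my and local differ, here is a rough sketch. The package variable $level and the subs show, with_local and with_my are hypothetical names invented for this example.

```perl
use strict;
use warnings;

our $level = 'global';            # a package variable, visible to every sub below

sub show { return $level }        # always reads the package variable

sub with_local {
    local $level = 'localized';   # temporarily replaces the package variable's value
    return show();                # show() sees the temporary value
}

sub with_my {
    my $level = 'lexical';        # a separate lexical, visible only inside this sub
    return show();                # show() still sees the package variable
}

print with_local(), "\n";   # localized
print with_my(),    "\n";   # global
print show(),       "\n";   # global
```

So local gives a package variable a temporary value for the duration of a call, while my creates a new variable that only the enclosing block can see.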

Variables declared outside the loop block are considered "global" (not literally, but in the sense of the loop block). These variables reuse their memory locations, rather than having to search for new addresses.

vol7ron