Hi,

A simple but relevant question: does "my" overwrite memory when it is used inside a loop?
For instance, is it "better" (in terms of memory leaks, performance, speed) to declare it outside of the loop:

my $variable;
for my $number ( @array ) {
    $variable = $number * 5;
    _sub($variable);
}

Or should I declare it inside the loop:

for my $number ( @array ) {
    my $variable = $number * 5;
    _sub($variable);
}

(I just made that code up; it's not meant to do anything or to be used as is in real life.)
Will Perl allocate new space in memory for each and every iteration of the for loop?

+3  A: 

From your examples above:

  1. In the first example, new space for the variable is not allocated every time; the existing one is reused.

  2. In the second example, new space is allocated on every iteration of the loop and deallocated again in the same iteration.

Aamir
So, considering that I don't need these variables outside of the loop, I can safely use "my" inside a for loop?
Isaac Clarke
Yes.
Ether
+9  A: 

Aamir already told you what will happen.

I recommend sticking with the second version unless there is some reason to use the first. You don't want to have to care about the previous state of $variable; it's simplest to start each iteration with a fresh variable. And if $variable holds a reference, or you take a reference to it, you might actually shoot yourself in the foot when you push that onto an array, because every element would end up pointing at the same data.
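To make the reference pitfall concrete, here is a minimal sketch. The names %record, @shared and @fresh are made up for illustration and do not come from the question.

```perl
use strict;
use warnings;

# Declared once, outside the loop: every pushed reference
# points at the same hash, which is overwritten each time.
my @shared;
my %record;
for my $n ( 1 .. 3 ) {
    %record = ( value => $n );
    push @shared, \%record;
}

# Declared inside the loop: each iteration gets its own hash.
my @fresh;
for my $n ( 1 .. 3 ) {
    my %record = ( value => $n );
    push @fresh, \%record;
}

print join( ',', map { $_->{value} } @shared ), "\n";   # 3,3,3
print join( ',', map { $_->{value} } @fresh ),  "\n";   # 1,2,3
```

With the outer declaration, all three elements see the last value written; with the inner declaration, each element keeps its own.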

Edit:

Yes, there is a performance hit. Using a recycled variable will be faster. However, it is hard to tell how much faster it will be, as this depends on your specific situation. No matter how much faster it is, though, always remember: premature optimization is the root of all evil.

innaM
OK, thanks for the heads up !
Isaac Clarke
Though cleaner, the second version will incur a performance hit due to the repeated allocation and de-allocation of memory. In practice the difference will almost always be negligible, but it's useful to know that it's there.
fB
Ah yes, I was going to mention premature optimization as well, but forgot. Good point!
fB
Yes, there's a similar statement about assumptions, though... C++ is compiled and wouldn't reallocate a variable just because its declaration was in a loop, so I thought I'd just confirm whether Perl's compiler was intelligent enough to spot the re-use of an automatic variable. In fact, it's not, and the assumptions used here are correct: taking the 'my' out of the loop gave me a 15% speed improvement. That's a 1s gain over 10 million runs, for an average desktop, so I don't see it making much difference to you, however.
ijw
Interesting. But what did you do inside the loop? If there is much to be done (function calls, e.g.), the percentage will be quite different.
innaM
+1  A: 

You are totally safe using "my" inside a for loop or any other block. In general you don't have to worry about memory leaks in Perl, but in this circumstance you would be equally safe in a non-garbage-collecting language like C++. A normal variable is deallocated at the end of the block in which it is scoped.
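One way to watch the end-of-block cleanup happen is to put an object with a DESTROY method in the loop variable. This is only a sketch: the Tracker class and the @log array are hypothetical names invented for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper class that records when it is destroyed.
package Tracker;

sub new {
    my ( $class, $id, $log ) = @_;
    return bless { id => $id, log => $log }, $class;
}

sub DESTROY {
    my ($self) = @_;
    push @{ $self->{log} }, "destroy $self->{id}";
}

package main;

my @log;
for my $n ( 1 .. 3 ) {
    my $obj = Tracker->new( $n, \@log );   # lives only for this iteration
    push @log, "body $n";
}
push @log, "after loop";

print "$_\n" for @log;
```

Each "destroy" entry appears right after the matching "body" entry and before the next iteration starts, so nothing piles up across iterations.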

Ether
My main concern was in fact this _deallocation/reallocation_ process, whether or not it was resource-consuming. Thanks for your answer anyways !
Isaac Clarke
ah gotcha. Hopefully the compiler would optimize out the allocation/deallocation process in this case, as the loop is pretty tight, but I haven't poked much into the internals. brian d foy is your man :)
Ether
+2  A: 

These are things you aren't supposed to think about with a dynamic language such as Perl. Even though you might get an answer about what the current implementation does, that's not a feature and it isn't something you should rely on.

Define your variables in the shortest scope possible.

However, to be merely curious, you can use the Devel::Peek module to cheat a bit to see the internal (not physical) memory address:

use Devel::Peek;

foreach ( 0 .. 5 ) {
    my $var = $_;
    Dump( $var );
}

In this small case, the address ends up being the same. That's no guarantee that it will always be the same for different situations, or even the same program:

SV = IV(0x9ca968) at 0x9ca96c
  REFCNT = 1
  FLAGS = (PADMY,IOK,pIOK)
  IV = 0
SV = IV(0x9ca968) at 0x9ca96c
  REFCNT = 1
  FLAGS = (PADMY,IOK,pIOK)
  IV = 1
SV = IV(0x9ca968) at 0x9ca96c
  REFCNT = 1
  FLAGS = (PADMY,IOK,pIOK)
  IV = 2
SV = IV(0x9ca968) at 0x9ca96c
  REFCNT = 1
  FLAGS = (PADMY,IOK,pIOK)
  IV = 3
SV = IV(0x9ca968) at 0x9ca96c
  REFCNT = 1
  FLAGS = (PADMY,IOK,pIOK)
  IV = 4
SV = IV(0x9ca968) at 0x9ca96c
  REFCNT = 1
  FLAGS = (PADMY,IOK,pIOK)
  IV = 5
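For a more compact view than the full dump, the sketch below counts distinct addresses with refaddr from Scalar::Util. The second loop keeps a reference to each $var alive, which is an assumption added here to show a case where the addresses cannot be the same. The hash and array names are made up, and the first count is an implementation detail that may vary between Perl versions.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my ( %reused, %retained, @kept );

# Nothing holds on to $var after the iteration ends.
for my $i ( 0 .. 5 ) {
    my $var  = $i;
    my $addr = refaddr( \$var );   # temporary reference only
    $reused{$addr}++;
}

# A reference to each $var survives in @kept, so all six
# scalars are alive at once and must have distinct addresses.
for my $i ( 0 .. 5 ) {
    my $var = $i;
    push @kept, \$var;
    $retained{ refaddr( \$var ) }++;
}

printf "not retained: %d distinct address(es)\n", scalar keys %reused;
printf "retained:     %d distinct address(es)\n", scalar keys %retained;
```

The retained case always reports six addresses; whatever the first case reports on your perl is the behaviour you should not rely on.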
brian d foy
Thanks, that was interesting !
Isaac Clarke
My suspicion is that it's still destroying and recreating the scalar and it happens to be in the same place, however. There seems to be actual work going on.
ijw
@ijw Beware of implementation details. This is a common idiom and a compiler optimization may remove any difference in the two cases in the future (such as what happened to map in void context).
Chas. Owens
+2  A: 

You can benchmark the difference between the two uses using the Benchmark module which is made for these types of micro-benchmarking comparisons:

#!/usr/bin/perl

use strict;
use warnings;

use Benchmark qw( cmpthese );

sub outside {
    my $x;

    for my $y ( 1 .. 1_000_000 ) {
        $x = $y;
    }

    return;
}

sub inside {
    for my $y ( 1 .. 1_000_000 ) {
        my $x = $y;
    }

    return;
}

cmpthese -1 => {
    inside => \&inside,
    outside => \&outside,
};

Results on my Windows XP SP3 laptop:

          Rate  inside outside
inside  4.44/s      --    -25%
outside 5.91/s     33%      --

Predictably, the difference is less pronounced when the body of the loop is executed only once.

That said, I would not declare $x outside the loop unless I needed the value assigned to it inside the loop after the loop has finished.
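To check how the gap behaves when the loop body does some real work, as asked in the comments on another answer, the same comparison can be rerun with a function call per iteration. This is a sketch only: work() is a hypothetical stand-in for whatever your loop actually does, and the numbers will depend on your machine and Perl version.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use Benchmark qw( cmpthese );

# Hypothetical stand-in for real per-iteration work.
sub work {
    my ($n) = @_;
    return sqrt($n) + length "item $n";
}

sub outside_with_work {
    my $x;

    for my $y ( 1 .. 100_000 ) {
        $x = work($y);
    }

    return;
}

sub inside_with_work {
    for my $y ( 1 .. 100_000 ) {
        my $x = work($y);
    }

    return;
}

# Run each sub for at least one CPU second and compare rates.
cmpthese( -1, {
    inside  => \&inside_with_work,
    outside => \&outside_with_work,
} );
```

Comparing this table with the one from the empty-body benchmark shows how much of the difference survives once the loop does something useful.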

Sinan Ünür
Yes, not having to redeclare the variable saves a small amount of time (you would have to multiply it by a huge number of iterations for it to affect the runtime of your program), but the safety of limiting the scope of the variable is well worth the (again, tiny) cost.
Chas. Owens
Thanks for the info! Interesting module, I'm going to try it right away. I'm actually looping through a lot of parameters using three nested for loops, but the script will only be launched once a day, and safety on the machines my company uses is a lot more important than a 20% performance improvement on a less-than-one-second execution.
Isaac Clarke