I am quite new to Perl and I would like to know which of the following loops is more efficient:

my @numbers = (1,3,5,7,9);
foreach my $current (@numbers){
    print "$current\n";
}

or

my @numbers = (1,3,5,7,9);
foreach (@numbers){
    print "$_\n";
}

I want to know whether using $_ is more efficient, for example because it is kept in a register since it is so commonly used. I have written some code and am trying to clean it up, and I've found that I use the first loop more often than the second one.

+6  A: 

You could have a look at this tutorial; it also has a chapter, "Benchmark Your Code", that you could use to compare the two approaches.

schnaader
+14  A: 

Have you identified a performance problem in the sections of code that use these loops? If not, go for the one that is more readable and thus more maintainable. Any difference in speed will probably be negligible, especially compared to other parts of your system. Always code for maintainability first, then profile, then code for performance.

"Premature optimisation is the root of all evil"[1]

[1] Knuth, Donald. Structured Programming with go to Statements, ACM Journal Computing Surveys, Vol 6, No. 4, Dec. 1974. p.268.

MrWiggles
I'm trying to make the code look more "Perly" rather than optimizing... and I have noticed that Perl developers usually use $_, which is one of my main reasons...
mandel
Besides, that wasn't the question. The OP didn't ask _if_ he should optimize. He just asked which is faster.
Nathan Fellman
@ mandel: in this case you should have asked which one is more Perlish :P
tunnuz
true... but I'm also interested if Perlish is efficient or not ;)
mandel
@Nathan - if the OP isn't interested in optimising then why bother about which is faster? My point still stands, write for readability, not performance
MrWiggles
@mandel: There are idioms worth using in Perl, but people overvalue the idea of being "Perlish". The *only* point of adhering to common idioms is to make it easier for other programmers to read and understand your code, but "Perlish" code is often *harder* for others to understand.
j_random_hacker
A: 

I don't know but ... well first of all you save a variable assignment in the second version of the loop. I can imagine that since $_ is used very often it should be somehow optimized. You could try to profile it, a very good Perl profiler is NYTProf 2 written by Tim Bunce.

Then, is it really worth optimizing such small things? I don't think that one loop will make a difference. I suggest you use the profiler to measure your performance and identify the real bottlenecks. Usually the speed problems are located in the 10% of the code that runs 90% of the time (it may not be exactly 10-90, but this is the "famous" ratio :P).

tunnuz
A: 

Using $_ is a Perl idiom, which shows the seasoned programmer that the "current context" is used. Also, many functions take $_ by default as parameter, thus making code more concise.

Some might also just argue that "it was hard to write, it should be hard to read".

David Schmitt
Decided not to -1 you, but anybody who really believes that "if it was hard to write, it should be hard to read" is not a professional programmer. Unfortunately such childish attitudes seem all too common in the Perl community.
j_random_hacker
@j_random_hacker: What do you expect from a language(-community) whose first motto is TIMTOWTDI? Coincidentally I stopped programming Perl around the time I noticed that readable Perl looks more like $@{Java} than "idiomatic" Perl.
David Schmitt
@David: Yes, I totally agree. I often wish I could go back in time and learn Python instead of Perl, since it's too hard to "break the Perl habit" now that I know many of the language's gratuitous inconsistencies off by heart.
j_random_hacker
+11  A: 

I know that premature optimisation is the root of all evil, and I would simply write

{
  local $\ = "\n";
  print foreach @numbers;
}

but some expectations can be wrong. The test is a little odd because output can cause strange side effects and the order of the tests can matter.

#!/usr/bin/env perl
use strict;
use warnings;
use Benchmark qw(:all :hireswallclock);

use constant Numbers => 10000;

my @numbers = (1 .. Numbers);

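# Run a code block with STDOUT redirected to /dev/null so that
# terminal output does not dominate the timings.
sub no_out (&) {
    local *STDOUT;
    open STDOUT, '>', '/dev/null' or die "Cannot open /dev/null: $!";
    my $result = shift()->();
    close STDOUT;
    return $result;
}

my %tests = (
    loop1 => sub {
        foreach my $current (@numbers) {
            print "$current\n";
        }
    },
    loop2 => sub {
        foreach (@numbers) {
            print "$_\n";
        }
    },
    loop3 => sub {
        local $\ = "\n";
        print foreach @numbers;
    },
);

# Return a reference to a list of all orderings of the arguments.
sub permutations {
    return [
        map {
            my $first = $_;
            my @rest  = grep { $first ne $_ } @_;
            map { [ $first, @$_ ] } @{ permutations(@rest) }
        } @_
    ] if @_;
    return [ [] ];
}

# Run the tests in every possible order, because order can affect the timings.
foreach my $p ( @{ permutations( keys %tests ) } ) {
    my $result = {
        map {
            $_ => no_out { sleep 1; countit( 2, $tests{$_} ) }
        } @$p
    };

    cmpthese($result);
}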

One might expect loop2 to be faster than loop1, but the results do not show that:

       Rate loop2 loop1 loop3
loop2 322/s    --   -2%  -34%
loop1 328/s    2%    --  -33%
loop3 486/s   51%   48%    --
       Rate loop2 loop1 loop3
loop2 322/s    --   -0%  -34%
loop1 323/s    0%    --  -34%
loop3 486/s   51%   50%    --
       Rate loop2 loop1 loop3
loop2 323/s    --   -0%  -33%
loop1 324/s    0%    --  -33%
loop3 484/s   50%   49%    --
       Rate loop2 loop1 loop3
loop2 317/s    --   -3%  -35%
loop1 328/s    3%    --  -33%
loop3 488/s   54%   49%    --
       Rate loop2 loop1 loop3
loop2 323/s    --   -2%  -34%
loop1 329/s    2%    --  -33%
loop3 489/s   51%   49%    --
       Rate loop2 loop1 loop3
loop2 325/s    --   -1%  -33%
loop1 329/s    1%    --  -32%
loop3 488/s   50%   48%    --

Sometimes I consistently observed loop1 running about 15%-20% faster than loop2, but I can't determine why.

I looked at the generated bytecode for loop1 and loop2, and the only difference is a single op for creating the my variable. The contents of this variable are neither allocated nor copied, so the operation is very cheap. The difference comes, I think, only from the "$_\n" construct, which is not cheap. These loops should be very similar

for (@numbers) {
  ...
}

for my $a (@numbers) {
  ...
}

but this loop is more expensive

for (@numbers) {
  my $a = $_;
  ...
}

and also

print "$a\n";

is more expensive than

print $a, "\n";
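
The reason the copying loop costs more is that foreach aliases its loop variable to each array element instead of copying it, whether the variable is $_ or a my variable. A minimal sketch of that behaviour (the names @numbers, $n and $copy are only illustrative):

```perl
use strict;
use warnings;

my @numbers = (1, 3, 5);

# The loop variable is an alias: modifying it modifies the array element.
for my $n (@numbers) {
    $n *= 2;
}
print "@numbers\n";    # prints "2 6 10"

# Copying first leaves the array untouched, at the cost of one copy per element.
for (@numbers) {
    my $copy = $_;
    $copy *= 2;
}
print "@numbers\n";    # still prints "2 6 10"
```

So the explicit copy is only worth paying for when the loop body needs to change the value without touching the array.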
Hynek -Pichi- Vychodil
I'm more interested in the general idea of using $_ rather than printing... Anyway, I would love it if you could explain the above code a bit more :D
mandel
This is Perlish :P
tunnuz
The main idea of using $_ is to avoid creating and allocating a my variable. These operations take some time. However, `for my $a (@numbers) {}` doesn't allocate memory for the array members (beware of this: changing $a will affect @numbers).
Hynek -Pichi- Vychodil
If you want to change the loop variable inside the loop without affecting the array, you can use `for (@numbers) {my $a = $_}`, which causes copying and, of course, allocation. Another optimization trick is to avoid creating a string with "$_\n". An alternative is print $_, "\n";
Hynek -Pichi- Vychodil
I've run your test on "perl, v5.10.0 built for MSWin32-x86-multi-thread". There is no difference between the loops (I think `print` eats all the difference).
J.F. Sebastian
If you test on Windows, you will have to change from "`open STDOUT, '>', '/dev/null';`" to "`open STDOUT, '>', 'nul';`"
Brad Gilbert
`$_` has to be localized, so there is no real time savings in loop2.
Chas. Owens
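The localization mentioned in the comment above can be observed directly. This is only a small sketch (the array @seen and the string 'outer value' are made up for the illustration): the global $_ gets its previous value back once the loop ends.

```perl
use strict;
use warnings;

my @seen;
$_ = 'outer value';

foreach (1, 2, 3) {
    push @seen, $_;    # inside the loop, $_ is aliased to each element
}

# After the loop, the previous value of the global $_ is back.
print "$_\n";          # prints "outer value"
```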
If you want to see why loop1 is slightly faster than loop2, have a look at my answer http://stackoverflow.com/questions/486949/is-more-efficient-than-a-named-variable-in-perls-foreach/1261487#1261487
Brad Gilbert
@Gilbert: Nice, it seems that you are right. But loop3 is still consistently about 3-5% faster even if I change the loop1 code to local $\ = "\n"; foreach my $current (@numbers) { print $current; } It seems that printing $_ is cheaper than the $_ localization in the loop.
Hynek -Pichi- Vychodil
+6  A: 

Benchmark:

use Benchmark qw(timethese cmpthese);

my $iterations = 500000;     

cmpthese( $iterations,
  {
    'Loop 1' => 'my @numbers = (1,3,5,7,9);
    foreach my $current (@numbers)
    {
      print "$current\n";
    }', 

    'Loop 2' => 'my @numbers = (1,3,5,7,9);
    foreach (@numbers)
    {
      print "$_\n";
    }'
  }
);

Output:

         Rate     Loop 2 Loop 1
Loop 2  23375/s     --    -1%
Loop 1  23546/s     1%     --

I've run it a couple of times with varying results. I think it's safe to say that there isn't much of a difference.
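
To compare only the loop variable handling, one option is to take printing out of the loop body altogether. The following is a rough sketch, not a definitive benchmark: summing the elements is an arbitrary stand-in for real work, and the test names "named" and "default" are made up here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @numbers = (1 .. 10_000);

# Each sub sums the array, so the loop body does some work without any I/O.
my %tests = (
    named => sub {
        my $sum = 0;
        foreach my $current (@numbers) { $sum += $current }
        return $sum;
    },
    default => sub {
        my $sum = 0;
        foreach (@numbers) { $sum += $_ }
        return $sum;
    },
);

# A negative count tells Benchmark to run each sub for at least that many CPU seconds.
cmpthese( -1, \%tests );
```

Passing code references rather than strings also avoids compiling the test code with eval, and the rates will still vary from run to run and between Perl versions.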

drby
Well, 1% in a loop with just 5 iterations is quite a lot for just using $_; that is a possible difference per loop...
mandel
Benchmarking is not as easy as some think. See, your results are about 100k numbers per second and mine http://stackoverflow.com/questions/486949/which-loop-is-more-efficient-in-perl#487040 shows 3220k/s. You measured your terminal IO speed mostly :)
Hynek -Pichi- Vychodil
+2  A: 

I'm more interested in the general idea of using $_ rather than printing...

As a side note, Perl Best Practices is a good place to go if you want to start learning which idioms to avoid and why. I don't agree with everything the author (Damian Conway) writes, but he's spot on most times.

Joe Casadonte
+1  A: 

Running the two options through "perl -MO=Concise,-terse,-src test.pl" results in these two OpTrees:

for my $n (@num){ ... }

LISTOP (0x9c08ea0) leave [1] 
    OP (0x9bad5e8) enter 
# 5: my @num = 1..9;
    COP (0x9b89668) nextstate 
    BINOP (0x9b86210) aassign [4] 
        UNOP (0x9bacfa0) null [142] 
            OP (0x9b905e0) pushmark 
            UNOP (0x9bad5c8) rv2av 
                SVOP (0x9bacf80) const [5] AV (0x9bd81b0) 
        UNOP (0x9b895c0) null [142] 
            OP (0x9bd95f8) pushmark 
            OP (0x9b4b020) padav [1] 
# 6: for my $n (@num){
    COP (0x9bd12a0) nextstate 
    BINOP (0x9c08b48) leaveloop 
        LOOP (0x9b1e820) enteriter [6] 
            OP (0x9b1e808) null [3] 
            UNOP (0x9bd1188) null [142] 
                OP (0x9bb5ab0) pushmark 
                OP (0x9b8c278) padav [1] 
        UNOP (0x9bdc290) null 
            LOGOP (0x9bdc2b0) and 
                OP (0x9b1e458) iter 
                LISTOP (0x9b859b8) lineseq 
# 7:   say $n;
                    COP (0x9be4f18) nextstate 
                    LISTOP (0x9b277c0) say 
                        OP (0x9c0edd0) pushmark 
                        OP (0x9bda658) padsv [6] # <===
                    OP (0x9b8a2f8) unstack 

for(@num){ ... }

LISTOP (0x8cdbea0) leave [1] 
    OP (0x8c805e8) enter 
# 5: my @num = 1..9;
    COP (0x8c5c668) nextstate 
    BINOP (0x8c59210) aassign [4] 
        UNOP (0x8c7ffa0) null [142] 
            OP (0x8ccc1f0) pushmark 
            UNOP (0x8c805c8) rv2av 
                SVOP (0x8c7ff80) const [7] AV (0x8cab1b0) 
        UNOP (0x8c5c5c0) null [142] 
            OP (0x8cac5f8) pushmark 
            OP (0x8c5f278) padav [1] 
# 6: for (@num){
    COP (0x8cb7f18) nextstate 
    BINOP (0x8ce1de8) leaveloop 
        LOOP (0x8bf1820) enteriter 
            OP (0x8bf1458) null [3] 
            UNOP (0x8caf2b0) null [142] 
                OP (0x8bf1808) pushmark 
                OP (0x8c88ab0) padav [1] 
            PADOP (0x8ca4188) gv  GV (0x8bd7810) *_ # <===
        UNOP (0x8cdbb48) null 
            LOGOP (0x8caf290) and 
                OP (0x8ce1dd0) iter 
                LISTOP (0x8c62aa8) lineseq 
# 7:   say $_;
                    COP (0x8cade88) nextstate 
                    LISTOP (0x8bf12d0) say 
                        OP (0x8cad658) pushmark 
                        UNOP (0x8c589b8) null [15] # <===
                            PADOP (0x8bfa7c0) gvsv  GV (0x8bd7810) *_ # <===
                    OP (0x8bf9a10) unstack 

I've added "<===" to mark the differences between the two.

Notice that there are actually more ops for the "for(@num){...}" version.

So, if anything, the "for(@num){...}" version is probably slower than the "for my $n (@num){...}" version.

Brad Gilbert