Let us ignore for a moment Damian Conway's best practice of no more than three positional parameters for any given subroutine.

Is there any difference between the two examples below with regard to performance or functionality?

Using shift:

sub do_something_fantastical {
    my $foo   = shift;
    my $bar   = shift;
    my $baz   = shift;
    my $qux   = shift;
    my $quux  = shift;
    my $corge = shift;
}

Using @_:

sub do_something_fantastical {
    my ($foo, $bar, $baz, $qux, $quux, $corge) = @_;
}

Provided that both examples are the same in terms of performance and functionality, what do people think about one format over the other? Obviously the example using @_ is fewer lines of code, but isn't it more legible to use shift as shown in the other example? Opinions with good reasoning are welcome.

+8  A: 

I would imagine the shift example is slower than using @_ because it's 6 function calls instead of 1. Whether or not it's noticeable or even measurable is a different question. Throw each in a loop of 10k iterations and time them.

As for aesthetics, I prefer the @_ method. It seems like it would be too easy to mess up the order of the variables using the shift method with an accidental cut and paste. Also, I've seen many people do something like this:

sub do_something {
   my $foo = shift;
   $foo .= ".1";

   my $baz = shift;
   $baz .= ".bak";

   my $bar = shift;
   $bar .= ".a";
}

This, IMHO, is very nasty and could easily lead to errors, e.g. if you cut the baz block and paste it under the bar block. I'm all for defining variables near where they're used, but I think defining the passed in variables at the top of the function takes precedence.

Joe Casadonte
Saying anything in Perl is faster than anything else due to function calls is going to be inaccurate. Assignment from a list is going to carry a number of (C-level) function calls, as will doing a number of shift()s.
Chris Lutz
+3  A: 

Usually I use the first version. This is because I usually need to have error checking together with the shifts, which is easier to write. Say,

sub do_something_fantastical {
    my $foo   = shift || die("must have foo");
    my $bar   = shift || 0;  # $bar is optional
    # ...
}
PolyThinker
But there's still a limit to how many optional parameters you can sensibly deal with in this fashion, before using a hash would have made more sense. With an array, I prefer to use the @_ method, followed by $bar ||= 0 as required.
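A minimal sketch of the two styles mentioned in this comment, assuming hypothetical sub names (`with_defaults`, `with_named`) and illustrative default values. The first unpacks @_ in one list assignment and then applies defaults; the second takes named parameters as a hash, which tends to scale better once optional arguments pile up.

```perl
use strict;
use warnings;

# Hypothetical example: positional unpacking followed by defaults.
sub with_defaults {
    my ($foo, $bar, $baz) = @_;
    die "must have foo" unless defined $foo;
    $bar ||= 0;         # optional; note that ||= also replaces '' and '0'
    $baz ||= 'none';    # optional
    return "$foo/$bar/$baz";
}

# Hypothetical example: named parameters passed as key/value pairs.
sub with_named {
    my %args = ( bar => 0, baz => 'none', @_ );   # later pairs override the defaults
    die "must have foo" unless defined $args{foo};
    return "$args{foo}/$args{bar}/$args{baz}";
}

print with_defaults('a'), "\n";                    # a/0/none
print with_named( foo => 'a', baz => 'z' ), "\n";  # a/0/z
```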
RET
For what it's worth, you should consider using "shift // die" instead of "shift || die" - unless you want to die when foo is the empty string or 0 or the string "0".
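To illustrate the difference, here is a small sketch with hypothetical helper names. It assumes Perl 5.10 or later, where the defined-or operator `//` is available: `||` falls back on any false value, while `//` falls back only on undef.

```perl
use strict;
use warnings;
use 5.010;    # the defined-or operator // needs Perl 5.10 or later

# Hypothetical helpers: each returns its first argument or a fallback.
sub bool_or    { my $foo = shift || 'default'; return $foo }
sub defined_or { my $foo = shift // 'default'; return $foo }

print bool_or(0),        "\n";   # default  (0 is false, so the fallback wins)
print defined_or(0),     "\n";   # 0        (0 is defined, so it is kept)
print defined_or(undef), "\n";   # default
```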
Chris Lutz
+3  A: 

I prefer to unpack @_ as a list (your second example). Though, like everything in Perl, there are instances where using shift can be useful. One example is pass-through methods in a base class that are intended to be overridden, where you want to make sure that things still work if they are not overridden.


package My::Base;
use Moose;
sub override_me { shift; return @_; }

notbenh
+12  A: 

At least on my systems, it seems to depend upon the version of Perl and architecture:

#!/usr/bin/perl -w
use strict;
use warnings;
use autodie;

use Benchmark qw( cmpthese );

print "Using Perl $] under $^O\n\n";

cmpthese(
    -1,
    {
        shifted   => 'call( \&shifted )',
        list_copy => 'call( \&list_copy )',
    }
);

sub call {
    $_[0]->(1..6);  # Call our sub with six dummy args.
}

sub shifted {
    my $foo   = shift;
    my $bar   = shift;
    my $baz   = shift;
    my $qux   = shift;
    my $quux  = shift;
    my $corge = shift;

    return;
}

sub list_copy {
    my ($foo, $bar, $baz, $qux, $quux, $corge) = @_;
    return;
}

Results:

Using Perl 5.008008 under cygwin

              Rate   shifted list_copy
shifted   492062/s        --      -10%
list_copy 547589/s       11%        --


Using Perl 5.010000 under MSWin32

              Rate list_copy   shifted
list_copy 416767/s        --       -5%
shifted   436906/s        5%        --


Using Perl 5.008008 under MSWin32

              Rate   shifted list_copy
shifted   456435/s        --      -19%
list_copy 563106/s       23%        --

Using Perl 5.008008 under linux

              Rate   shifted list_copy
shifted   330830/s        --      -17%
list_copy 398222/s       20%        --

So it looks like list_copy is roughly 10% to 25% faster than shifting under Perl 5.8.8, while under Perl 5.10 shifting is actually slightly (about 5%) faster!

Note that these were quickly derived results. Actual speed differences will be bigger than what's listed here, since Benchmark also counts the time taken to call and return the subroutines, which will have a moderating effect on the results. I haven't done any investigation to see if Perl is doing any special sort of optimisation. Your mileage may vary.

Paul

pjf
I'd expect which is faster to depend on the number of parameters, with more parameters favoring the list approach (same thing happens with foo($foo{bar}, $foo{baz}, ...) vs. foo(@foo{qw/bar baz .../}) )
ysth
There was a regression of sorts in 5.10.0 for @_ in list context, which was fixed in 5.10.1.
Brad Gilbert
+23  A: 

There's a functional difference. The shift modifies @_, and the assignment from @_ does not. If you don't need to use @_ afterward, that difference probably doesn't matter to you. I try to always use the list assignment, but I sometimes use shift.

However, if I start off with shift, like so:

 my( $param ) = shift;

I often create this bug:

 my( $param, $other_param ) = shift;

That's because I don't use shift that often, so I forget to get over to the right hand side of the assignment to change that to @_. That's the point of the best practice in not using shift. I could make separate lines for each shift as you did in your example, but that's just tedious.
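Both points can be checked with a short sketch (the sub names here are hypothetical): list assignment leaves @_ intact, shift removes an element from it, and assigning a single shift to a two-variable list silently leaves the second variable undefined.

```perl
use strict;
use warnings;

# List assignment copies the arguments and leaves @_ intact.
sub count_after_copy {
    my ($first) = @_;
    return scalar @_;
}

# shift removes the first argument from @_.
sub count_after_shift {
    my $first = shift;
    return scalar @_;
}

# The bug: shift returns only one value, so $other_param stays undef.
sub buggy {
    my ( $param, $other_param ) = shift;
    return defined $other_param ? 'got both' : 'second is undef';
}

print count_after_copy( 1, 2, 3 ),  "\n";   # 3
print count_after_shift( 1, 2, 3 ), "\n";   # 2
print buggy( 'a', 'b' ), "\n";              # second is undef
```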

brian d foy
+2  A: 

I prefer using

sub do_something_fantastical {
    my ( $foo, $bar, $baz, $qux, $quux, $corge ) = @_;
}

Because it is more readable. When the code is not called very often, the readability is worth it. In very rare cases you want to make a frequently called function fast, and then you use @_ directly. That is effective only for very short functions, and you must be sure that the function will not evolve in the future (a write-once function). For this case I benchmarked on 5.8.8 that for a single parameter shift is faster than $_[0], but for two parameters using $_[0] and $_[1] is faster than shift, shift.

sub fast1 { shift->call(@_) }

sub fast2 { $_[0]->call("a", $_[1]) }

But back to your question. For many parameters I also prefer assigning @_ on one line over a series of shifts, like this:

sub do_something_fantastical2 {
    my ( $foo, $bar, $baz, @rest ) = @_;
    ...
}

That is when I suspect @rest will not be very big. Otherwise:

sub raise {
    my $inc = shift;
    map {$_ + $inc} @_;
}

sub moreSpecial {
    my ($inc, $power) = (shift(), shift());
    map {($_ + $inc) ** $power} @_;
}

sub quadratic {
    my ($a, $b, $c) = splice @_, 0, 3;
    map {$a*$_*$_ + $b*$_ + $c} @_;
}

In rare cases I need tail call optimization (by hand, of course). Then I must work directly with @_, which is worthwhile only for short functions.

sub _switch    #(type, treeNode, transform[, params, ...])
{
    my $type = shift;
    my ( $treeNode, $transform ) = @_;
    unless ( defined $type ) {
        require Data::Dumper;
        # Dump expects an array ref of values, then an array ref of names.
        die "Broken node: " . Data::Dumper->Dump( [$treeNode], ['treeNode'] );
    }
    goto &{ $transform->{$type} }   if exists $transform->{$type};
    goto &{ $transform->{unknown} } if exists $transform->{unknown};
    die "Unknown type $type";
}

sub switchExTree    #(treeNode, transform[, params, ...])
{
    my $treeNode = $_[0];
    unshift @_, $treeNode->{type};    # set type
    goto &_switch;                    # tail call
}

sub switchCompact                     #(treeNode, transform[, params, ...])
{
    my $treeNode = $_[0];
    unshift @_, (%$treeNode)[0];      # set type given as first key
    goto &_switch;                    # tail call
}

sub ExTreeToCompact {
    my $tree = shift;
    return switchExTree( $tree, \%transformExTree2Compact );
}

sub CompactToExTree {
    my $tree = shift;
    return switchCompact( $tree, \%transformCompact2ExTree );
}

Here %transformExTree2Compact and %transformCompact2ExTree are hashes with the type as the key and a code ref as the value, and those code refs can themselves tail call switchExTree or switchCompact. But this approach is rarely really needed, and less experienced colleagues should keep their fingers off it.

In conclusion, readability and maintainability are a must, especially in Perl, and assigning @_ on one line is better. If you want to set defaults, you can do it right after the assignment.

Hynek -Pichi- Vychodil
I read somewhere that modifying @_ will confuse the debugger.
Brad Gilbert
I've been using Perl professionally for seven years and have written hundreds of thousands of lines of Perl, but I have never used the debugger. I haven't found it useful. Much more useful are modularizing code, (unit and component) testing and, when debugging is needed, placing Dumper calls.
Hynek -Pichi- Vychodil
A: 

sub _switch    #(type, treeNode, transform[, params, ...])
{
    my $type = shift;
    my ( $treeNode, $transform ) = @_;
    unless ( defined $type ) {
        require Data::Dumper;
        die "Broken node: " . Data::Dumper->Dump( [$treeNode], ['treeNode'] );
    }
    goto &{ $transform->{$type} }   if exists $transform->{$type};
    goto &{ $transform->{unknown} } if exists $transform->{unknown};
    die "Unknown type $type";
}

I'm not sure whether this piece of code will work or not.

What is the unless keyword?
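For reference, unless is a core Perl keyword rather than a tag: `unless (COND) { ... }` runs the block when COND is false, exactly like `if (!COND) { ... }`. A minimal sketch, using a hypothetical sub name:

```perl
use strict;
use warnings;

# Hypothetical helper: the unless block runs only when $type is undefined.
sub describe_type {
    my ($type) = @_;
    unless ( defined $type ) {    # same as: if ( !defined $type )
        return 'broken';
    }
    return "type $type";
}

print describe_type('leaf'), "\n";   # type leaf
print describe_type(undef),  "\n";   # broken
```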
A: 

I suspect if you're doing the (rough) equivalent of:

push @bar, shift @_ for (1 .. $big_number);

Then you're doing something wrong. I almost always use the my ($foo, $bar) = @_; form because I've shot myself in the foot using shift a few too many times ...

singingfish
A: 

I prefer

sub factorial{
  my($n) = @_;

  ...

}

For the simple reason that Komodo will then be able to tell me what the arguments are, when I go to use it.

Brad Gilbert
A: 

The best way, IMHO, is a slight mixture of the two, as in the new function in a module:

our $Class;    
sub new {
    my $Class = shift;
    my %opts = @_;
    my $self = \%opts;
    # validate %opts for correctness
    ...
    bless $self, $Class;
}

Then, all calling arguments to the constructor are passed as a hash, which makes the code much more readable than just a list of parameters.

Plus, as brian said, the assignment from @_ leaves it unmodified (apart from the class name already removed by shift), which can be useful in unusual cases.
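A sketch of how such a constructor might be called, assuming a hypothetical package My::Widget and illustrative option names (`name`, `size`); the validation shown is only a placeholder for whatever checks the real module needs.

```perl
use strict;
use warnings;

package My::Widget;    # hypothetical class name, for illustration only

sub new {
    my $class = shift;    # removes the class name from @_
    my %opts  = @_;       # the remaining arguments are key/value pairs
    die "name is required" unless defined $opts{name};
    my $self = {%opts};
    return bless $self, $class;
}

package main;

# Named arguments can be given in any order and document themselves.
my $w = My::Widget->new( size => 3, name => 'gear' );
print ref($w), " $w->{name} $w->{size}\n";   # My::Widget gear 3
```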

xcramps