views:

326

answers:

6

option A:

print $fh $hr->{'something'}, "|", $hr->{'somethingelse'}, "\n";

option B:

print $fh $hr->{'something'} . "|" . $hr->{'somethingelse'} . "\n";
+13  A: 

Unless you are executing millions of these statements, the performance difference will not matter. I really suggest concentrating on performance problems where they do exist - and the only way to find that out is to profile your application.

Premature optimization is something that Joel and Jeff had a podcast on, and whined about, for years. It's just a waste of time to try to optimize something until you KNOW that it's slow.

Alex
I am executing tens of thousands of them.
Trivial amount. Is this a performance-critical app?
Chris Simmons
+3  A: 

UPDATE: I just ran my own test.

At 1,000,000 iterations, each version took < 1 second.

At 10 million iterations, the list version averaged 2.35 seconds vs. 2.1 seconds for the string-concat version.

DVK
See my various benchmarking talks and maybe the benchmarking chapter in Mastering Perl to see why those numbers are meaningless.
brian d foy
OK, I'll bite. What are these talks and where can I access them? :) [I do own a copy of M.P. and will re-read the chapter.] I don't take the benchmarks as the holy grail, but ceteris paribus they could be of some use, and statistics don't always lie :)
DVK
@DVK: I would imagine he means here: http://www252.pair.com/comdog/
Telemachus
@DVK: I don't know where all my talks are. I just google for them when I need them. Seriously, I google for my own stuff.
brian d foy
+3  A: 

Have you actually tried profiling this? Only takes a few seconds.

On my machine, it appears that B is faster. However, you should really have a look at Pareto Analysis. You've already wasted far, far more time thinking about this question than you'd ever save in any program run. For problems as trivial as this (character substitution!), you should wait to care until you actually have a problem.
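
If you do want to measure it yourself, a quick comparison takes only a few lines with the core Benchmark module (the /dev/null target and the hash contents below are stand-ins, not from the question):

#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# stand-in data; substitute your own filehandle and hash
my $hr = { something => 'foo', somethingelse => 'bar' };
open my $fh, '>', '/dev/null' or die $!;

cmpthese(-3, {    # run each variant for at least 3 CPU seconds
    list   => sub { print $fh $hr->{'something'}, "|", $hr->{'somethingelse'}, "\n" },
    concat => sub { print $fh $hr->{'something'} . "|" . $hr->{'somethingelse'} . "\n" },
});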

Chris Simmons
http://fetter.org/optimization.html and your comment should be an 8th rule: "8. If the time it takes you to think about optimization outweighs the gains you can possibly receive, optimization is over."
jsoverson
@jsoverson In very rare circumstances, this is not true. Shaving a hair of a nanosecond may mean the difference between working and not working for things involving parts in the real world (think pacemakers). The amount of time saved over the lifespan of the program may never reach the amount of time spent making it work, but if that hair of a nanosecond is necessary, it doesn't matter. That said, when you do go looking for that tiny slice of time, you should start with profiling, etc.
Chas. Owens
@Chas: sure, but you don't use Perl in those cases because there's nothing that guarantees performance. However, you do not contradict what @jsoverson says; it's all about the benefit versus the cost. Working == benefit.
brian d foy
Another example for Chas. Owens' comment, one that may be more applicable to Perl: a programmer investing a week to optimize an application before it is demoed to a lot of CEOs who are interested in purchasing it; even if the total time saved is only a couple of minutes (a few seconds per CEO).
Inshallah
@Chas If you're asking trivial micro optimization questions on Stack Overflow I hope to GOD you're not writing medical software.
Schwern
@Schwern I chose a medical device due to recent exposure to a hospital. Anything involving the real world may require that sort of speed: robotics, for example.
Chas. Owens
A: 

Of the three options, I would probably choose string interpolation first and switch to commas for expressions that cannot be interpolated. This, humorously enough, means that my default choice is the slowest of the bunch, but given that they are all so close to each other in speed and that disk speed is probably going to be slower than anything else, I don't believe changing the method has any real performance benefits.

As others have said, write the code, then profile the code, then examine the algorithms and data structures you have chosen that are in the slow parts of the code, and, finally, look at the implementation of the algorithms and data structures. Anything else is foolish micro-optimizing that wastes more time than it saves.

You may also want to read perldoc perlperf.

           Rate string concat  comma
string 803887/s     --    -0%    -7%
concat 803888/s     0%     --    -7%
comma  865570/s     8%     8%     --
#!/usr/bin/perl

use strict;
use warnings;

use Carp;
use List::Util qw/first/;
use Benchmark;

sub benchmark {
    my $subs = shift;

    my ($k, $sub) = each %$subs;
    my $value = $sub->();
    croak "bad" if first { $value ne $_->() and print "$value\n", $_->(), "\n" } values %$subs;

    Benchmark::cmpthese -1, $subs;
}

sub fake_print {
    # this, plus actually writing the output, is what print does:
    # print joins its arguments with $, (the output field separator)
    no warnings;
    my $output = join $,, @_;
    return $output;
}

my ($x, $y) = ("a", "b");
benchmark {
    comma  => sub { return fake_print $x, "|", $y, "\n"     },
    concat => sub { return fake_print $x .  "|" . $y . "\n" },
    string => sub { return fake_print "$x|$y\n"             },
};
Chas. Owens
Your fake_print() might simulate the user level steps that the real print() goes through, but it does not do them in the same way and thus cannot be used for benchmarking the performance of print(). What you actually benchmarked is the difference between passing arguments as a list and concatenating them. There is also a small performance difference between calling a subroutine with one argument and many which further poisons the results. And, most importantly, by not calling print it misses the vital point that I/O swamps all other performance considerations.
Schwern
+4  A: 

Perl is a high-level language, and as such the statements you see in the source code don't map directly to what the computer is actually going to do. You might find that a particular implementation of perl makes one thing faster than the other, but that's no guarantee that another implementation won't take away the advantage (although they try not to make things slower).

If you're worried about I/O speed, there are a lot more interesting and useful things to tweak before you start worrying about commas and periods. See, for instance, the discussion under Perl write speed mystery.
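
For instance (a sketch, not from the linked discussion; the data, filehandle, and key names are all illustrative), batching many records into one large print is the kind of I/O-level change that can matter far more than commas versus periods:

# Sketch: buffer many records and write in large chunks instead of
# issuing one print per record.
my @records = map { { something => "a$_", somethingelse => "b$_" } } 1 .. 100_000;
open my $fh, '>', '/dev/null' or die $!;

my $buf = '';
for my $hr (@records) {
    $buf .= "$hr->{something}|$hr->{somethingelse}\n";
    if (length($buf) >= 65_536) {    # flush roughly every 64KB
        print $fh $buf;
        $buf = '';
    }
}
print $fh $buf if length $buf;    # flush the remainder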

brian d foy
Brian, I'm afraid I must disagree with the latter point - I don't think he was worried about IO speed per se, since the actual output to the IO device would be 100% identical. Although I completely agree with the overall tenor of the idea that there are MUCH more impactful things to optimize in an average Perl program than this specific syntactic difference.
DVK
@DVK: see his follow-up comment to his question. He wants to know which is faster.
brian d foy
+12  A: 

The answer is simple: it doesn't matter. As many folks have pointed out, this is not going to be your program's bottleneck. Even optimizing this to happen instantly is unlikely to have any effect on your performance. You must profile first, otherwise you are just guessing and wasting your time.

If we are going to waste time on it, let's at least do it right. Below is the code to do a realistic benchmark. It actually does the print and sends the benchmarking information to STDERR. You run it as perl benchmark.plx > /dev/null to keep the output from flooding your screen.

Here's 5 million iterations writing to STDOUT. By using both timethese() and cmpthese() we get all the benchmarking data.

$ perl ~/tmp/bench.plx 5000000 > /dev/null
Benchmark: timing 5000000 iterations of concat, list...
    concat:  3 wallclock secs ( 3.84 usr +  0.12 sys =  3.96 CPU) @ 1262626.26/s (n=5000000)
      list:  4 wallclock secs ( 3.57 usr +  0.12 sys =  3.69 CPU) @ 1355013.55/s (n=5000000)
            Rate concat   list
concat 1262626/s     --    -7%
list   1355014/s     7%     --

And here's 5 million writing to a temp file:

$ perl ~/tmp/bench.plx 5000000
Benchmark: timing 5000000 iterations of concat, list...
    concat:  6 wallclock secs ( 3.94 usr +  1.05 sys =  4.99 CPU) @ 1002004.01/s (n=5000000)
      list:  7 wallclock secs ( 3.64 usr +  1.06 sys =  4.70 CPU) @ 1063829.79/s (n=5000000)
            Rate concat   list
concat 1002004/s     --    -6%
list   1063830/s     6%     --

Note the extra wallclock and sys time, underscoring how what you're printing to matters as much as what you're printing.

The list version is about 5% faster (note this is counter to Pavel's logic, underlining the futility of trying to just think this stuff through). You said you're doing tens of thousands of these? Let's see... 100k takes 146ms of wallclock time on my laptop (which has crappy I/O), so the best you can do here is to shave off about 7ms. Congratulations. If you spent even a minute thinking about this, it will take 40k iterations of that code before you've made up that time. This is not to mention the opportunity cost: in that minute you could have been optimizing something far more important.

Now, somebody's going to say "now that we know which way is faster we should write it the fast way and save that time in every program we write, making the whole exercise worthwhile!" No. First, it will still add up to an insignificant portion of your program's run time, far less than the 5% you get measuring a single statement. Second, logic like that causes you to prioritize micro-optimizations over maintainability.

Oh, and it's different in 5.8.8 than in 5.10.0.

$ perl5.8.8 ~/tmp/bench.plx 5000000 > /dev/null
Benchmark: timing 5000000 iterations of concat, list...
    concat:  3 wallclock secs ( 3.69 usr +  0.04 sys =  3.73 CPU) @ 1340482.57/s (n=5000000)
      list:  5 wallclock secs ( 3.97 usr +  0.06 sys =  4.03 CPU) @ 1240694.79/s (n=5000000)
            Rate   list concat
list   1240695/s     --    -7%
concat 1340483/s     8%     --

It might even change depending on what Perl I/O layer you're using and what operating system you're on. So the whole exercise is futile.
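
For illustration (paths made up; see perldoc PerlIO), the layer stack is chosen at open time and changes the buffering behavior a benchmark like this would measure:

# two opens through different PerlIO layer stacks
open my $buffered, '>',      '/tmp/a.out' or die $!;  # default buffered stack
open my $raw,      '>:unix', '/tmp/b.out' or die $!;  # raw, unbuffered syscall layer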

Micro-optimization is a fool's game. Always profile first and look at optimizing your algorithm. Devel::NYTProf is an excellent profiler.
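
Running it is two commands (the script name here is just a placeholder; nytprofhtml ships with Devel::NYTProf):

$ perl -d:NYTProf yourscript.pl   # writes profile data to ./nytprof.out
$ nytprofhtml                     # turns it into an HTML report in ./nytprof/

And here is the benchmark script itself: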

#!/usr/bin/perl -w

use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

#open my $fh, ">", "/tmp/test.out" or die $!;  # uncomment to benchmark a file
#open my $fh, ">", "/dev/null" or die $!;      # ...or the bit bucket
my $fh = *STDOUT;
my $hash = {
    foo => "something and stuff",
    bar => "and some other stuff"
};

select *STDERR;  # send the benchmark report to STDERR, away from the timed output
my $r = timethese(shift || -3, {
    list => sub {
        print $fh $hash->{foo}, "|", $hash->{bar};
    },
    concat => sub {
        print $fh $hash->{foo}. "|". $hash->{bar};
    },
});
cmpthese($r);
Schwern
My system keeps telling me that list is slower, even when I run your benchmark. This also shows that you can't choose one of these variants: on some systems the former is faster, on others the latter.
Pavel Shved
Please stop wasting time putting up experiments like this. They've been done millions of times. The performance difference doesn't matter.
Alex