option A:
print $fh $hr->{'something'}, "|", $hr->{'somethingelse'}, "\n";
option B:
print $fh $hr->{'something'} . "|" . $hr->{'somethingelse'} . "\n";
Unless you are executing millions of these statements, the performance difference will not matter. I really suggest concentrating on performance problems where they do exist - and the only way to find that out is to profile your application.
Premature optimization is something Joel and Jeff spent years whining about on their podcast. It's just a waste of time to try to optimize something until you KNOW that it's slow.
UPDATE: I just ran my own test.
1,000,000 iterations of each version took under 1 second each.
10 million iterations of each version took an average of 2.35 seconds for the list version vs. 2.1 seconds for the string-concat version - a difference of roughly 25 nanoseconds per print.
Have you actually tried profiling this? Only takes a few seconds.
On my machine, it appears that B is faster. However, you should really have a look at Pareto Analysis. You've already wasted far, far more time thinking about this question than you'd ever save in any program run. For problems as trivial as this (character substitution!), you should wait to care until you actually have a problem.
Of the three options, I would probably choose string interpolation first and switch to commas for expressions that cannot be interpolated. This, humorously enough, means that my default choice is the slowest of the bunch, but given that they are all so close to each other in speed, and that disk speed is probably going to be slower than anything else, I don't believe changing the method has any real performance benefit.
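For reference, the interpolated version of the statement from the question would look like this (hash lookups interpolate directly inside double-quoted strings):

print $fh "$hr->{'something'}|$hr->{'somethingelse'}\n";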
As others have said, write the code, then profile the code, then examine the algorithms and data structures you have chosen that are in the slow parts of the code, and, finally, look at the implementation of the algorithms and data structures. Anything else is foolish micro-optimizing that wastes more time than it saves.
You may also want to read perldoc perlperf.
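It ships with the core documentation on recent Perls; from a shell prompt:

$ perldoc perlperf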
            Rate  string  concat   comma
string  803887/s      --     -0%     -7%
concat  803888/s      0%      --     -7%
comma   865570/s      8%      8%      --
#!/usr/bin/perl
use strict;
use warnings;
use Carp;
use List::Util qw/first/;
use Benchmark;

# Sanity-check that every sub returns the same string, then compare speeds.
sub benchmark {
    my $subs = shift;
    my ($k, $sub) = each %$subs;   # grab an arbitrary sub as the reference
    my $value = $sub->();
    croak "bad" if first { $value ne $_->() and print "$value\n", $_->(), "\n" } values %$subs;
    Benchmark::cmpthese -1, $subs; # run each sub for at least 1 CPU second
}

sub fake_print {
    # this, plus actually writing the output, is what print does
    no warnings;                   # $, is usually undef, so join would warn
    my $output = join $,, @_;
    return $output;
}

my ($x, $y) = ("a", "b");

benchmark {
    comma  => sub { return fake_print $x, "|", $y, "\n" },
    concat => sub { return fake_print $x . "|" . $y . "\n" },
    string => sub { return fake_print "$x|$y\n" },
};
Perl is a high-level language, and as such the statements you see in the source code don't map directly to what the computer is actually going to do. You might find that a particular implementation of perl makes one thing faster than the other, but that's no guarantee that another implementation won't take the advantage away (although implementers try not to make things slower).
If you're worried about I/O speed, there are a lot more interesting and useful things to tweak before you start worrying about commas and periods. See, for instance, the discussion under Perl write speed mystery.
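For instance, the PerlIO layer you write through is a much bigger lever than commas versus periods. Here's a sketch of the kind of comparison worth making (the file paths are placeholders): the :unix layer bypasses PerlIO buffering, so every print becomes its own write() syscall, while the default buffered layer coalesces them.

use strict;
use warnings;

open my $buffered, '>',      '/tmp/buffered.out' or die $!;  # default, buffered
open my $raw,      '>:unix', '/tmp/raw.out'      or die $!;  # unbuffered

print $buffered "x" for 1 .. 100_000;  # coalesced into a few large writes
print $raw      "x" for 1 .. 100_000;  # 100,000 one-byte syscalls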
The answer is simple: it doesn't matter. As many folks have pointed out, this is not going to be your program's bottleneck. Even optimizing it to take no time at all is unlikely to have any measurable effect on your performance. You must profile first, otherwise you are just guessing and wasting your time.
If we are going to waste time on it, let's at least do it right. Below is the code to do a realistic benchmark. It actually does the print and sends the benchmarking information to STDERR. You run it as perl benchmark.plx > /dev/null to keep the output from flooding your screen.
Here's 5 million iterations writing to STDOUT. By using both timethese() and cmpthese() we get all the benchmarking data.
$ perl ~/tmp/bench.plx 5000000 > /dev/null
Benchmark: timing 5000000 iterations of concat, list...
concat: 3 wallclock secs ( 3.84 usr + 0.12 sys = 3.96 CPU) @ 1262626.26/s (n=5000000)
list: 4 wallclock secs ( 3.57 usr + 0.12 sys = 3.69 CPU) @ 1355013.55/s (n=5000000)
             Rate  concat    list
concat  1262626/s      --     -7%
list    1355014/s      7%      --
And here's 5 million writing to a temp file:
$ perl ~/tmp/bench.plx 5000000
Benchmark: timing 5000000 iterations of concat, list...
concat: 6 wallclock secs ( 3.94 usr + 1.05 sys = 4.99 CPU) @ 1002004.01/s (n=5000000)
list: 7 wallclock secs ( 3.64 usr + 1.06 sys = 4.70 CPU) @ 1063829.79/s (n=5000000)
             Rate  concat    list
concat  1002004/s      --     -6%
list    1063830/s      6%      --
Note the extra wallclock and sys time, underscoring how what you're printing to matters as much as what you're printing.
The list version is about 5% faster (note this is counter to Pavel's logic, underlining the futility of trying to just think this stuff through). You said you're doing tens of thousands of these? Let's see... 100k takes 146ms of wallclock time on my laptop (which has crappy I/O), so the best you can do here is to shave off about 7ms. Congratulations. If you spent even a minute thinking about this, it will take 40k iterations of that code before you've made up that time. This is not to mention the opportunity cost: in that minute you could have been optimizing something far more important.
Now, somebody's going to say "now that we know which way is faster we should write it the fast way and save that time in every program we write, making the whole exercise worthwhile!" No. First, it will still add up to an insignificant portion of your program's run time, far less than the 5% you get measuring a single statement. Second, logic like that causes you to prioritize micro-optimizations over maintainability.
Oh, and it's different in 5.8.8 than in 5.10.0.
$ perl5.8.8 ~/tmp/bench.plx 5000000 > /dev/null
Benchmark: timing 5000000 iterations of concat, list...
concat: 3 wallclock secs ( 3.69 usr + 0.04 sys = 3.73 CPU) @ 1340482.57/s (n=5000000)
list: 5 wallclock secs ( 3.97 usr + 0.06 sys = 4.03 CPU) @ 1240694.79/s (n=5000000)
             Rate    list  concat
list    1240695/s      --     -7%
concat  1340483/s      8%      --
It might even change depending on which Perl I/O layer you're using and which operating system you're on. So the whole exercise is futile.
Micro-optimization is a fool's game. Always profile first and look to optimizing your algorithm. Devel::NYTProf is an excellent profiler.
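Typical usage (nytprof.out and nytprofhtml are the module's defaults; the script name is a placeholder):

$ perl -d:NYTProf your_script.pl    # profiles the run, writes ./nytprof.out
$ nytprofhtml                       # renders nytprof.out as an HTML report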
#!/usr/bin/perl -w
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# bench.plx - pick an output target: a real file, the bit bucket, or STDOUT.
#open my $fh, ">", "/tmp/test.out" or die $!;
#open my $fh, ">", "/dev/null" or die $!;
my $fh = *STDOUT;

my $hash = {
    foo => "something and stuff",
    bar => "and some other stuff"
};

# Report to STDERR so the benchmark results don't get mixed into the
# output being measured (that's why STDOUT gets redirected to /dev/null).
select *STDERR;

my $r = timethese(shift || -3, {
    list => sub {
        print $fh $hash->{foo}, "|", $hash->{bar};
    },
    concat => sub {
        print $fh $hash->{foo} . "|" . $hash->{bar};
    },
});
cmpthese($r);