
I'm optimizing some frequently run Perl code (once per day per file).

Do comments slow Perl scripts down? My experiments lean towards no:

use Benchmark;
timethese(20000000, {
    'comments'   => '$b=1;
# comment  ... (100 times)
',
    'nocomments' => '$b=1;',
});

Gives pretty much identical values (apart from noise).

Benchmark: timing 10000000 iterations of comments, nocomments...
  comments:  1 wallclock secs ( 0.53 usr +  0.00 sys =  0.53 CPU) @ 18832391.71/s (n=10000000)
nocomments:  0 wallclock secs ( 0.44 usr +  0.00 sys =  0.44 CPU) @ 22935779.82/s (n=10000000)

Benchmark: timing 20000000 iterations of comments, nocomments...
  comments:  0 wallclock secs ( 0.86 usr + -0.01 sys =  0.84 CPU) @ 23696682.46/s (n=20000000)
nocomments:  1 wallclock secs ( 0.90 usr +  0.00 sys =  0.90 CPU) @ 22099447.51/s (n=20000000)

I get similar results if I run the comments and no-comments versions as separate Perl scripts.

It seems counter-intuitive, though; if nothing else, the interpreter needs to read the comments into memory every time.

+8  A: 

Perl compiles a script and then executes it. Comments marginally slow the compile phase, but have zero effect on the run phase.
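
As a quick illustration of the two phases (a minimal sketch, not code from the question): a BEGIN block runs as soon as the compiler reaches it, ordinary statements wait for the run phase, and by then every comment has already been discarded.

use strict;
use warnings;

BEGIN { print "compile phase: runs while the file is still being parsed\n"; }

# This comment exists only in the source text; it never reaches the
# compiled program, so it cannot cost anything at run time.
print "run phase: runs only after the whole file has compiled\n";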

Jonathan Leffler
+15  A: 

Runtime performance? No.

Parsing and lexing performance? Yes, of course.

Since Perl tends to parse and lex on the fly, comments will affect "start-up" performance.

Will they affect it noticeably? Unlikely.

Will Hartung
A: 

I would expect the comment to be parsed only once, not multiple times in the loop, so I doubt this is a valid test.

I would expect comments to slow compilation slightly, but too little to make removing them worthwhile.
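
One way to check this (a sketch, relying on Benchmark wrapping a code string in a loop and compiling it once): put a BEGIN block in the timed string and count how often it fires. The message should appear only once, not a million times, confirming that the comment is parsed once rather than per iteration.

use Benchmark;

# BEGIN fires when the string is *compiled*. If Benchmark recompiled the
# string on every iteration, we would see a million messages; in practice
# it should print just once.
timethese(1000000, {
    'check' => 'BEGIN { print STDERR "string compiled\n" } $b=1;',
});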

Rob Prouse
+1  A: 

From Paul Tomblin's comment:

Doesn't perl do some sort of on-the-fly compilation? Maybe the comments get discarded early?

Yes, Perl does.

It is a programming language somewhere between compiled and interpreted. The code gets compiled on the fly and then run; the comments usually don't make any difference. The most they could affect is the initial pass where the file is parsed line by line and pre-compiled, and there you might see a nanosecond of difference.

stephenbayer
A: 

Do Perl comments slow a script down? Parsing it, yes. Executing it after it has been parsed? No. How often is a script parsed? Only once. So if you have a comment inside a for loop, the parser discards it once, before the script even runs; by the time the script is running, the comment is already gone (Perl does not keep the source text around internally), so no matter how many times the loop repeats, the comment has no influence. How fast can the parser skip over comments? Given how Perl comments work, very fast, so I doubt you will ever notice. You would notice a higher start-up time if you had 5 lines of code with a million lines of comments between each of them... but how likely is that, and of what use would a comment that large be?

Mecki
+7  A: 

Perl is not a scripting language in the same sense that shell scripts are. The interpreter does not read the file line by line. The execution of a Perl program is done in two basic stages: compilation and runtime [1]. During the compilation stage the source code is parsed and converted into bytecode. During the runtime stage the bytecode is executed on a virtual machine.
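
You can watch the comments disappear with B::Deparse, which ships with Perl and reconstructs source code from the compiled op tree (a quick sketch):

perl -MO=Deparse -e '$x = 1;  # this comment is gone after compilation'

prints

$x = 1;
-e syntax OK

The comment never made it into the op tree, and the op tree is all that exists at run time.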

Comments will slow down the parsing stage but the difference is negligible compared to the time required to parse the script itself (which is already very small for most programs). About the only time you're really concerned with parsing time is in a webserver environment where the program could be called many times per second. mod_perl exists to solve this problem.

You're using Benchmark. That's good! You should be looking for ways to improve the algorithm -- not micro-optimizing. Devel::DProf might be helpful to find any hot spots. You absolutely should not strip comments in a misguided attempt to make your program faster. You'll just make it unmaintainable.


[1] This is commonly called "just in time" compilation. Perl actually has several more stages like INIT and END that don't matter here.

Michael Carman
Devel::DProf is the old brokenness with only subroutine-level profiling. Devel::NYTProf is the new hotness with finer granularity.
brian d foy
+3  A: 

The point is: optimize bottlenecks. Reading in a file consists of:

  • opening the file,
  • reading in its contents,
  • closing the file,
  • parsing the contents.

Of these steps, reading is by far the fastest part (I am not sure about closing; it is a syscall, but you don't have to wait for it to finish). Even if reading were 10% of the whole thing (which I don't think it is), then reducing it by half would only give a 5% performance improvement, at the cost of losing the comments (which is a very bad thing). For the parser, throwing away a line that begins with # is no tangible slowdown, and after that the comments are gone, so there can be no slowdown at all.

Now, imagine that you could actually improve the "reading in the script" part by 5% by stripping all comments (a really optimistic estimate, see above). How big is the share of "reading in the script" in the overall time consumption of the script? It depends on how much the script does, but since Perl scripts usually read at least one more file, it is 50% at most, and since they usually do quite a bit more, an honest estimate brings it down to somewhere around 1%. The expected gain from stripping all comments is therefore at most 5% of 50%, i.e. 2.5% (very optimistic), and more realistically 5% of 1%, i.e. 0.05%. And the scripts where it actually gains more than 1% are already fast, because they do almost nothing, so you would again be optimizing in the wrong place.

Concluding, optimize bottlenecks.

Svante
If I were to add an entry, I'd point out that end-of-line comments are by far the easiest to discard. POD is probably the next easiest: whole lines, can't be nested (when you hit =cut, it's done), a definite block of lines...
Axeman
Actually, reading a file starts with finding the file. If Perl has to search through a long @INC, that could be significant. See, for instance, http://www.perl.com/lpt/a/2005/12/21/a_timely_start.html
brian d foy
Yes, but here I assumed a direct invocation including a path. If some cron job does not specify the script's location, then that is a bottleneck actually worth optimizing.
Svante
+11  A: 

Perl is a just-in-time compiled language, so comments and POD have no effect on run-time performance.

Comments and POD have a minuscule effect on compile time, but they're so easy and fast for Perl to parse that it's almost impossible to measure the performance hit. You can see this for yourself by using the -c flag, which compiles a program without running it.

On my MacBook, a Perl program with 2 print statements and 1000 lines of 70-character comments takes the same time to compile as one with 1000 empty comment lines, and the same as one with just the 2 print statements. Be sure to run each benchmark twice so your OS can cache the file; otherwise what you're really measuring is the time to read the file from disk.
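
Here is roughly how to reproduce that measurement (a sketch under assumptions: the file names and the write_script helper are invented for illustration, $^X is the currently running perl, and the shell redirection assumes a Unix-like system):

use strict;
use warnings;
use Benchmark qw(timethese);

# Invented helper: write a test script with $n_comments comment lines of
# about 70 characters between two print statements.
sub write_script {
    my ($path, $n_comments) = @_;
    open my $fh, '>', $path or die "open $path: $!";
    print {$fh} "print \"hello\\n\";\n";
    print {$fh} '# ', 'x' x 68, "\n" for 1 .. $n_comments;
    print {$fh} "print \"bye\\n\";\n";
    close $fh or die "close $path: $!";
}

write_script('with_comments.pl',    1000);
write_script('without_comments.pl', 0);

# -c compiles the script without running it, so this times the compile
# phase (plus interpreter start-up, which dominates and drowns out any
# difference -- which is the point).
timethese(200, {
    'comments'   => sub { system qq{$^X -c with_comments.pl    > /dev/null 2>&1} },
    'nocomments' => sub { system qq{$^X -c without_comments.pl > /dev/null 2>&1} },
});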

If startup time is a problem for you, it's not because of comments and POD.

Schwern
So the answer I'm going with is "barely". Thanks, all.
Dave
+2  A: 

The Benchmark module is useless in this case: it only measures the time to run the code over and over again. Since your code doesn't actually do anything, most of it is optimized away. That's why you're seeing it run 22 million times a second.

I have almost an entire chapter about this in Mastering Perl. The error of measurement in the Benchmark technique is about 7%, and your numbers are well within that, so there's virtually no difference.
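
If you want numbers that mean something, give the benchmark real work to do so the body can't be optimized away, and let each case run for a fixed amount of CPU time (a sketch; the sub bodies are arbitrary examples, not the question's code):

use strict;
use warnings;
use Benchmark qw(cmpthese);

# A negative count tells Benchmark to run each candidate for at least
# that many CPU seconds, which keeps the measurement error down.
cmpthese(-3, {
    'with_work' => sub { my $s = 0; $s += $_ for 1 .. 100; $s },
    'trivial'   => sub { my $b = 1 },
});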

brian d foy