views:

135

answers:

3

I've just been caught out when using someone else's API in conjunction with the default variable $_

foreach (@rps_server_details) {
    @server_data = ();
    @server_data = split(/,/);
    @$esp_hosts = ();
    $filters{server_name} = $server_data[0];
    print "--->$_<--\n";
    $esp_hosts = $esp->get_hosts(fields => $fields, %filters) || die "$@";
    print "--->$_<--\n";

The output for this is:

--->igrid8873.someone.com,app_10<--
Use of uninitialized value in concatenation (.) or string at ./rps_inv_lookup.pl line 120.
---><--

Specifying my own loop variable instead of relying on $_ fixes the problem.

Am I just being naive by using $_ in conjunction with an API someone else has written? Or is this a bug in that API module?

+6  A: 

I would say it's:

  1. a violation of best practices on your part (always use as-local-as possible variable scope and avoid using $_ due to just the issue your encountered)

  2. coupled with a bug in the API caused by the same violation of the best practices as well as not localizing the special variable with local $_ as proscribed by perldoc perlvar.

In addition to perldoc, the API violates Perl Best Practices (as in Conway's book's rules):

Section 5.6. Localizing Punctuation Variables

If you're forced to modify a punctuation variable, localize it.

The problems described earlier under "Localization can also crop up whenever you're forced to change the value in a punctuation variable (often in I/O operations). All punctuation variables are global in scope. They provide explicit control over what would be completely implicit behaviours in most other languages: output buffering, input line numbering, input and output line endings, array indexing, et cetera.

It's usually a grave error to change a punctuation variable without first localizing it. Unlocalized assignments can potentially change the behaviour of code in entirely unrelated parts of your system, even in modules you did not write yourself but are merely using.

Using local is the cleanest and most robust way to temporarily change the value of a global variable. It should always be applied in the smallest possible scope, so as to minimize the effects of any "ambient behaviour" the variable might control:

Here's full perldoc perlvar documentation as well - search for the word "nasty_break" in the web page (I couldn't find direct in-page link but it's close to the start of the page)

You should be very careful when modifying the default values of most special variables described in this document. In most cases you want to localize these variables before changing them, since if you don't, the change may affect other modules which rely on the default values of the special variables that you have changed. This is one of the correct ways to read the whole file at once:

  1. open my $fh, "<", "foo" or die $!;
  2. local $/; # enable localized slurp mode
  3. my $content = ;
  4. close $fh;

But the following code is quite bad:

  1. open my $fh, "<", "foo" or die $!;
  2. undef $/; # enable slurp mode
  3. my $content = ;
  4. close $fh;

since some other module, may want to read data from some file in the default "line mode", so if the code we have just presented has been executed, the global value of $/ is now changed for any other code running inside the same Perl interpreter.

Usually when a variable is localized you want to make sure that this change affects the shortest scope possible. So unless you are already inside some short {} block, you should create one yourself. For example:

  1. my $content = '';
  2. open my $fh, "<", "foo" or die $!;
  3. {
  4. local $/;
  5. $content = ;
  6. }
  7. close $fh;

Here is an example of how your own code can go broken:

  1. for (1..5){
  2. nasty_break();
  3. print "$_ ";
  4. }
  5. sub nasty_break {
  6. $_ = 5;
  7. # do something with $_
  8. }

You probably expect this code to print:

  1. 1 2 3 4 5

but instead you get:

  1. 5 5 5 5 5

Why? Because nasty_break() modifies $_ without localizing it first. The fix is to add local():

  1. local $_ = 5;
DVK
I would not say just using $_ is not best practice. For short scripts its just to useful. If you may not use it why is it there?
Peer Stritzinger
@Peer - because short scripts have a nasty tendency to morph into more complicated creature shich DO use other modules and start being succeptibe to the same issues.
DVK
@Peer - "If you may not use it why is it there?". Nobody said you **may not** use it. Merely that oftentimes you **should not** because using it - like a gaggle of other features - is a LOT more liable to producing buggy and/or hard to read and hard to maintain code.
DVK
@DVK the short scripts morphing into more complicated programs is all to true. Maybe we can agree on staying alert of the balance point when the "script" grows. I use Perl both for getting something done quickly (heavy usage of $_) and for bigger processing (less and less $_ and if only in toplevel loops). Being able to use $_ for some toplevel while(<>) loops makes it easier to always have `use strict` and warnings on even for hacks IMHO
Peer Stritzinger
That's exactly what happened to my script. Started off as a five line script, quickly written to pluck out some info from a massive amount of data. Then it gradually morphed and finished up using an existing API to do some lookups into a database.
Rob Wells
+9  A: 

It is a bug in the API. If you use $_ in a function it is important to add a

local($_);

inside the function to avoid clobbering the caller's $_, or otherwise avoid using $_in a library function to be called by others.

If you can limit yoursel to Perl versions > 5.9.1 then you can also make $_ lexical which makes it easier to understand than localwith

my $_;

But this will break on earlier versions of Perl.

From man perlvar:

As $_ is a global variable, this may lead in some cases to unwanted side-effects. As of perl 5.9.1, you can now use a lexical version of $_ by declaring it in a file or in a block with "my". Moreover, declaring "our $_" restores the global $_ in the current scope.

Peer Stritzinger
Actually, I remember that there is a problem with `local $_` and tied variables. AFAIK, The safest thing to do is to use `local *_` in the API.
Sinan Ünür
+2  A: 
foreach (@rps_server_details) {
    @server_data = ();
    @server_data = split(/,/);
    @$esp_hosts = ();
    $filters{server_name} = $server_data[0];
    print "--->$_<--\n";
    {
        local *_;  # disconnects the remaining scope from the implicit 
                   # variables so you can clean up after the dirty api.
                   # NOTE: Submit a bug report against the offending module.
                   #       If you notice this across multiple api features
                   #       consider finding a different module for this task.
        $esp_hosts = $esp->get_hosts(fields => $fields, %filters) || die "$@";
    }
    print "--->$_<--\n";
Eric Strom