While reading Programming Perl, 2nd Edition, page 51, something confused me:

sub newopen {
    my $path = shift;
    local *FH;    #not my!
    open (FH, $path) || return undef;
    return *FH;
}

$fh = newopen('/etc/passwd');

May I know why we are not recommended to use my? So far, I cannot see anything that would go wrong if we used my().

Thanks!

+11  A: 

Why are you reading an out-of-date book? The 3rd Edition has been out for a long time! Which version of Perl are you using? The 2nd Edition describes Perl 5.004 (5.4.x) or thereabouts.

These days, you should not use the typeglob notation for file handles; use the 'lexical file handles' (see open, I think) or the FileHandle module, or one of its relatives instead.


Thanks to Michael Schwern and Ysth for comments incorporated here.

Jonathan Leffler
In the real world, you do need to deal with a lot of legacy code.
Yan Cheng CHEOK
The 2nd edition is 5.004ish, I think (5.4.x in modern parlance).
ysth
@Yan In the real world, few people are using anything older than 5.6. @Jonathan FileHandle is overkill for just opening and reading a file. Lexical filehandles++
Schwern
A: 

I believe it's because my allocates a new copy of the variable on the stack, and it's lost when you exit the block. local saves the existing *FH elsewhere and overrides the existing *FH. It restores the old one when you exit the stack. With my the *FH typeglob goes out of scope when you exit the block. With local it keeps on existing, so you can keep on using it after you return it.

I'm not 100% sure of this, but maybe it can point you in the right direction.

Nathan Fellman
You're partially right, partially wrong, and very fuzzy. :D The full explanation is far too long to fit in a comment, so I added a new answer (and even that's an abridged version).
Michael Carman
A: 

See Localized Filehandles here, I guess that explains it.

zoul
+18  A: 

In your sample code, the call to the built in subroutine open is using a bare word as the file handle, which is the equivalent of a global variable. As Nathan Fellman's answer explained, using local will localize this bare word to the current code block, in the event that another global variable with the same name is defined elsewhere in the script or module. This will prevent the previously defined global variable from being wiped out by the new declaration.

This was a very common practice in the old Perl days, but as of Perl 5.6 it is far better to use a scalar (with the my declaration that you hinted at in your question) to hold your file handle and, additionally, to use the three-argument form of open.

use Carp;
use English qw(-no_match_vars);    # provides $OS_ERROR as a readable alias for $!
open my $error_log, '>>', 'error.log' or croak "Can't open error.log: $OS_ERROR";

As an aside, please note that for reading and writing the standard input/output streams, it is still better to use the two-argument open:

use Carp;
use English qw(-no_match_vars);    # provides $OS_ERROR as a readable alias for $!
open my $stdin, '<-' or croak "Can't open stdin: $OS_ERROR";

Alternatively, you can use the IO::File module, which returns the file handle as an object blessed into that class:

use Carp;
use English qw(-no_match_vars);    # provides $OS_ERROR as a readable alias for $!
use IO::File;
my $error_log = IO::File->new('error.log', '>>') or croak "Can't open error.log: $OS_ERROR";

The majority of credit here goes to Damian Conway, author of the excellent book Perl Best Practices. If you are serious about Perl development, you owe it to yourself to purchase this book.

cowgod
+1. Basically, pre-5.6, filehandles weren't quite first-class values in Perl, necessitating all kinds of disgusting hacks to pass them in and out of functions. Also *please* use the new 3-arg open() syntax -- otherwise unusual filenames will cause all sorts of mayhem.
j_random_hacker
Hopefully it's obvious that my pleading is directed toward the asker :)
j_random_hacker
+15  A: 

The trite answer is that you have to use local because my *FH is a syntax error.

The "right" (but not very enlightening) answer is that you're doing it wrong. You should be using lexical filehandles and the three-argument form of open instead.

sub newopen {
    my $path = shift;
    my $fh;
    open($fh, '<', $path) or do {
        warn "Can't read file '$path' [$!]\n";
        return;
    };
    return $fh;
}

To really answer why requires an explanation of the difference between lexical and global variables and between a variable's scope and its duration.

A variable's scope is the portion of the program where its name is valid. Scope is a static property. A variable's duration, on the other hand, is a dynamic property. Duration is the time during a program's execution that the variable exists and holds a value.

my declares a lexical variable. Lexical variables have a scope from the point of declaration to the end of the enclosing block (or file). You can have other variables with the same name in different scopes without conflict. (You can also re-use a name in overlapping scopes, but don't do that.) The duration of lexical variables is managed through reference counting. So long as there is at least one reference to a variable, the value exists, even if the name isn't valid within a particular scope! my also has a runtime effect -- it allocates a new variable with the given name.
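
As a minimal sketch of that scope-versus-duration distinction (make_counter and its variable names are made up for illustration, not taken from the book): the name $count is only valid inside the sub, yet the variable lives on because the returned closure still holds a reference to it.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only.
sub make_counter {
    my $count = 0;                    # the name $count is valid only in this block
    return sub { return ++$count };   # the closure keeps the variable alive
}

my $next = make_counter();
$next->();                            # $count is now 1
my $second = $next->();               # $count is now 2

# Each call to make_counter runs "my" again and so allocates a fresh $count.
my $other          = make_counter();
my $first_of_other = $other->();      # 1, independent of the first counter

print "$second $first_of_other\n";    # prints "2 1"
```

The second counter starting again at 1 shows the runtime effect of my: every execution of the declaration creates a new variable.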

local is a bit different. It operates on global variables. Global variables have a global scope (the name is valid everywhere) and a duration of the entire life of the program. What local does is make a temporary change to the value of a global variable. This is sometimes referred to as "dynamic scoping." The change starts at the point of the local declaration and persists until the end of the enclosing block, after which the old value is restored. It's important to note that the new value is not restricted to the block -- it is visible everywhere (including called subroutines). Reference counting rules still apply, so you can take and keep a reference to a localized value after the change has expired.
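
A small sketch of that behaviour, using a made-up package variable $level and two hypothetical subs: the localized value is seen by a subroutine called from inside the block, and the old value comes back once the block ends.

```perl
use strict;
use warnings;

our $level = 'outer';                 # a global (package) variable

# Hypothetical subs, for illustration only.
sub show_level { return $level }      # reads whatever value is current

sub with_local {
    local $level = 'inner';           # temporary change to the global
    return show_level();              # the called sub sees 'inner'
}

my $during = with_local();            # 'inner'
my $after  = show_level();            # 'outer' again: the old value was restored

print "$during $after\n";             # prints "inner outer"
```

Had with_local used my $level instead, show_level would still have returned 'outer', because a lexical is not visible in called subroutines.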

Back to the example: *FH is a global variable. More accurately it's a "typeglob" -- a container for a set of global variables. A typeglob contains a slot for each of the basic variable types (scalar, array, hash) plus a few other things. Historically, Perl used typeglobs for storing filehandles and local-izing them helped ensure that they didn't clobber each other. Lexical variables don't have typeglobs which is why saying my *FH is a syntax error.
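
To make the "container" idea concrete, here is a rough sketch with a made-up name, thing. The three package variables below are separate variables that share one typeglob, *thing, and the *glob{SLOT} syntax pulls a reference out of each slot.

```perl
use strict;
use warnings;

# Three different global variables that all live in the typeglob *thing.
# The name "thing" is hypothetical, chosen only for this example.
our $thing = 'scalar value';
our @thing = (1, 2, 3);
our %thing = (key => 'value');

# Fetch a reference to each slot of the typeglob.
my $scalar_ref = *thing{SCALAR};
my $array_ref  = *thing{ARRAY};
my $hash_ref   = *thing{HASH};

print "${$scalar_ref}\n";             # prints "scalar value"
print scalar(@{$array_ref}), "\n";    # prints "3"
print "$hash_ref->{key}\n";           # prints "value"
```

A bareword filehandle such as FH occupies another slot (IO) of the same kind of container, which is why local *FH protects it.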

In modern versions of Perl lexical variables can and should be used as filehandles instead. And that brings us back to the "right" answer.

Michael Carman
Thanks. I like this one: "*FH is a global variable. More accurately it's a "typeglob" -- a container for a set of global variables." It makes me understand why local() is needed.
Yan Cheng CHEOK
Well, you have to use local because a typeglob like *FH is a symbol table thing, and lexical variables don't deal with symbol tables. I think it confuses things to talk about scope to explain it.
brian d foy