I'm not necessarily looking for a better way to do this; rather, an explanation of the output would be greatly appreciated. Recently, a senior programmer asked me why his code worked but only for one instance. What I found was that it worked on every other occurrence. Here is my example:

#!/usr/bin/perl -w
use strict;

my @list_env_vars = (
    '$SERVER',
    '$SERVER',
    '$SERVER',
    '$SERVER',
    '$SERVER',
    '$SERVER',
);

foreach (@list_env_vars){
    print "$_ = ".glob()."\n";
}

which output for perl 5.004:

$SERVER = UNIX_SERVER
$SERVER =
$SERVER = UNIX_SERVER
$SERVER =
$SERVER = UNIX_SERVER
$SERVER =

or output for perl 5.10:

$SITE = $SITE
Use of uninitialized value in concatenation (.) or string at glob_test.pl line 14.
$SITE =
$SITE = $SITE
Use of uninitialized value in concatenation (.) or string at glob_test.pl line 14.
$SITE =
$SITE = $SITE
Use of uninitialized value in concatenation (.) or string at glob_test.pl line 14.
$SITE =

I personally have never used glob() in this fashion, so I was ill-equipped to answer him. I read through the perldoc documentation for glob and followed the File::Glob link on that page, and still couldn't find anything that would explain the output. Any help would be much appreciated.

+12  A: 

glob in scalar context:

In scalar context, glob iterates through such filename expansions, returning undef when the list is exhausted.

In

foreach (@list_env_vars){
    print "$_ = ".glob()."\n";
}

The glob() there really is glob($_). Every iteration, $_ contains the string $SERVER. Given that the environment variable does not change, $SERVER is expanded to the same string. First time, this string is returned. Next, the list is exhausted, so undef is returned. Third time, we start over. ...
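The alternation can be reproduced without any environment variables. The following is only a minimal sketch, assuming Perl 5.6 or later (where glob is handled by File::Glob); the file name is made up and does not need to exist, because a pattern without wildcard characters expands to itself. The second loop calls glob in list context, which returns the whole expansion on every call, so the first element is available each time.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A pattern with no wildcard characters expands to exactly one string,
# so no files need to exist for this demonstration.
my $pattern = 'no_such_file.txt';    # hypothetical name

my @scalar_results;
for my $pass (1 .. 4) {
    # Scalar context: this one call site keeps its own iterator.
    my $next = glob($pattern);
    push @scalar_results, defined $next ? $next : 'undef';
}
print "scalar: @scalar_results\n";

my @list_results;
for my $pass (1 .. 4) {
    # List context: the full expansion is returned on every call.
    my ($first) = glob($pattern);
    push @list_results, defined $first ? $first : 'undef';
}
print "list:   @list_results\n";
```

The scalar loop should alternate between the string and undef, while the list-context loop should return the string on every pass.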

Clarification: It does not matter whether the argument to the second call is the same as the one for the first call. There is no way to reset glob's iterator, so a new argument is ignored until the list from the first call is exhausted.

You can see this more clearly using the following example (the current directory contains the files '1.a', '1.b', '2.a' and '2.b'):

#!/usr/bin/perl -w
use strict;

my @patterns = (
    '*.a',
    '*.b',
);

for my $v ( @patterns ) {
    print "$v = ", scalar glob($v), "\n";
}

Output:

C:\Temp> d
*.a = 1.a
*.b = 2.a
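The same effect can be shown without creating any files. This is a sketch under the assumption that glob is backed by File::Glob (Perl 5.6 or later), where brace patterns such as '{a1,a2}' expand without consulting the filesystem; the pattern contents are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Brace patterns expand to their alternatives whether or not files exist.
my @patterns = ('{a1,a2}', '{b1,b2}');    # hypothetical patterns

my @seen;
for my $p (@patterns) {
    # Scalar context at a single call site: the second pass continues
    # the list started by the first pass and ignores the new pattern.
    my $got = glob($p);
    push @seen, "$p => $got";
}
print "$_\n" for @seen;
```

The second line of output should pair '{b1,b2}' with 'a2', the next item of the first pattern's expansion.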

I would recommend accessing environment variables via the %ENV hash:

my @list_env_vars = ($ENV{SERVER}) x 6;

or

my @list_env_vars = @ENV{qw(HOME TEMP SERVER)};
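As a small sketch of the %ENV approach, the loop below looks up each variable by name and reports whether it is set. The values assigned at the top and the name NO_SUCH_VAR_12345 are hypothetical and exist only to make the example self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical values, assigned here only so the example runs anywhere.
$ENV{SERVER} = 'UNIX_SERVER';
$ENV{SITE}   = 'MAIN_SITE';

my %value_of;
for my $name (qw(SERVER SITE NO_SUCH_VAR_12345)) {
    # exists distinguishes an unset variable from one set to an empty string.
    $value_of{$name} = exists $ENV{$name} ? $ENV{$name} : '(not set)';
    print "\$$name = $value_of{$name}\n";
}
```

Unlike scalar glob, a hash lookup has no iterator state, so every pass of the loop yields a value.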
Sinan Ünür
I'm still having trouble understanding why it works every other occurrence, but thanks for your response.
Akers
@Akers you give `glob` an expression that expands to a single unique string. First, it returns that string to you; next, it returns `undef`. Just as the documentation states.
Sinan Ünür
'Does this person not know that he can access the value of the environment variable via $ENV{SERVER}? How about' Yeah, actually, I suggested using $ENV{}; I was just curious how it ever worked, since I've never used glob in that fashion before. I just used multiple copies of the same env var to prove my point. It doesn't matter if you use 6 different variables, you still get the same output: every other variable prints.
Akers
@Akers there is no way to tell glob to reset its iterator. So, subsequent calls with different values won't matter until it has exhausted the matches for the first call.
Sinan Ünür
That my friend is what I was looking for, thank you very much!
Akers
This is a really good explanation; I will probably send my senior programmer a link to this page. I wish I could upvote it twice, I like it so much, so I upvoted your comments as well.
Akers
Thank you. In the meantime, I found the following discussion which might be useful: http://groups.google.com/group/perl.perl5.porters/browse_thread/thread/5401ccdcbd6d4fa1/
Sinan Ünür
+4  A: 

Incidentally, the reason why in 5.004 you get a variable expansion, while on 5.10 you just get your literal string back, is that on old perl, glob() was carried out by the system shell, which performs variable expansion as a side effect. Since perl 5.6, glob() uses the File::Glob module, which does the work itself, without the shell, and doesn't expand environment variables (which glob was never intended to do). %ENV is the proper way to get at the environment.
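A quick way to check which behavior a given perl has is to compare the two lookups side by side. This is only a sketch, assuming Perl 5.6 or later; the value assigned to SERVER is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

$ENV{SERVER} = 'UNIX_SERVER';    # hypothetical value for the demonstration

# List context, so the scalar-context iterator is not involved.
my ($from_glob) = glob('$SERVER');
my $from_env    = $ENV{SERVER};

print "glob: $from_glob\n";    # the literal pattern when File::Glob is in use
print "ENV:  $from_env\n";     # the actual value of the variable
```

With File::Glob doing the work, the first line should show the literal string $SERVER, since no shell is involved to expand it.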

hobbs
Thanks, this was something I suspected about the older version of glob, but I wasn't sure where to look it up for verification.
Akers
A: 

Notes on the old behavior, wiki'd for your convenience (and so that I have the full range of markup and no 500-char limit):

The fact that glob and <*globbything*> changed in 5.6 is mentioned in passing in the docs (perl56delta, perlop, perldoc -f glob), but the only real source on exactly how it used to work is a pre-5.6 version of perlop. Here's the relevant bit from 5.005:

Example:

while (<*.c>) {
    chmod 0644, $_;
}

is equivalent to

open(FOO, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
while (<FOO>) {
    chop;
    chmod 0644, $_;
}

In fact, it's currently implemented that way. (Which means it will not work on filenames with spaces in them unless you have csh(1) on your machine.)

Heh, that's pretty evil stuff. Anyway, if you ever find yourself wanting to consult old perldocs like that, just go to search.cpan.org, pull up the perl distribution, use the pulldown list to select an old version, then click through to the doc that you need. perl itself isn't really subject to getting "tidied" off of CPAN; currently everything from 5.004 on up is available without hitting BackPan.

hobbs