I've been programming in Perl for a while, but I never have understood a couple of subtleties about Perl:

The use and the setting/unsetting of the $_ variable confuse me. For instance, why does

# ...
shift @queue;
($item1, @rest) = split /,/;

work, but (at least for me)

# ...
shift @queue;
/some_pattern.*/ or die();

does not seem to work?

Also, I don't understand the difference between iterating through a file using foreach versus while. For instance, I seem to be getting different results for

while(<SOME_FILE>){  
    # Do something involving $_        
}

and

foreach (<SOME_FILE>){
    # Do something involving $_
}

Can anyone explain these subtle differences?

A: 

while only checks whether its condition is true; for (foreach) also aliases each element of its list to $_. The one exception is reading from a filehandle: a bare <FILEHANDLE> used as the entire condition of a while loop is assigned to $_ automatically.

To get behaviour similar to:

foreach(qw'a b c'){
    # Do something involving $_
}

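with a while loop, you have to set $_ explicitly:

my @list = qw'a b c';
# shift from a named array; an anonymous array in the condition would be rebuilt on every pass
while( defined( $_ = shift @list ) ){
    # Do something involving $_
}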


It is better to explicitly set your variables

for my $line(<SOME_FILE>){
}

or better yet

while( my $line = <SOME_FILE> ){
}

which will only read in the file one line at a time.


Also, shift doesn't set $_ unless you specifically ask it to:

$_ = shift @_;

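And split works on $_ by default when no string is given. In older versions of Perl, split used in scalar or void context also populated @_; that behaviour was deprecated and later removed.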
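A minimal sketch of the two points above, using made-up sample data (the contents of @queue and the initial value of $_ are assumptions chosen only for illustration). It shows that a bare shift leaves $_ alone, while split with no string argument reads whatever $_ currently holds:

```perl
use strict;
use warnings;

my @queue = ('a,b,c', 'd,e,f');    # hypothetical sample data

$_ = 'x,y,z';                      # pretend $_ was set earlier in the program
shift @queue;                      # removes 'a,b,c' and discards it; $_ is untouched
my ($first, @rest) = split /,/;    # no string given, so split reads $_
print "$first\n";                  # prints "x", not "a"

$_ = shift @queue;                 # explicit assignment puts 'd,e,f' into $_
my ($item1, @others) = split /,/;
print "$item1\n";                  # prints "d"
```

The first split only appears to work when $_ happens to contain something sensible from earlier code.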

Brad Gilbert
A: 

It is to avoid this sort of confusion that it's considered better form to avoid using the implicit $_ constructions.

my $element = shift @queue;
($item,@rest) = split /,/ , $element;

or

($item,@rest) = split /,/, shift @queue;

likewise

while(my $foo = <SOMEFILE>){
    # do something with $foo
}

or

foreach my $thing(<FILEHANDLE>){
    # do something with $thing
}
cms
Considered better form by who? Just because someone might find it confusing is no reason to make such a blanket generalization against a language construct which might prove helpful in other situations.
clintp
It's better form in almost any context where clarity is valued over brevity, and readability is considered more useful than minor optimisations.
cms
Alex Reynolds
+2  A: 

foreach evaluates the entire list up front. while evaluates the condition on each pass to see whether it is true. while should be considered for incremental operations, foreach only for list sources.

For example:

my $t= time() + 10 ;
while ( $t > time() ) {
    # do something
}
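The same distinction can be sketched without a timer. In this illustration (the queue contents and the helper sub items are hypothetical), the while condition is re-checked on every pass and so sees work added inside the loop, whereas the list given to foreach is produced once before the first iteration:

```perl
use strict;
use warnings;

# while: the condition is evaluated again before every pass
my @queue = (1);
my @seen;
while (@queue) {
    my $n = shift @queue;
    push @seen, $n;
    push @queue, $n + 1 if $n < 3;   # items added mid-loop are still processed
}
print "@seen\n";                     # prints "1 2 3"

# foreach: the list expression is evaluated once, up front
my $calls = 0;
sub items { $calls++; return (1, 2, 3) }   # hypothetical list source

my $sum = 0;
foreach my $n (items()) {
    $sum += $n;
}
print "items() was called $calls time(s), sum = $sum\n";
```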
woolstar
+13  A: 
shift @queue;
($item1, @rest) = split /,/;

If I understand you correctly, you seem to think that this shifts off an element from @queue to $_. That is not true.

The value that is shifted off of @queue simply disappears. The following split operates on whatever is contained in $_ (which is independent of the shift invocation).

while(<SOME_FILE>){  
    # Do something involving $_        
}

Reading from a filehandle in a while statement is special: It is equivalent to

while ( defined( $_ = readline *SOME_FILE ) ) {

This way, you can process even colossal files line-by-line.

On the other hand,

for(<SOME_FILE>){  
    # Do something involving $_        
}

will first load the entire file as a list of lines into memory. Try a 1GB file and see the difference.
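For a quick check that does not need a 1GB file, here is a small sketch using an in-memory filehandle (the three-line $data string is an assumption standing in for SOME_FILE). It asks eof() inside the first pass of each loop to see how much of the input has already been consumed:

```perl
use strict;
use warnings;

my $data = "line 1\nline 2\nline 3\n";   # stand-in for a real file

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $eof_in_while;
while (<$fh>) {
    $eof_in_while = eof($fh) ? 1 : 0;    # 0: lines 2 and 3 are still unread
    last;
}
close $fh;

open $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $eof_in_for;
for (<$fh>) {
    $eof_in_for = eof($fh) ? 1 : 0;      # 1: every line was read before the body ran
    last;
}
close $fh;

print "while: $eof_in_while, for: $eof_in_for\n";   # prints "while: 0, for: 1"
```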

Sinan Ünür
The answer would be improved with some more emphasis on an additional point. The `while(<FILE>)` form does more than check whether `$_` is defined. It also makes the assignment to `$_` in the first place. Other `while` loops do not assign to `$_`: for example, `while(@foo)`.
FM
I'd add that in order to know when `$_` is used implicitly you must RTFM.
Michael Carman
I'd add that naming $_ explicitly will help the next person who reads your code immensely.
Kevin
+1  A: 

StackOverflow: What’s the difference between iterating over a file with foreach or while in Perl?

Alex Reynolds
This would be better as a comment
Brad Gilbert
+3  A: 

Regarding the 2nd question:

while (<FILE>) {
}

and

foreach (<FILE>) {
}

Have the same functional behavior, including setting $_. The difference is that while() evaluates <FILE> in a scalar context, while foreach() evaluates <FILE> in a list context. Consider the difference between:

$x = <FILE>;

and

@x = <FILE>;

In the first case, $x gets the first line of FILE, and in the second case @x gets the entire file. Each entry in @x is a different line in FILE.
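A small sketch of the two contexts, assuming an in-memory filehandle with three sample lines in place of FILE. Note that the list-context read returns whatever lines remain from the current position, so after the scalar read it returns the other two:

```perl
use strict;
use warnings;

my $data = "one\ntwo\nthree\n";          # stand-in for the contents of FILE

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $x = <$fh>;                           # scalar context: one line, "one\n"
my @x = <$fh>;                           # list context: all remaining lines
close $fh;

print "scalar read: $x";
print "list read got ", scalar(@x), " lines\n";   # 2 lines: "two\n" and "three\n"
```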

So, if FILE is very big, you'll waste memory slurping it all at once using foreach (<FILE>) compared to while (<FILE>). This may or may not be an issue for you.

The place where it really matters is if FILE is a pipe descriptor, as in:

open FILE, "some_shell_program|" or die "Cannot start some_shell_program: $!";

Now foreach(<FILE>) must wait for some_shell_program to complete before it can enter the loop, while while(<FILE>) can read the output of some_shell_program one line at a time and execute in parallel to some_shell_program.

That said, the behavior with regard to $_ remains unchanged between the two forms.

Nathan Fellman
A: 

Please read perldoc perlvar so that you will have an idea of the different variables in Perl.

perldoc perlvar.

Alan Haggai Alavi
+4  A: 

Another, albeit subtle, difference between:

while (<FILE>) {
}

and:

foreach (<FILE>) {
}

is that while() assigns to the global $_, so the change is still visible after the loop ends, whereas foreach() localizes $_ and restores its previous value afterwards. For example, the following will die:

$_ = "test";
while (<FILE1>) {
    print "$_";
}
die if $_ ne "test";

whereas, this will not:

$_ = "test";
foreach (<FILE1>) {
    print "$_";
}
die if $_ ne "test";
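The two snippets above can be combined into one runnable sketch. It assumes an in-memory filehandle in place of FILE1 and records the value of $_ after each loop instead of calling die:

```perl
use strict;
use warnings;

my $data = "a\nb\n";                     # stand-in for the contents of FILE1

$_ = "test";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
foreach (<$fh>) { }                      # $_ is aliased per element, then restored
close $fh;
my $after_foreach = $_;                  # still "test"

open $fh, '<', \$data or die "Cannot open in-memory file: $!";
while (<$fh>) { }                        # assigns to the global $_ on every read
close $fh;
my $after_while = $_;                    # undef: the final read at end-of-file returned undef

print "after foreach: $after_foreach\n";
print "after while: ", (defined $after_while ? $after_while : 'undef'), "\n";
```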

This becomes more important with more complex scripts. Imagine something like:

sub func1 {
    while (<$fh2>) {  # clobbers $_ set from <$fh1> below
        # ...
    }
}

while (<$fh1>) {
    func1();
    # ...
}

Personally, I stay away from using $_ for this reason, in addition to it being less readable, etc.
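If you do want to keep using $_ in a helper, one common way around the clobbering is to localize it inside the sub. The following is only a sketch under assumptions: the sub name read_inner and the in-memory data are hypothetical stand-ins for func1, $fh1 and $fh2. Reading into a lexical variable (while (my $line = <$fh2>)) would work just as well:

```perl
use strict;
use warnings;

my $outer = "outer 1\nouter 2\n";        # stand-in for the data behind $fh1
my $inner = "inner 1\n";                 # stand-in for the data behind $fh2

sub read_inner {
    open my $fh2, '<', \$inner or die "Cannot open in-memory file: $!";
    local $_;                            # changes to $_ are undone when the sub returns
    while (<$fh2>) {
        # ... work with the inner lines ...
    }
    close $fh2;
    return;
}

open my $fh1, '<', \$outer or die "Cannot open in-memory file: $!";
my @lines;
while (<$fh1>) {
    read_inner();
    push @lines, $_;                     # still the current outer line
}
close $fh1;

print @lines;
```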

Michael Krebs
I wasn't aware of *this* difference between `foreach` and `while` with regard to `$_`!
Nathan Fellman