I quickly dashed off a Perl script that averages a few files containing just columns of numbers. It reads from an array of filehandles. Here is the script:

#!/usr/local/bin/perl

use strict;
use warnings;

use Symbol;

die "Usage: $0 file1 [file2 ...]\n" unless scalar(@ARGV);

my @fhs;

foreach(@ARGV){
    my $fh = gensym;
    open $fh, $_ or die "Unable to open \"$_\"";
    push(@fhs, $fh);
}

while (scalar(@fhs)){
    my ($result, $n, $a, $i) = (0,0,0,0);
    while ($i <= $#fhs){
        if ($a = <$fhs[$i]>){
            $result += $a;
            $n++;
            $i++;
        }
        else{
            $fhs[$i]->close;
            splice(@fhs,$i,1);
        }
    }
    if ($n){ print $result/$n . "\n"; }
}

This doesn't work. If I debug the script, after I initialize @fhs it looks like this:

  DB<1> x @fhs
0  GLOB(0x10443d80)
   -> *Symbol::GEN0
         FileHandle({*Symbol::GEN0}) => fileno(6)
1  GLOB(0x10443e60)
   -> *Symbol::GEN1
         FileHandle({*Symbol::GEN1}) => fileno(7)

So far, so good. But it fails at the part where I try to read from the file:

  DB<3> x $fhs[$i]
0  GLOB(0x10443d80)
   -> *Symbol::GEN0
         FileHandle({*Symbol::GEN0}) => fileno(6)
  DB<4> x $a
0  'GLOB(0x10443d80)'

$a is filled with this string rather than something read from the glob. What have I done wrong?

+7  A: 

You can only use a simple scalar variable inside <> to read from a filehandle. <$foo> works. <$foo[0]> does not read from a filehandle; it's actually equivalent to glob($foo[0]). You'll have to use the readline builtin, a temporary variable, or use IO::File and OO notation.

$text = readline($foo[0]);
# or
my $fh = $foo[0];  $text = <$fh>;
# or
$text = $foo[0]->getline;  # If using IO::File
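
To see the difference concretely, here is a minimal sketch. It assumes in-memory files (opened on scalar references, which needs Perl 5.8 or later) purely so the example is self-contained; the names @data, @fhs, $globbed, $first and $second are made up for illustration.

```perl
use strict;
use warnings;

# In-memory "files" keep the sketch self-contained (an assumption for
# illustration; the script in the question opens files named in @ARGV).
my @data = ("1\n2\n", "3\n4\n");
my @fhs;
for my $content (@data) {
    open my $fh, '<', \$content or die "Cannot open in-memory file: $!";
    push @fhs, $fh;
}

# Parsed as glob($fhs[0]): the handle is stringified and treated as a
# file name pattern, so nothing is read from the file.
my $globbed = <$fhs[0]>;

# readline accepts any expression that yields a filehandle.
my $first = readline($fhs[0]);

# Copying the handle into a simple scalar makes <> work again.
my $fh = $fhs[1];
my $second = <$fh>;

print "glob result: $globbed\n";
print "readline:    $first";
print "scalar copy: $second";
```

The first print shows a string of the form GLOB(0x...), which is the same symptom as in the debugger session in the question, while the other two show the first line of each file.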

If you weren't deleting elements from the array inside the loop, you could easily use a temporary variable by changing your while loop to a foreach loop.

Personally, I think using gensym to create filehandles is an ugly hack. You should either use IO::File, or pass an undefined variable to open (which requires at least Perl 5.6.0, but that's almost 10 years old now). (Just say my $fh; instead of my $fh = gensym;, and Perl will automatically create a new filehandle and store it in $fh when you call open.)
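
Putting these pieces together, a row-by-row average could look roughly like the sketch below. This is one possible arrangement, not necessarily how you would want to structure it: row_averages is a hypothetical helper name, the input is in-memory data rather than @ARGV so the example runs on its own, and instead of splicing inside the loop it rebuilds the list of still-open handles on each pass, which is what allows a foreach loop with a temporary variable.

```perl
use strict;
use warnings;

# Hypothetical helper: takes open filehandles, returns one average per row.
# Files that run out of lines simply stop contributing to later rows.
sub row_averages {
    my @fhs = @_;
    my @averages;
    while (@fhs) {
        my ($sum, $n) = (0, 0);
        my @still_open;
        for my $fh (@fhs) {
            my $line = readline($fh);
            if (defined $line) {
                $sum += $line;
                $n++;
                push @still_open, $fh;
            }
            else {
                close $fh;    # exhausted: drop it from the next pass
            }
        }
        push @averages, $sum / $n if $n;
        @fhs = @still_open;
    }
    return @averages;
}

# Example input (assumed for illustration): files of unequal length.
my @contents = ("1\n2\n3\n", "3\n4\n");
my @handles;
for my $content (@contents) {
    # Passing an undefined lexical to open creates the handle; no gensym.
    open my $fh, '<', \$content or die "Cannot open in-memory file: $!";
    push @handles, $fh;
}

my @result = row_averages(@handles);
print "$_\n" for @result;    # prints 2, 3 and 3, one per line
```

With real files, the only change would be opening each name from @ARGV with open my $fh, '<', $file in place of the in-memory scalars.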

cjm
Or the equivalent to `<HANDLE>`, which is spelled out as `readline HANDLE`.
Randal Schwartz
+1  A: 

I have trouble understanding your logic. Do you want to read several files, each containing just numbers (one number per line), and print their averages?

use strict;
use warnings;

my @fh;
foreach my $f (@ARGV) {
    open(my $fh, '<', $f) or die "Cannot open $f: $!";
    push @fh, $fh;
}

foreach my $fh (@fh) {
    my ($sum, $n) = (0, 0);
    while (<$fh>) {
        $sum += $_;
        $n++;
    }
    print "$sum / $n: ", $sum / $n, "\n" if $n;
}
Leonardo Herrera
The issue is that the files are not guaranteed to have the same number of rows.
pythonic metaphor
Why is that an issue?
Leonardo Herrera
+2  A: 

If you are willing to use a bit of magic, you can do this very simply:

use strict;
use warnings;

die "Usage: $0 file1 [file2 ...]\n" unless @ARGV;

my $sum   = 0;

# The current filehandle is aliased to ARGV
while (<>) {
    $sum += $_;
} 
continue {
    # We have finished a file:
    if( eof ARGV ) {
        # $. is the current line number.
        print $sum/$. , "\n" if $.;
        $sum = 0;

        # Explicitly closing ARGV resets $. so that line numbers
        # restart at 1 for the next file; without the close, $.
        # keeps counting across files.
        close ARGV;
    }
}

Unless you are using a very old perl, the messing about with gensym is not necessary. IIRC, perl 5.6 and newer are happy with normal lexical handles: open my $fh, '<', 'foo';
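
If the interaction between $. and close ARGV is unclear, this small sketch shows the per-file reset. It assumes two temporary files created with File::Temp just for the demonstration; @files and @counts are illustrative names.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create two small temporary files (assumed sample data) and put their
# names in @ARGV so that <> reads them one after the other.
my @files;
for my $content ("1\n2\n3\n", "10\n20\n") {
    my ($out, $name) = tempfile(UNLINK => 1);
    print {$out} $content;
    close $out or die "Cannot close $name: $!";
    push @files, $name;
}
@ARGV = @files;

my @counts;
while (<>) {
    if (eof ARGV) {
        push @counts, $.;    # number of lines in the file just finished
        close ARGV;          # explicit close resets $. for the next file
    }
}

print "lines per file: @counts\n";    # prints "lines per file: 3 2"
```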

daotoad
I like. But what's `$count` for?
ephemient
Also, `unless` will implicitly use `@ARGV` in scalar context... and I wouldn't count this as magic, this is how I'd write it too :)
ephemient
Good points. The `unless scalar` is left over from pasting the OP's code. The $count was left over from before I realized I could use the line number.
daotoad
+1  A: 

Seems like a for loop would work better for you, where you could actually use the standard read (iteration) operator.

for my $fh ( @fhs ) { 
    while ( defined( my $line = <$fh> )) {
        # since we're reading integers we test for *defined*
        # so we don't close the file on '0'
        #...
    }
    close $fh;
}

It doesn't look like you want to shortcut the loop at all. Therefore, while seems to be the wrong loop idiom.

Axeman