When I pipe some standard input to a Perl script, e.g.

$ awk '{print $1"\t"$2}' foo.txt | myScript.pl

My script contains a bug. It reads the first line of standard input, extracts something from that line, and then parses only lines 2 through n of standard input on subsequent reads:

open (FH, "< $input") or die $!;
my $firstLine = <FH>; # reads first line

...

while (my $line = <FH>) {
    # reads lines 2 through n
}
close (FH);

So I added a seek statement to this script to try to reset the file handle to the beginning of the file:

use Fcntl qw(:seek);

...

open (FH, "< $input") or die $!;
my $firstLine = <FH>; # reads first line
seek (FH, 0, SEEK_SET) or die "error: could not reset file handle\n"; # should reset file handle

...

while (my $line = <FH>) {
    # reads lines 1 through n (in theory)
}
close (FH);

Adding this seek statement works for file input, but not for standard input. The script crashes with the error:

 error: could not reset file handle

How can I correctly read the first line of standard input at the start of the script, reset the file handle and then read all lines 1 through n of the standard input?

I guess I can write a special case for storing and processing the first line before the while loop, but I'm hoping there is a cleaner solution to this problem that allows me to handle both standard input and file input.

Thanks for your advice.

+2  A: 

EDIT: The drawback of my first approach is that it must load all lines from FH into memory before processing the list. This approach, using the open-to-scalar-reference feature (available since 5.8.0), won't have that problem:

my $firstline = <FH>;
open(my $f1, '<', \$firstline);

...

while (my $line = <$f1> || <FH>) {
    # process line 1 through n
}
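
One caveat with the `||` form: the loop condition tests truth rather than definedness, so a final line consisting of just `0` with no trailing newline would end the loop early. Below is a minimal sketch of a variant that tests `defined` instead. The helper name `lines_with_first` and the in-memory handle standing in for STDIN are assumptions made for illustration, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an iterator that yields $first once,
# then the remaining lines of $fh, one line per call.
sub lines_with_first {
    my ($first, $fh) = @_;
    my $pending = 1;
    return sub {
        if ($pending) {
            $pending = 0;
            return $first;
        }
        return scalar readline($fh);    # undef at end of input
    };
}

# Demo: an in-memory handle stands in for STDIN (assumption).
# The last line is a bare "0" with no newline, which is false in Perl.
my $data = "header\nrow1\n0";
open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";

my $first = <$fh>;                      # peek at line 1
my $next  = lines_with_first($first, $fh);

my @seen;
while (defined(my $line = $next->())) { # lines 1 through n
    chomp $line;
    push @seen, $line;
}
print join(",", @seen), "\n";           # prints header,row1,0
```

Only one line is held in memory at a time, so this should behave like the `while` loop for very large inputs.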


How about:

for my $line ($firstline, <FH>) {
    # process lines 1 through n
}
mobrule
I like using `while` because I deal with very large files. Using `for` requires, as you've noted, storing all the input data in memory before processing it.
Alex Reynolds
+1  A: 

I don't think you can reset (or seek(0)) STDIN. It isn't a normal file handle. Since it isn't actually a file, rewinding it would require STDIN to do resettable buffering.

I think you'll need to handle the reading and re-using of line 1 specially.

mobiGeek
The issue isn't that it's STDIN, it's that it isn't a file. seek will fail on other filehandles that are pipes or other non-seekable types; seek will succeed even on STDIN if it is a file, for example: `seq 9 >foo; perl -we'seek STDIN, 12, 0; print scalar <>' <foo` prints 7, not 1.
ysth
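
To illustrate the point about seekable versus non-seekable handles, here is a rough sketch that attempts the rewind and reports whether the caller has to replay the first line by hand. The helper names `try_rewind` and `peek_first_line` are hypothetical, and the anonymous temp file and pipe are stand-ins for a redirected file and a piped STDIN.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);

# Hypothetical helper: try to rewind $fh to its start.
# Returns 1 on success, 0 if the handle is not seekable (e.g. a pipe).
sub try_rewind {
    my ($fh) = @_;
    return seek($fh, 0, SEEK_SET) ? 1 : 0;
}

# Hypothetical helper: read the first line, then try to rewind.
# Returns the first line and a flag that is 1 when the rewind failed,
# meaning the caller must process the first line separately.
sub peek_first_line {
    my ($fh) = @_;
    my $first = <$fh>;
    my $must_replay = try_rewind($fh) ? 0 : 1;
    return ($first, $must_replay);
}

# Case 1: a regular (anonymous temporary) file, which is seekable.
open(my $file, '+>', undef) or die "cannot create temp file: $!";
print {$file} "a\nb\n";
seek($file, 0, SEEK_SET) or die "cannot rewind temp file: $!";
my ($file_first, $file_replay) = peek_first_line($file);

# Case 2: a pipe, which is not seekable.
pipe(my $reader, my $writer) or die "cannot create pipe: $!";
print {$writer} "a\nb\n";
close($writer) or die "cannot close pipe writer: $!";
my ($pipe_first, $pipe_replay) = peek_first_line($reader);

print "file: must_replay=$file_replay\n";
print "pipe: must_replay=$pipe_replay\n";
```

When the flag is set, the first line has to be handled alongside the remaining lines rather than re-read from the handle.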
+7  A: 

There's an easy way:

open (FH, "< $input") or die $!;
my $line = <FH>; # reads first line

# do stuff with first line

do {
    # stuff
} while (defined($line = <FH>));
Anon.
Of course! I completely forgot about `do-while`. Very elegant, thank you.
Alex Reynolds
Just remember that next/last don't see the do{} as a block; see http://perldoc.perl.org/perlsyn.html#Statement-Modifiers for the standard workarounds.
ysth
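
If the loop body needs `next` or `last`, one alternative to the do-while workarounds is a C-style `for` loop seeded with the line that was already read. This is only a sketch: the sample data, the `STOP` marker and the in-memory handle standing in for FH are made up for illustration.

```perl
use strict;
use warnings;

# In-memory handle standing in for the real input (assumption).
my $data = "# header\nkeep1\n# skip\nkeep2\nSTOP\nnever\n";
open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";

my $first = <$fh>;      # first line, already consumed for inspection

my @kept;
# Start from the saved first line, then keep reading from the handle.
# In a C-style for loop, `next` still runs the third expression, so the
# following line is fetched; `last` leaves the loop as usual.
for (my $line = $first; defined $line; $line = <$fh>) {
    chomp $line;
    next if $line =~ /^#/;      # skip comment lines
    last if $line eq 'STOP';    # stop early at the marker
    push @kept, $line;
}
print "@kept\n";                # prints keep1 keep2
```

Like the do-while version, this reads one line at a time, so it does not load the whole input into memory.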