If I have a function that might be passed a file name or various file handles or typeglobs, how can the function distinguish among these arguments -- including telling the difference, for example, between *DATA and *STDIN?

Updated code, based on the answers received so far. Thanks, everyone.

use strict;
use warnings;
use FileHandle;

sub file_thing_type {
    my ($f) = shift;
    my $type;
    my $r = ref $f;
    if ($r eq 'GLOB' or ref(\$f) eq 'GLOB'){
        # Regular and built-in file handles.
        my $fn = fileno $f;
        if (defined $fn){
            my %built_in = (
                'STDIN'  => fileno(*STDIN),
                'STDOUT' => fileno(*STDOUT),
                'STDERR' => fileno(*STDERR),
                'DATA'   => fileno(*DATA),
            );
            for my $k (keys %built_in){
                if (defined $built_in{$k} and $built_in{$k} == $fn){
                    $type = $k;
                    last;
                }
            }
            $type = 'regular file handle' unless defined $type;
        }
        else {
            $type = 'non-IO glob';
        }
    }
    elsif ($r){
        # A reference of some kind.
        $type = $r;
        # Might be an IO object. Has it been opened?
        {
            no warnings 'unopened';
            $type .= ' opened' if -f $f;
        }
    }
    else {
        # File name or just some other value?
        $type = -f $f ? 'file name' : 'other';
    }
    return $type;
}

open(my $h, '<', $0) or die $!;

printf "%12s => %s\n",
       $_->[0],
       file_thing_type($_->[1])
for (
    [ 'handle',     $h                  ], # regular file handle
    [ 'DATA',       *DATA               ], # DATA if source has DATA section; else non-IO glob
    [ 'STDIN',      *STDIN              ], # STDIN
    [ 'STDOUT',     *STDOUT             ], # STDOUT
    [ 'STDERR',     *STDERR             ], # STDERR
    [ 'FOO',        *FOO, *FOO          ], # non-IO glob
    [ 'FileHandle', FileHandle->new     ], # FileHandle
    [ 'FileHandle', FileHandle->new($0) ], # FileHandle opened
    [ 'file name',  $0                  ], # file name
    [ 'not file',   ''                  ], # other
    [ 'misc',       {bar=>1}            ], # HASH
);

__END__
+2  A: 

Update: The problem of distinguishing between a variable that might be assigned to the *DATA or *STDIN globs is a job for fileno:

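sub data_or_stdin {
  my $x = shift;
  my $fn = fileno($x);
  # fileno returns undef for an unopened handle; never compare undef with ==
  return "NEITHER" unless defined $fn;
  my ($data_fn, $stdin_fn) = (fileno(DATA), fileno(STDIN));
  if (defined $data_fn && $fn == $data_fn) {
    return "DATA";
  } elsif (defined $stdin_fn && $fn == $stdin_fn) {
    return "STDIN";
  } else {
    return "NEITHER";
  }
}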

print "DATA:  ", data_or_stdin(*DATA), "\n";
print "STDIN: ", data_or_stdin(*STDIN), "\n";
open(ZZZ, ">>", "zzz"); close ZZZ;
open(ZZZ, "<", "zzz"); print "ZZZ: ", data_or_stdin(*ZZZ), "\n"; close ZZZ;
open($fh, "<", "zzz"); print "\$fh=ZZZ: ", data_or_stdin($fh), "\n"; close $fh;
$fh = *DATA; print "\$fh=DATA: ", data_or_stdin($fh), "\n";
$fh = *STDIN; print "\$fh=STDIN: ", data_or_stdin($fh), "\n";

__END__
stuff;
$ perl data_or_stdin.pl
DATA:  DATA
STDIN: STDIN
ZZZ: NEITHER
$fh=ZZZ: NEITHER
$fh=DATA: DATA
$fh=STDIN: STDIN

If $f is a filehandle, then either ref $f or ref \$f will be "GLOB". If $f is a plain scalar, then ref \$f will be "SCALAR".

sub filehandle_or_scalar {
  my $x = shift;
  if (ref $x eq "GLOB" || ref \$x eq "GLOB") {
      return "filehandle";
  } elsif (ref \$x eq "SCALAR") {
      return "scalar";
  } else {
      return "not filehandle or scalar";
  }
}

print "STDIN: ", filehandle_or_scalar(*STDIN), "\n";
print "\$_: ", filehandle_or_scalar($_), "\n";
open($fh, ">", "zzz");
print "\$fh: ", filehandle_or_scalar($fh), "\n";
print "string: ", filehandle_or_scalar('file.txt'), "\n";
print "ref: ", filehandle_or_scalar(\$x), "\n";

###########################################

$ perl filehandle_or_scalar.pl
STDIN: filehandle
$_: scalar
$fh: filehandle
string: scalar
ref: not filehandle or scalar
mobrule
sub is_filehandle { should be sub filehandle_or_scalar {
Gavin Brock
Thanks, Gavin.
mobrule
+1  A: 

You could use pattern matching on the stringified filehandles for *STDIN, *DATA, etc. (a glob such as *STDIN stringifies to "*main::STDIN"):

if ($f =~ /\bSTDIN$/) {
    return "STDIN";
} elsif ($f =~ /\bDATA$/) {
    return "DATA";
}

Hacky, but may be enough...
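A slightly more defensive take on the same idea is sketched below. The helper name glob_name is made up for illustration, and the sketch assumes the handles of interest live in package main. It dereferences glob references before stringifying them (a bare glob reference stringifies as GLOB(0x...), which the patterns above would not match), and it rejects plain strings so that a file that happens to be named "STDIN" is not mistaken for the handle.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the bare symbol name of a named glob in
# package main (e.g. "STDIN"), or undef for anything else.
sub glob_name {
    my ($f) = @_;
    # Only globs and glob references qualify; plain strings are rejected.
    return undef unless ref($f) eq 'GLOB' or ref(\$f) eq 'GLOB';
    my $glob = ref($f) ? *{$f} : $f;    # dereference glob refs first
    # A named glob stringifies like "*main::STDIN"; a lexical handle
    # stringifies with a name containing '$', which \w+ does not match.
    return "$glob" =~ /^\*main::(\w+)$/ ? $1 : undef;
}

for my $case ([glob => *STDIN], [globref => \*STDERR], [string => 'STDIN']) {
    my $name = glob_name($case->[1]);
    printf "%-8s => %s\n", $case->[0], defined $name ? $name : '(none)';
}
```

This still depends on the symbol name rather than on what the handle actually points to, so comparing fileno values remains the more robust test.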

Gavin Brock
+1  A: 

mobrule's approach looks promising:

perl -E 'open $fh, "<", "/dev/null"; say ref $fh;'

will output GLOB. However, so will

perl -E 'say ref \*FOO;'

A "real" filehandle will also have a file descriptor associated with it which you can determine using fileno:

perl -MData::Dumper -E 'open $fh, "<", "/dev/null"; say Data::Dumper::Dumper([fileno $fh, fileno \*STDIN, fileno \*FOO])'

will output something like:

$VAR1 = [
          3,
          0,
          undef
        ];

You can use this to tell a GLOB that is being used for file I/O from one that isn't. On UNIX systems, the standard input stream is associated with the file descriptor 0 by convention.
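As a rough sketch, that check can be wrapped in a small predicate. The name is_open_io_glob is hypothetical, and the helper only reports whether a glob (or glob reference) currently has an open handle behind it; it says nothing about which handle it is.

```perl
use strict;
use warnings;
no warnings 'once';    # *FOO is deliberately mentioned only once below

# Hypothetical helper: 1 for a glob or glob ref with an open handle, else 0.
sub is_open_io_glob {
    my ($f) = @_;
    return 0 unless ref($f) eq 'GLOB' or ref(\$f) eq 'GLOB';
    # fileno returns undef when no open handle is attached to the glob.
    return defined fileno($f) ? 1 : 0;
}

printf "%-8s %d\n", 'STDOUT', is_open_io_glob(\*STDOUT);
printf "%-8s %d\n", 'STDERR', is_open_io_glob(*STDERR);
printf "%-8s %d\n", 'FOO',    is_open_io_glob(\*FOO);
printf "%-8s %d\n", 'string', is_open_io_glob('file.txt');
```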

Another thing that comes to mind is a class that is tied to a filehandle. These need to implement a particular interface which you can test for using can. See the tie VARIABLE,CLASSNAME,LIST entry in perlfunc for details about this interface.
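A minimal sketch of that last idea follows. The class UpperCaseLines and the helper tied_handle_class are invented for illustration, and only READLINE is checked here; a real test would probe whichever methods the caller actually needs (PRINT, READ, CLOSE and so on).

```perl
use strict;
use warnings;

package UpperCaseLines;    # hypothetical tied-handle class

sub TIEHANDLE {
    my ($class, @lines) = @_;
    return bless { lines => [@lines] }, $class;
}

# Called for <FH>; returns the next stored line upper-cased, or undef at EOF.
sub READLINE {
    my $self = shift;
    my $line = shift @{ $self->{lines} };
    return defined $line ? uc $line : undef;
}

package main;

# Hypothetical helper: class name of the object tied to a glob reference,
# provided that object can serve reads; otherwise an empty return.
sub tied_handle_class {
    my ($fh) = @_;
    my $obj = tied *{$fh};
    return unless $obj && $obj->can('READLINE');
    return ref $obj;
}

tie *FAKE, 'UpperCaseLines', "hello\n";

for my $case ([FAKE => \*FAKE], [STDIN => \*STDIN]) {
    my $class = tied_handle_class($case->[1]);
    printf "%-6s %s\n", $case->[0], defined $class ? "tied to $class" : 'not tied';
}
```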

hillu