Some of the environment variables we use in Unix are shown below (just an example):

VAR1=variable1
VAR2=variable2
VAR3=variable3
# and so on

Now, I have a Perl script (let's call it test.pl) which reads a tab-delimited text file (let's call it test.txt) and pushes its contents column-wise into separate arrays. The first column of test.txt contains, for example, the following information (the strings in the first column are delimited by /, but I do not know how many / a string will contain or at what position an environment variable will appear):

$VAR1/$VAR2/$VAR3
$VAR3/some_string/SOME_OTHER_STRING/and_so_on/$VAR2
$VAR2/$VAR1/some_string/some_string_2/some_string_3/some_string_n/$VAR2

The extract of the script is as below:

use strict;
my $input0=shift or die "must provide test.txt as the argument 0\n";
open(IN0,"<",$input0) || die "Cannot open $input0 for reading: $!";
my @first_column;
while(<IN0>)
{
   chomp;
   my @cols=split(/\t/);
   my $first_col=`eval $cols[0]`; #### but this does not work
   # here goes the push stmt to populate the array
   ### more code here
}
close(IN0);

Question: How can I access environment variables in such a situation so that the array is populated as below:

$first_column[0]=variable1/variable2/variable3
$first_column[1]=variable3/some_string/SOME_OTHER_STRING/and_so_on/variable2
$first_column[2]=variable2/variable1/some_string/some_string_2/some_string_3/some_string_n/variable2
A: 

Perl keeps its environment variables in the hash %ENV. In your case you can change your code like so:

my $first_col = $ENV{$cols[0]};
Hasturkun
I think this would not work because the exact contents of $cols[0] are not available in %ENV. Correct me if I am wrong.
sachin
@sachin: Yes, there might be a more functional and more elegant solution using a regex, but please tell us what the 'exact contents' would be. Why wouldn't they be available in `%ENV`?
MvanGeest
@MvanGeest `%ENV` would have `$ENV{VAR1}` etc. Not `$ENV{$VAR1/$VAR2/$VAR3}`.
Sinan Ünür
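
To illustrate the point made in the comments above, here is a minimal sketch of how %ENV lookups behave: keys are bare variable names without the leading $, so a whole path string is not a key. The helper env_lookup is hypothetical and only exists for this illustration, and VAR1 is set inside the script just to keep the sketch self-contained.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Set here only so the sketch is self-contained.
$ENV{VAR1} = 'variable1';

# env_lookup is a hypothetical helper, not part of the original script.
# It returns the value for a key in %ENV, or undef if there is no such key.
sub env_lookup {
    my ($key) = @_;
    return exists $ENV{$key} ? $ENV{$key} : undef;
}

for my $key ( 'VAR1', '$VAR1/$VAR2/$VAR3' ) {
    my $value = env_lookup($key);
    print defined $value
        ? "$key => $value\n"
        : "$key => (no such key in %ENV)\n";
}
```

The first lookup succeeds because VAR1 is a key; the second fails because the full path string is not, which is why each $NAME has to be extracted and looked up separately.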
+4  A: 

I think you are looking for a way to process configuration files. I like Config::Std for that purpose although there are many others on CPAN.


Here is a way of processing just the contents of $cols[0] to show in an explicit way what you need to do with it:

#!/usr/bin/perl

use strict; use warnings;

# You should not type this. I am assuming the
# environment variables are defined in the environment.
# They are here for testing.
@ENV{qw(VAR1 VAR2 VAR3)} = qw(variable1 variable2 variable3);

while ( my $line = <DATA> ) {
    last unless $line =~ /\S/;
    chomp $line;
    my @components = split qr{/}, $line;
    for my $c ( @components ) {
        if ( my ($var) = $c =~ m{^\$(\w+)\z} ) {
            if ( exists $ENV{$var} ) {
                $c = $ENV{$var};
            }
        }
    }
    print join('/', @components), "\n";
}

__DATA__
$VAR1/$VAR2/$VAR3
$VAR3/some_string/SOME_OTHER_STRING/and_so_on/$VAR2
$VAR2/$VAR1/some_string/some_string_2/some_string_3/some_string_n/$VAR2

Instead of the split/join, you can use s/// to replace patterns that look like variables with the corresponding values in %ENV. For illustration, I put a second column in the __DATA__ section, which is supposed to stand for a description of the path, and turned each line into a hashref. Note that I factored out the actual substitution into eval_path so you can try alternatives without messing with the main loop:

#!/usr/bin/perl

use strict; use warnings;

# You should not type this. I am assuming the
# environment variables are defined in the environment.
# They are here for testing.
@ENV{qw(VAR1 VAR2 VAR3)} = qw(variable1 variable2 variable3);

my @config;
while ( my $config = <DATA> ) {
    last unless $config =~ /\S/;
    chomp $config;
    my @cols = split /\t/, $config;
    $cols[0] = eval_path( $cols[0] );
    push @config, { $cols[1] => $cols[0] };
}

use YAML;
print Dump \@config;

sub eval_path {
    my ($path) = @_;
    $path =~ s{\$(\w+)}{ exists $ENV{$1} ? $ENV{$1} : $1 }ge;
    return $path;
}

__DATA__
$VAR1/$VAR2/$VAR3	Home sweet home
$VAR3/some_string/SOME_OTHER_STRING/and_so_on/$VAR2	Man oh man
$VAR2/$VAR1/some_string/some_string_2/some_string_3/some_string_n/$VAR2	Can't think of any other witty remarks ;-)

Output:

---
- Home sweet home: variable1/variable2/variable3
- Man oh man: variable3/some_string/SOME_OTHER_STRING/and_so_on/variable2
- Can't think of any other witty remarks ;-): variable2/variable1/some_string/some_string_2/some_string_3/some_string_n/variable2
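
As one possible alternative to try in place of eval_path: as written above, a $NAME with no matching entry in %ENV is replaced by the bare NAME, which drops the leading $. The sketch below keeps the original token intact instead, which matches the behaviour of the split/join version. The name eval_path_keep_unknown and the variable NO_SUCH_VAR are made up for this illustration.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Test values only; in real use these come from the environment.
@ENV{qw(VAR1 VAR2)} = qw(variable1 variable2);
delete $ENV{NO_SUCH_VAR};   # make sure this one is really unset

# Hypothetical variant of eval_path: unknown variables are left untouched.
sub eval_path_keep_unknown {
    my ($path) = @_;
    # $1 is the whole token (e.g. '$VAR1'), $2 is the bare name (e.g. 'VAR1').
    $path =~ s{(\$(\w+))}{ exists $ENV{$2} ? $ENV{$2} : $1 }ge;
    return $path;
}

print eval_path_keep_unknown('$VAR1/some_string/$NO_SUCH_VAR/$VAR2'), "\n";
```

Whether an unknown variable should be kept, dropped, or treated as an error depends on what the paths are used for afterwards.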
Sinan Ünür
I agree, just use a Config module rather than reinventing the wheel... IMHO the difference between a beginner programmer and an intermediate programmer is the intermediate programmer starts to have the mindset *"there is **no way** my problem has never been encountered by someone else before... I bet there is a library that already solves this for me."* The beginner just forges ahead thinking they are in uncharted territory.
Ether
Thanks Sinan. It was very helpful
sachin
+1  A: 

If you want to allow for full shell expansions, one option is to use the shell to do the expansion for you, perhaps via echo:

$ cat input
$FOO
bar
${FOO//cat/dog}
$ FOO=cat perl -wpe '$_ = qx"echo $_"' input
cat
bar
dog

If you cannot trust the contents of the environment variable, this introduces a security risk, as invoking qx on a string may cause the shell to invoke commands embedded in the string. As a result, this scriptlet will not run under taint mode (-T).
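
A small sketch of that risk, assuming a Unix-like system where qx hands the command to /bin/sh: a string containing a command substitution is executed, not just expanded. The helper shell_expand is hypothetical, and the "injected" command is a harmless echo. Since the shell is /bin/sh, bash-only forms such as ${FOO//cat/dog} depend on what /bin/sh is on the system.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Test value only; in real use this comes from the environment.
$ENV{FOO} = 'cat';

# shell_expand is a hypothetical helper wrapping the qx call shown above.
# It passes the text to the shell unmodified, which is exactly the risk.
sub shell_expand {
    my ($text) = @_;
    my $out = qx"echo $text";
    chomp $out;
    return $out;
}

print shell_expand('$FOO'), "\n";              # plain variable expansion
print shell_expand('$(echo injected)'), "\n";  # the embedded command is run by the shell
```

Anything that can be written as a shell command can take the place of the inner echo, so this approach is only reasonable when the input is fully trusted.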

William Pursell
+1  A: 

I think you just want to do this:

my @cols = map { s/(\$(\w+))/ $ENV{$2} || $1 /ge; $_ } split /\t/;

After the split, this takes each '$' followed by word characters and checks whether there is an environment variable with that name. If there is, the value is substituted; otherwise the text is left as it is.

  • The e switch on a substitution allows you to execute code for the replacement value.
  • If any environment variable may have the value '0', it is better to use the defined-or operator (//), which was introduced in Perl 5.10.

    my @cols = map { s|(\$(\w+))| $ENV{$2} // $1 |ge; $_ } split /\t/;
    

(Ignore the markup. // is a defined-or, not a C-comment)
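
A minimal sketch of the difference between the two versions, assuming an environment variable whose value is '0'. The names ZERO, expand_or and expand_dor are made up for this illustration.

```perl
#!/usr/bin/perl
use strict; use warnings;
use 5.010;   # defined-or (//) needs Perl 5.10 or later

# Test value only: '0' is defined but false in Perl.
$ENV{ZERO} = '0';

# Hypothetical helper using ||: a false value falls back to the original token.
sub expand_or {
    my ($s) = @_;
    $s =~ s/(\$(\w+))/ $ENV{$2} || $1 /ge;
    return $s;
}

# Hypothetical helper using //: only an undefined value falls back.
sub expand_dor {
    my ($s) = @_;
    $s =~ s|(\$(\w+))| $ENV{$2} // $1 |ge;
    return $s;
}

print expand_or('count=$ZERO'),  "\n";   # '0' is false, so $ZERO is left in place
print expand_dor('count=$ZERO'), "\n";   # '0' is defined, so it is substituted
```

The same applies to a variable set to the empty string: || leaves the token in place, while // substitutes the empty value.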

Axeman
Really cool to get it done via Regex
sachin