I have the following use case. The input file contains lines such as:

Line1 : AA BB CC DD EE

I want to replace this with

1 2 3 4 5

Output

Line1: 1 2 3 4 5

Can I do this with a single regular expression in Perl?

I tried the following, but it was unsuccessful:

my @arr1 = ("AA", "BB", "CC", "DD", "EE");
open F2, $file;
my $count = 0;
while (<F2>) {
    my $str = $_;
    $str =~ s/$arr[$count]/$count+1/g;
    print to file
}

close(F2);

This doesn't do the trick. Any ideas?

+2  A: 

If I understand correctly, you want to replace every word with a number that is incremented by 1 after each word. Here is a program with tests:

#!/usr/bin/perl

use strict;
use warnings;
use Test::More qw(no_plan);

sub replace {
  my $str=shift;
  my $count=1;
  $str=~s/\w+/$count++/ge;
  return $str;
}


is(replace('AA AA DD EE'),'1 2 3 4');
is(replace('A B C D E'),'1 2 3 4 5');
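
Note that \w+ also matches the Line1 label in the question's sample input, so the function above would number the label as well. A minimal sketch that leaves the label alone, assuming every line has the form "label : words" (the helper name replace_after_label is made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: number only the words that follow the first colon.
sub replace_after_label {
    my ($line) = @_;
    my ($label, $rest) = $line =~ /\A([^:]*:)(.*)\z/s
        or return $line;          # no colon: leave the line untouched
    my $count = 1;
    $rest =~ s/\w+/$count++/ge;   # same counter trick as in replace()
    return $label . $rest;
}

print replace_after_label("Line1 : AA BB CC DD EE"), "\n";   # Line1 : 1 2 3 4 5
```

Splitting the line first keeps the substitution itself as simple as the one above; only the text after the colon is ever passed to it.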
Alexandr Ciornii
what does Test::More do here?
Nathan Fellman
Nathan: to test how it works
Alexandr Ciornii
@Nathan: Test::More imports the is() function? ;) But I agree with you, a simple "print replace('AA BB CC DD');" example would have done, the Test::More doesn't add anything.
I do that with Test::More all the time. It doesn't output much when you're right, and outputs a lot when you're wrong. It's a lot easier to see that you messed up.
brian d foy
+2  A: 

You need to do something to modify the file in place, which you are not currently doing. The easiest option would be to use File::Inplace (or to output to a second file).

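Additionally, you are looping over the lines of the file, not over the array, and $count never changes, so on every line the code only tries to replace $arr[0]. Without the /e modifier, the replacement is also the literal text 0+1 rather than 1.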

  use strict;
  use warnings;
  use File::Inplace;

  my @replacees = ("AA", "BB", "CC", "DD", "EE");
  my $editor = File::Inplace->new(file => "file.txt", regex => "\n");
  while (my ($line) = $editor->next_line) {
    my $count = 1;    # number that replaces the current word
    for my $replacee (@replacees) {
        # s///g is a no-op when the word is absent, so no separate match test is needed
        $line =~ s/\Q$replacee\E/$count/g;
        $count++;
    }
    $editor->replace_line($line);
  }
  $editor->commit;
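
The "output to a second file" option mentioned above can be sketched with core Perl only. This is a rough sketch rather than a drop-in replacement for File::Inplace: the helper name rewrite_file and the .tmp suffix are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: write the modified lines to a temporary file,
# then rename it over the original.
sub rewrite_file {
    my ($path, @replacees) = @_;
    my $tmp = "$path.tmp";    # assumed to be safe to overwrite

    open my $in,  '<', $path or die "Cannot read $path: $!";
    open my $out, '>', $tmp  or die "Cannot write $tmp: $!";
    while (my $line = <$in>) {
        my $count = 1;
        for my $replacee (@replacees) {
            $line =~ s/\Q$replacee\E/$count/g;   # \Q...\E matches the word literally
            $count++;
        }
        print {$out} $line;
    }
    close $in;
    close $out or die "Cannot close $tmp: $!";
    rename $tmp, $path or die "Cannot rename $tmp to $path: $!";
    return;
}
```

It would be called as rewrite_file("file.txt", "AA", "BB", "CC", "DD", "EE"). The original file is only replaced once the temporary file has been written and closed successfully.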
Vinko Vrsalovic
+2  A: 

As for writing to the same file, see Vinko's answer. As for replacing the strings, consider this snippet:

my @arr1 = ("AA", "BB", "CC", "DD", "EE");
my %replacements = map { ($arr1[$_] => $_ + 1) } (0..$#arr1);
my $regexp = join( '|', sort { length($b) <=> length($a) } @arr1);

open F2, $file;
while (<F2>) {
    my $str = $_;
    $str =~ s/($regexp)/$replacements{$1}/ge;
    print $str;
}
close(F2);

Important parts:

my %replacements = map { ($arr1[$_] => $_ + 1) } (0..$#arr1);

This builds a hash whose keys are the elements of @arr1 and whose values are each element's index in @arr1 plus 1.

For example, for @arr1 = ("a", "b", "d", "c"); %replacements will be: ("a" => 1, "b" => 2, "c" => 4, "d" => 3);

my $regexp = join( '|', sort { length($b) <=> length($a) } @arr1);

This builds the base regexp for finding all words from @arr1. The sort orders the words by length, longest first. So, for @arr1 = ("a", "ba", "bac") $regexp will be 'bac|ba|a'.

This ordering is important because otherwise there would be problems whenever one word is a prefix of another (as with "ba" and "bac" in my example): the alternation takes the first alternative that matches, so the shorter word would win.
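
To see the effect of the ordering, here is a small sketch using the "a", "ba", "bac" example. The helper name substitute is made up for illustration, and quotemeta is added as a precaution in case a word contains regexp metacharacters.

```perl
use strict;
use warnings;

my @words  = ("a", "ba", "bac");
my %number = map { $words[$_] => $_ + 1 } 0 .. $#words;

# Hypothetical helper: substitute using the alternatives in the given order.
sub substitute {
    my ($text, @alternatives) = @_;
    my $regexp = join '|', map { quotemeta($_) } @alternatives;
    $text =~ s/($regexp)/$number{$1}/g;
    return $text;
}

my @by_length = sort { length($b) <=> length($a) } @words;

print substitute("bac", @words),     "\n";   # a|ba|bac gives "2c"
print substitute("bac", @by_length), "\n";   # bac|ba|a gives "3"
```

With the unsorted alternation, "ba" matches first and the trailing "c" is left behind; with the longest word first, the whole of "bac" is replaced.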

As a last word, bareword filehandles such as F2 are rather discouraged, as they are globals and cause "interesting" problems in more complex programs. Instead, use open like this:

open my $fh, 'filename';

or better yet:

open my $fh, '<', 'filename';
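
It is also worth checking whether open succeeded, since it returns false and sets $! on failure. A minimal sketch (the helper name read_lines is an assumption for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: read all lines of $path through a lexical filehandle.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path
        or die "Cannot open '$path' for reading: $!";
    my @lines = <$fh>;
    close $fh;
    return @lines;
}
```

Because $fh is lexical, it is visible only inside read_lines and cannot clash with a filehandle of the same name elsewhere in the program.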
depesz
A: 

First, a correction:

while (<F2>) {
    my $str = $_;

If you want the line read to end up in $str, there is no reason to involve $_ in the process:

while ( my $str = <F2> ) {

which also brings up the point made by depesz that you should use lexical filehandles rather than package global bareword filehandles.

Now, looking at your loop:

my $count = 0;
while (my $str = <$input_fh>) {
    $str =~ s/$arr[$count]/$count+1/g;
    # ...
}

there seems to be an implicit assumption that the file cannot have more lines than @arr1 has elements. In that case, you need not use $count: $. would do just fine. Say you are on the second line. Your code says you want to replace all occurrences of BB on that line with 2, which is different from what you describe verbally.
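
To make that reading concrete, here is a sketch of what a loop driven by $. would do. It reads from an in-memory string so it can be run without a data file; the helper name replace_by_line_number is made up for illustration.

```perl
use strict;
use warnings;

my @arr1 = ("AA", "BB", "CC", "DD", "EE");

# Hypothetical helper: on line N, replace the N-th word of @arr1 with N.
sub replace_by_line_number {
    my ($text) = @_;
    open my $input_fh, '<', \$text or die "Cannot open in-memory file: $!";
    my $result = '';
    while (my $str = <$input_fh>) {
        my $line_no = $.;                  # current line number of $input_fh
        if ($line_no <= @arr1) {
            my $word = $arr1[$line_no - 1];
            $str =~ s/\Q$word\E/$line_no/g;
        }
        $result .= $str;                   # later lines pass through unchanged
    }
    close $input_fh;
    return $result;
}

print replace_by_line_number("AA BB\nAA BB\n");
```

For the two input lines "AA BB", this prints "1 BB" and then "AA 2": each line has only one word replaced, which is not the output asked for in the question.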

This is an important point: Any code you post ought to be consistent with the verbal description.

Anyway, here is one way:

rty.pl

#!/usr/bin/perl

use strict;
use warnings;

use File::Slurp;

my ($input) = @ARGV;

write_file(
    $input, [
        map { s/( ([A-Z]) \2 )/ord($2) - ord('A') + 1/gex; $_ } read_file $input
    ]
);
__END__

test.data:

Line1 : AA BB CC DD EE
Line1 : AA BB CC DD EE
Line1 : AA BB CC DD EE
Line1 : AA BB CC DD EE

$ rty.pl test.data

test.data after script invocation:

Line1 : 1 2 3 4 5
Line1 : 1 2 3 4 5
Line1 : 1 2 3 4 5
Line1 : 1 2 3 4 5
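
The substitution can also be tried on a single string, without File::Slurp. This sketch relies on the same assumption as the script: every word to replace is one capital letter repeated twice, and its number is that letter's position in the alphabet. The helper name number_doubled_letters is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a doubled capital letter into its alphabet position.
sub number_doubled_letters {
    my ($line) = @_;
    # /x lets the pattern contain spaces; /e evaluates the replacement as code
    $line =~ s/ ([A-Z]) \1 /ord($1) - ord('A') + 1/gex;
    return $line;
}

print number_doubled_letters("Line1 : AA BB CC DD EE"), "\n";   # Line1 : 1 2 3 4 5
```

Words that are not a doubled letter, such as Line1 or AB, are left untouched.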
Sinan Ünür