I have a string:

$mystring = "My cat likes to eat tomatoes.";

I want to do two replacements on this string with regex. I want to do s/cat/dog/ and s/tomatoes/pasta/. However, I don't know how to properly format the regular expression to do the multiple replacements in one expression, on one line, in one declaration. Right now, all I have is:

$mystring =~ s/cat/dog/ig;
$mystring =~ s/tomatoes/pasta/ig;
+6  A: 

Why would you want to?

I know some Perl-ers pride themselves on being able to write some of the most obfuscated code imaginable (see some of the code-golf questions on here), but that doesn't make it a smart thing to do.

Keep it readable and leave it as it is; you'll thank yourself in the long run.

EDIT:

Certainly, if you are looking at 5 or more replacements, please (for the love of god) use some kind of lookup table. But DO NOT try to write one massive regex that does it all.

NomeN
So, if I have to do multiple replacements (perhaps in the hundreds) there's no way to do those hundred replacements in one regex expression, therefore, I have to keep redefining the string for each replacement? :/
Rezn0r
I don't know if there is a way; I'm saying you shouldn't. If you have many different replacements, Bart's suggestion is far more readable and easier to change in the future than one massive regex.
NomeN
If you have hundreds of replacements, use Sinan Ünür's answer. But if you have only two, keep it simple instead.
Dave Hinton
Sinan's is pretty good too (at face value, as I'm not a Perl magician); both +1'ed. You'll need to look into these kinds of solutions, which separate the patterns from the matching and so keep both easy to understand.
NomeN
Certainly a concise lookup table is more maintainable than many, many lines of repetitive `s///` invocations.
Sinan Ünür
That depends entirely on the number of replacements, but if there are 5 or more I'd say yes, create a lookup table.
NomeN
@NomeN Agreed. I figured the OP's example was simplified.
Sinan Ünür
+1  A: 

One very rudimentary way to perform multiple substitutions in a single line would be to match the text with groupings. This will not let you find all instances of "cat" and replace them with "dog", and it only works when "cat" appears before "tomatoes", but it will get you to "My dog likes to eat pasta."

$mystring =~ s/(.*)cat(.*)tomatoes(.*)/$1dog$2pasta$3/g;
akf
+14  A: 

As usual, use a hash as a lookup table, match keys, replace with values:

#!/usr/bin/perl

use strict;
use warnings;

use Regex::PreSuf;

my %repl = (
    cat => 'dog',
    tomatoes => 'pasta',
);

my $string = "My cat likes to eat tomatoes.";
my $re = presuf( keys %repl );

# /i can match "Cat" or "CAT", so lowercase the match before the lookup
$string =~ s/($re)/$repl{ lc $1 }/ige;

print $string, "\n";

Output:

C:\Temp> t
My dog likes to eat pasta.

You could also use a loop:

for my $k ( keys %repl ) {
    $string =~ s/\Q$k/$repl{$k}/ig;
}
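
If installing Regex::PreSuf is not an option, something similar can probably be done with core Perl only. The following is a minimal sketch, not a drop-in replacement for the module: it joins the escaped keys with `|` (no prefix/suffix optimisation), and it assumes the hash keys are all lowercase so that a case-insensitive hit such as "Cat" can be mapped back with `lc`. The variable names `$alternation` and `$result` are just for illustration.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Lookup table; keys are assumed to be lowercase.
my %repl = (
    cat      => 'dog',
    tomatoes => 'pasta',
);

# Escape each key so it is matched literally, then build one alternation.
my $alternation = join '|', map { quotemeta($_) } keys %repl;
my $re          = qr/($alternation)/i;

my $string = "My Cat likes to eat TOMATOES.";

# With /e the replacement is evaluated as code, so "lc $1" maps
# "Cat" and "TOMATOES" back onto the lowercase hash keys.
( my $result = $string ) =~ s/$re/$repl{ lc $1 }/ge;

print $result, "\n";    # My dog likes to eat pasta.
```

Note that the replacement text does not preserve the case of the original word; that would need extra logic.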
Sinan Ünür
+1 once again nice and clean. I really like that...
bastianneu
Stupid name, but a nice module anyway.
innaM
@Manni yeah, I still have a hard time remembering the name of the module.
Sinan Ünür
+3  A: 

If the things you're looking for are regular expressions themselves, a direct lookup table as per @Sinan Ünür's answer won't work (the matched text is not equal to the pattern that matched it: '123' eq '\d+' is false).

You can use Regexp::Assemble to get around this limitation:

use strict;
use warnings;
use Regexp::Assemble;

my %replace = (
    'cat' => 'dog',
    '(?:tom|pot)atoes' => 'pasta',
);
my $re = Regexp::Assemble->new->track(1)->add(keys %replace);

my $str = 'My cat likes to eat tomatoes.';
while (my $m = $re->match($str)) {
    $str =~ s/$m/$replace{$m}/;
}
print $str, $/;

$str = 'My cat likes to eat potatoes.';
while (my $m = $re->match($str)) {
    $str =~ s/$m/$replace{$m}/;
}
print $str, $/;

Both of these blocks produce My dog likes to eat pasta.
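
For comparison, here is a rough core-Perl sketch of the same idea without the module. It is not how Regexp::Assemble works internally; it simply keeps an ordered list of pattern/replacement pairs (the hypothetical `@rules`), matches any of them in one pass, and then checks which pattern accounts for the matched text. `replace_all` is a made-up helper name.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Ordered list of [ pattern, replacement ] pairs.
my @rules = (
    [ qr/cat/,              'dog'   ],
    [ qr/(?:tom|pot)atoes/, 'pasta' ],
);

# One alternation made of all the patterns.
my $any = join '|', map { $_->[0] } @rules;

sub replace_all {
    my ($str) = @_;
    $str =~ s{($any)}{
        my $hit  = $1;    # save the match before running another regex
        # Find the first rule whose pattern matches the whole hit.
        my $rule = first { $hit =~ /\A$_->[0]\z/ } @rules;
        $rule->[1];
    }ge;
    return $str;
}

print replace_all('My cat likes to eat tomatoes.'), "\n";
print replace_all('My cat likes to eat potatoes.'), "\n";
```

This does a linear scan of the rules for every hit, so for hundreds of patterns the module is likely the better choice.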

dland
+1  A: 

You can do this the quick and dirty way, or the quick and clean way:

In both cases you need a hash mapping word => replacement.

With the quick and dirty way, you build the left part of the substitution by joining the keys of the hash with a '|'. In order to deal with overlapping words (e.g. 'cat' and 'catogan') you need to place the longest option first, by doing a reverse sort on the keys of the hash. You still can't deal with meta-characters in the words to replace (e.g. 'cat++').
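
As a side note, the meta-character limitation can probably be lifted without a module by escaping each key with `quotemeta`, and the ordering can be made explicit by sorting on length rather than relying on reverse alphabetical order. A minimal sketch, with example data and variable names (`$alternative`, `$out`) chosen only for illustration:

```perl
use strict;
use warnings;

my %repl = (
    'cat'       => 'dog',
    'catogan'   => 'doggie',
    'felines++' => 'wolves--',
    'tomatoes'  => 'pasta',
);

# Longest keys first, so 'catogan' is tried before 'cat';
# quotemeta makes 'felines++' match literally.
my $alternative = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) }
    keys %repl;

my $text = "My cat (catogan, felines++) likes to eat tomatoes.";
( my $out = $text ) =~ s/($alternative)/$repl{$1}/g;

print "$out\n";    # My dog (doggie, wolves--) likes to eat pasta.
```

This sketch is case-sensitive on purpose; with /i the matched text would have to be normalised before the hash lookup.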

The quick and clean way uses Regexp::Assemble to build the left part of the regexp. It deals natively with overlapping words, and it is simple to get it to deal with meta-characters in the words to replace.

Once you have the word to replace, you then replace it with the corresponding entry in the hash.

Below is a bit of code that shows the 2 methods, dealing with various cases:

#!/usr/bin/perl

use strict;
use warnings;

use Test::More tests => 6;

use Regexp::Assemble;

my $mystring = "My cat likes to eat tomatoes.";
my $expected = "My dog likes to eat pasta.";

my $repl;

# simple case
$repl= { 'cat' => 'dog', 'tomatoes' => 'pasta', };

is( 
    repl_simple($mystring, $repl), 
    $expected, 
    'look Ma, no module (simple)'
);  

my $re= regexp_assemble($repl);
is( 
    repl_assemble($mystring, $re), 
    $expected, 
    'with Regexp::Assemble (simple)'
);

# words overlap
$mystring = "My cat (catogan) likes to eat tomatoes.";
$expected = "My dog (doggie) likes to eat pasta.";

$repl= {'cat' => 'dog', 'tomatoes' => 'pasta', 'catogan'  => 'doggie', };

is( 
    repl_simple($mystring, $repl), 
    $expected, 
    'no module, words overlap'
);  

$re= regexp_assemble( $repl);
is( 
     repl_assemble($mystring, $re), 
     $expected, 
     'with Regexp::Assemble, words overlap'
);


# words to replace include meta-characters
$mystring = "My cat (felines++) likes to eat tomatoes.";
$expected = "My dog (wolves--) likes to eat pasta.";

$repl= {'cat' => 'dog', 'tomatoes' => 'pasta', 'felines++' => 'wolves--', };

is( 
    repl_simple($mystring, $repl), 
    $expected, 
    'no module, meta-characters in expression'
);  

$re= regexp_assemble( $repl);
is( 
    repl_assemble($mystring, $re), 
    $expected, 
    'with Regexp::Assemble, meta-characters in expression'
);

sub repl_simple { 
    my( $string, $repl)= @_;
    my $alternative= join( '|', reverse sort keys %$repl);
    $string=~ s{($alternative)}{$repl->{$1}}ig;
    return $string;
  }


sub regexp_assemble { 
    my( $repl)= @_;
    my $ra = Regexp::Assemble->new;
    foreach my $alt (keys %$repl)
      { $ra->add( '\Q' . $alt . '\E'); }
    return $ra->re;
  } 

sub repl_assemble { 
    my( $string, $re)= @_;
    $string=~ s{($re)}{$repl->{$1}}ig;
    return $string;
  }
mirod
+2  A: 

My suggestion is that you do this:

my $text               =  'My cat likes to eat tomatoes.';
# copy $text into $format, then turn each matched word into a %s placeholder
( my $format = $text ) =~ s/\b(cat|tomatoes)\b/%s/g;

Then you can just do this:

my $new_sentence = sprintf( $format, 'dog', 'pasta' );

As well as this:

$new_sentence    = sprintf( $format, 'tiger', 'asparagus' );
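
One thing to keep in mind with this approach is that sprintf fills the placeholders by position, not by which word was matched. The sketch below illustrates that; `make_format` is a hypothetical helper, and the extra step that doubles any literal `%` is my own addition so that such characters survive sprintf.

```perl
use strict;
use warnings;

# Turn a sentence into a sprintf format with one %s per matched word.
sub make_format {
    my ($format) = @_;
    $format =~ s/%/%%/g;                       # keep literal percent signs literal
    $format =~ s/\b(?:cat|tomatoes)\b/%s/g;    # each hit becomes a positional slot
    return $format;
}

# Words appear in the order cat, tomatoes: the arguments line up.
my $in_order = sprintf( make_format('My cat likes to eat tomatoes.'), 'dog', 'pasta' );

# Words appear in the order tomatoes, cat: the arguments end up swapped.
my $swapped = sprintf( make_format('Only tomatoes please my cat.'), 'dog', 'pasta' );

print "$in_order\n";    # My dog likes to eat pasta.
print "$swapped\n";     # Only dog please my pasta.
```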

I agree with the others: you shouldn't want to do it all in one expression, or on one line. But here is a way:

$text =~ s/\b(cat|tomatoes)\b/ ${{ qw<cat dog tomatoes pasta> }}{$1} /ge;
Axeman
This only works if you know which value is first ahead of time.
Brad Gilbert
You have just provided the OP with the sword to commit harakiri with. I hope you can live with your conscience ;-).
NomeN