I am trying to do search-and-replace using a regex in Perl.

The text I am searching for is:

<space>Number<space>NumberNumberNumber

and I want to replace it with:

<space>Number<space>NumberNumberNumberI

I have the following regex which works in finding the string:

\s[0-9]\s[0-9][0-9][0-9]

But what do I do about replacing the string? Basically I just want to append an 'I' to the end.

I'm using:

perl -pi -e "s/\s[0-9]\s[0-9][0-9][0-9]/I/;" testFile

but this is replacing the whole thing with I rather than appending to it.

A:

This is what backreferences are for. Just surround the section of text you want to capture with parentheses. The text matched by the first set of parentheses is available in $1, the second in $2, and so on.

s/(\s[0-9]\s[0-9]{3})/$1I/

With Perl 5.10 we gained named captures, so you can say

s/(?<bodytext>\s[0-9]\s[0-9]{3})/$+{bodytext}I/

The stuff in between < and > is the name. Names become keys in the %+ hash, and the values are the captured text.

Another solution is to use a zero-width positive look-behind

s/(?<=\s[0-9]\s[0-9]{3})/I/

or its shorthand \K, new in Perl 5.10

s/\s[0-9]\s[0-9]{3}\K/I/


Try

perl -pi -e 's/(\s[0-9]\s[0-9][0-9][0-9])/$1I/' filename

If you use double quotes, the $1 is interpolated by the shell before Perl ever sees it. If you have problems with something you think should work, it is a good idea to look at what Perl is actually seeing. You can do this with B::Deparse:

perl -MO=Deparse -pi -e "s/(\s[0-9]\s[0-9][0-9][0-9])/$1I/" filename

That will produce the following output.

BEGIN { $^I = ""; }
LINE: while (defined($_ = <ARGV>)) {
    s/(\s[0-9]\s[0-9][0-9][0-9])/I/;
}
continue {
    print $_;
}
-e syntax OK

From this we can see that $1 is missing. Let's try again with single quotes:

perl -MO=Deparse -pi -e 's/(\s[0-9]\s[0-9][0-9][0-9])/$1I/' filename
BEGIN { $^I = ""; }
LINE: while (defined($_ = <ARGV>)) {
    s/(\s[0-9]\s[0-9][0-9][0-9])/$1I/;
}
continue {
    print $_;
}
-e syntax OK

And once with escaping:

perl -MO=Deparse -pi -e "s/(\s[0-9]\s[0-9][0-9][0-9])/\$1I/" filename
BEGIN { $^I = ""; }
LINE: while (defined($_ = <ARGV>)) {
    s/(\s[0-9]\s[0-9][0-9][0-9])/$1I/;
}
continue {
    print $_;
}
-e syntax OK
Chas. Owens
that is still replacing the whole thing with I :(
Did you add the parens?
Chas. Owens
yeah this is my complete command perl -pi -e "s/(\s[0-9]\s[0-9][0-9][0-9])/$1I/" testFile.phy
Ah, you may also have a problem because you are using double quotes instead of single quotes. You need to either escape the $ or use single quotes.
Chas. Owens
single quote worked. gotta love perl!
are you a ninja?
Only on the computer.
Chas. Owens
I live like 10 mins from sterling! should do meetups! :) ...probably not the best place to socialize on SO. I'll follow you on twitter
For completeness, you should present the look-behind solution: s/(?<=\s[0-9]\s[0-9][0-9][0-9])/I/ and the new 5.10 \K solution: s/\s[0-9]\s[0-9][0-9][0-9]\K/I/
ysth
@ysth Yeah, I really haven't internalized \K yet. Still writing too much 5.8 code.
Chas. Owens