Hi,

I have been trying to remove the text before and after a particular character or keyword in each line of a text file. Doing this by hand would be very hard, since the file contains 5000 lines and the text around the keyword has to be removed on every one of them. Any software that can do this would be great, as would a Perl script that runs on Windows. I run Perl scripts with ActivePerl, so a script that works there would be helpful.

Thanks

+2  A: 

You don't need separate software; you can make this part of your existing script. Use a multiline regex replace along the lines of /a(b)c/, where the parentheses capture b so that you can refer back to it in the replacement as $1. Without knowing more about the text you're working with, it's hard to guess what the actual pattern would be.
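
A minimal sketch of that idea, assuming the part to keep is the literal placeholder word `keyword` and that everything else on the line should go. The sub name `keep_capture` and the sample line are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: keep only what the capture group matched.
# The pattern has the shape /a(b)c/, where a and c are "anything on
# this line" and b is the part to keep; $1 refers back to b.
sub keep_capture {
    my ($line) = @_;
    $line =~ s/.*(keyword).*/$1/;
    return $line;
}

print keep_capture("text1 text2 keyword text3"), "\n";   # prints: keyword
```

A line that does not contain the keyword is returned unchanged, because the substitution simply fails to match.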

annakata
+1  A: 

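I'd say that if $text contains your whole text, you can do:

$text =~ s/^.*(keyword1|keyword2).*$/$1/mg;

The m modifier makes ^ and $ match at the beginning and end of each line, not just at the beginning and end of the whole string. The g modifier repeats the substitution for every matching line instead of stopping after the first one.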
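
A small sketch of these modifiers at work on a multi-line string. It combines /m with /g, so the substitution is repeated on every line rather than stopping after the first match. The sub name `keep_keyword_per_line` and the sample text are invented for the example:

```perl
use strict;
use warnings;

# Hypothetical helper: reduce every line that contains one of the
# keywords to that keyword alone.
sub keep_keyword_per_line {
    my ($text) = @_;
    # /m: ^ and $ match at each line boundary inside the string.
    # /g: apply the substitution to every matching line.
    $text =~ s/^.*(keyword1|keyword2).*$/$1/mg;
    return $text;
}

my $sample = "aaa keyword1 bbb\nccc keyword2 ddd\nno match here\n";
print keep_keyword_per_line($sample);
# prints:
# keyword1
# keyword2
# no match here
```

Because the leading `.*` is greedy, a line holding several keywords keeps the last one.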

mat
I think this one won't work, because the `.*$` won't match newlines.
Leon Timmermans
Of course it won't match the newlines; that's the point of '/m'. Wasn't that what was asked for?
mat
. will match anything but a newline. $ will match after a newline or at the end of a string. If there is a newline between them (as will usually be the case), it won't match.
Leon Timmermans
If you want to match newlines, use the appropriate modifier. No biggie.
slim
+2  A: 

Presuming that you have the following:

text1 text2 keyword text3 text4 text5 keyword text6 text7

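and what you want is to keep only the two keywords and the text between them (keyword text3 text4 text5 keyword), then use

s/.*?keyword(.*?)keyword.*/keyword$1keyword/;

Otherwise, you can just replace the whole line with keyword.

An example of your data would help us give a clearer answer.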
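
As a rough check of the substitution above against the sample line, here is a short sketch. The sub name `keep_between` is hypothetical, and `keyword` stands in for whatever the real marker is:

```perl
use strict;
use warnings;

# Hypothetical helper: keep the first two occurrences of the keyword and
# whatever lies between them, dropping the text on either side.
sub keep_between {
    my ($line) = @_;
    # The lazy .*? stops at the first keyword; (.*?) stops at the next one.
    $line =~ s/.*?keyword(.*?)keyword.*/keyword$1keyword/;
    return $line;
}

print keep_between("text1 text2 keyword text3 text4 text5 keyword text6 text7"), "\n";
# prints: keyword text3 text4 text5 keyword
```

A line with fewer than two keywords does not match and is left as it is.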

Xetius
+3  A: 

I'd use this:

$text =~ s/ .*? (keyword) .* /$1/gx;
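
A brief sketch of how this behaves when $text holds several lines, assuming `keyword` is a literal placeholder. The sub name `strip_around_keyword` and the sample text are invented:

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution above.
sub strip_around_keyword {
    my ($text) = @_;
    # /x ignores the spaces inside the pattern, so they are only there
    # for readability. Without /s, "." never crosses a newline, so each
    # match stays within one line; /g moves on to the following lines.
    $text =~ s/ .*? (keyword) .* /$1/gx;
    return $text;
}

print strip_around_keyword("a keyword b\nc keyword d keyword e\nplain line\n");
# prints:
# keyword
# keyword
# plain line
```

The lazy `.*?` stops at the first keyword on a line, and the greedy `.*` after it removes the rest of that line, including any further keywords.
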
Leon Timmermans
A: 

Assuming you want to remove all text to the left of keyword1 and all text to the right of keyword2:

while (<>) {
  s/.*(keyword1)/$1/;
  s/(keyword2).*/$1/;
  print;
}

Put this into a Perl script and run it like this:

perl fix.pl original.txt > new.txt

Or, if you just want to do this in place, perhaps on several files at once:

perl -i.bak -pe 's/.*(keyword1)/$1/; s/(keyword2).*/$1/;' original.txt original2.txt

This edits the files in place and keeps each original under a .bak extension. The -p option wraps the code in an implicit while loop that runs the substitutions on each line and then prints it. In the Windows command prompt, enclose the code in double quotes instead of single quotes.

To be safe, run it without the -i option first, or at the very least on only one file.
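
For anyone who prefers a script file to a one-liner, the same in-place edit can be sketched as below. It relies on the special variable `$^I`, which switches on in-place editing for the `<>` operator in the same way the -i option does. The sub name `trim_files_in_place` is made up, and keyword1 and keyword2 are placeholders:

```perl
use strict;
use warnings;

# Hypothetical helper doing what the one-liner does, written out in full.
sub trim_files_in_place {
    my @files = @_;
    # $^I = '.bak' enables in-place editing with a .bak backup;
    # @ARGV tells <> which files to read.
    local ($^I, @ARGV) = ('.bak', @files);
    while (<>) {
        s/.*(keyword1)/$1/;   # drop everything to the left of keyword1
        s/(keyword2).*/$1/;   # drop everything to the right of keyword2
        print;                # written to the edited file, not the screen
    }
    return;
}

# Only touch files that were named on the command line.
trim_files_in_place(@ARGV) if @ARGV;
```

Saved under a name of your choosing, it would be run as `perl yourscript.pl original.txt original2.txt`.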

Jørn Jensen