Hi All,

I have a text file like this:
this is a junk line
this is a junk line2
this is a junk line3
message1
this is first line of text
this is second line of text
this is third line of text
this is fourth line of text
this is fifth line of text
message1_end
the next line

I want to start matching at message1, print the text between message1 and message1_end, and then stop matching.

How can I do this in Perl?

Thanks in advance

Senthil.

A: 

You can do:

open F,"<","input.txt" or die; # try to open the file.
while(<F>) { # loop through each line of the file.
        last if(/^message1_end$/); # stop once the end marker is found.
        $message.=$_ if($start); # append the line to the message.
        $start = 1 if(/^message1$/); # start appending after the start marker.
}
close F;

print $message;
codaddict
+3  A: 

Maybe this works for you.

open(YOURFILE,"./input.txt");
while (<YOURFILE>) {
        if (/message1/ .. /message1_end/) {
                printf "%s",$_;
        }
}
close(YOURFILE);
Janne Pikkarainen
this also prints markers, not only text between them
Piotr Maj
it is not working for me
Senthil kumar
How is it not working? :-)
Janne Pikkarainen
A: 

Another approach, if the input file fits into memory:

#!/usr/bin/perl

local $/=undef;
open FILE, "input.txt" or die "Couldn't open file: $!";
$string = <FILE>;
close FILE;

# Non-greedy match so that only the first block is captured.
print $1 if ($string =~ /message1\n(.*?)message1_end/s);
Piotr Maj
+4  A: 
use strict;
use warnings;

open my $fh, '<', 'filename' or die "can't open 'filename' for reading : $!";
while(<$fh>) {
    chomp;
    if(/^message1$/ .. /^message1_end$/) {
        print $_,"\n" unless($_ eq 'message1' or $_ eq 'message1_end');
    }
}
close $fh;
M42
Congratulations for posting the only answer to use lexical file-handles, the three-argument open and to use proper error handling!
Ether
@Ether: Thanks a lot
M42
+1  A: 

I don't think we'll get a perfect answer to this question, as it's so vague, but here goes.

As perldoc explains, you can use capture buffers (capture groups) to simplify your job. In short, you can reference a captured group (the text matched inside a pair of parentheses) from within the same regular expression, much as you do after the match has completed. Inside the pattern you reference it with a backslash (\1) instead of a dollar sign ($1).

This code assumes that you have the entire searchable buffer in memory. If you want to work line by line, you'll need a tag counter (or a similar mechanism) to handle nested blocks, presuming a message block can itself contain message blocks.

#!/usr/bin/perl
use warnings;
use strict;

my $buf = 'this is a junk line
this is a junk line2
this is a junk line3
message1
this is first line of text
this is second line of text
this is third line of text
this is fourth line of text
this is fifth line of text
message1_end
the next line';

if($buf =~m/(message\d)(.*?)(\1_end)/sg) {
    my $message = $2;
    # ...
}

Here, \d matches a single digit (see the perldoc link) and \1 evaluates to the same text as $1 ("message1"). Because the end marker differs from the beginning marker only by the "_end" suffix, we reuse the captured beginning marker to build the end marker we're looking for. This way the same pattern works for any message number ("message1", "message2", ...).
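
For the line-by-line variant mentioned above, a tag counter could look roughly like the sketch below. It is only an illustration under a few assumptions: markers sit alone on their own line, start markers look like "messageN" and end markers like "messageN_end", and the helper name extract_outer_blocks is made up for this example. It does not check that an end marker carries the same number as the start marker it closes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the lines found inside the outermost
# "messageN" ... "messageN_end" blocks. Nested blocks are kept intact,
# including their own markers.
sub extract_outer_blocks {
    my @lines = @_;
    my $depth = 0;    # number of blocks currently open
    my @inside;

    for my $line (@lines) {
        if ($line =~ /^message\d+_end$/) {
            $depth-- if $depth > 0;
            push @inside, $line if $depth > 0;    # end marker of a nested block
        }
        elsif ($line =~ /^message\d+$/) {
            push @inside, $line if $depth > 0;    # start marker of a nested block
            $depth++;
        }
        elsif ($depth > 0) {
            push @inside, $line;                  # ordinary line inside a block
        }
    }
    return @inside;
}

my @input = (
    'this is a junk line',
    'message1',
    'outer text',
    'message2',
    'inner text',
    'message2_end',
    'more outer text',
    'message1_end',
    'the next line',
);

print "$_\n" for extract_outer_blocks(@input);
```

When reading from a file, you would chomp each line and pass it through the same logic instead of building the list up front. The counter only drops back to zero when the outermost block closes, which is what lets nested blocks pass through untouched.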

gamen