ansaurus

Question

Perl pattern matching

Answer 1

A:

You can do:

open F, "<", "input.txt" or die "Can't open input.txt: $!"; # try to open the file.
while (<F>) { # loop through each line of the file.
        last if /^message1_end$/; # stop when the end marker is found.
        $message .= $_ if $start; # append the line to the message.
        $start = 1 if /^message1$/; # start appending after the start marker.
}
close F;

print $message;

codaddict 2010-08-24 09:49:51

Answer 2

+3 A:

Maybe this works for you.

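open(YOURFILE, "<", "./input.txt") or die "Can't open ./input.txt: $!";
while (<YOURFILE>) {
        # the range operator is true from the start marker through the end marker
        if (/message1/ .. /message1_end/) {
                print;
        }
}
close(YOURFILE);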

Janne Pikkarainen 2010-08-24 09:54:32

This also prints the markers, not only the text between them.

Piotr Maj 2010-08-24 10:01:04

it is not working for me

Senthil kumar 2010-08-24 12:05:22

How is it not working? :-)

Janne Pikkarainen 2010-08-24 12:12:53
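
As the first comment points out, the range operator is also true on the marker lines themselves. A minimal sketch of one way to drop them, relying on the documented return value of the scalar range operator: a sequence number that starts at 1 and has "E0" appended on the last line of the range. The helper name between_markers and the sample lines are made up for illustration and are not part of the answer.

```perl
use strict;
use warnings;

# Hypothetical helper: returns only the lines strictly between the markers.
sub between_markers {
    my @lines = @_;
    my @body;
    for (@lines) {
        # In scalar context, .. returns a sequence number while inside the range
        my $seq = /^message1$/ .. /^message1_end$/;
        next unless $seq;          # outside the block
        next if $seq == 1;         # first line of the range: start marker
        next if $seq =~ /E0$/;     # last line of the range: end marker
        push @body, $_;
    }
    return @body;
}

my @sample = ('junk', 'message1', 'first line', 'second line', 'message1_end', 'tail');
print "$_\n" for between_markers(@sample);
```

This assumes the start and end markers never share a line and that every block is closed before the input ends, because the operator keeps its state between calls.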

Answer 3

A:

Another approach, if the input file fits into memory:

#!/usr/bin/perl

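local $/ = undef; # slurp mode: read the whole file in one go
open FILE, "<", "input.txt" or die "Couldn't open file: $!";
$string = <FILE>;
close FILE;

# /s lets . match newlines, /m lets ^ and $ match at line boundaries;
# the non-greedy .*? stops at the first end marker
print $1 if ($string =~ /^message1\n(.*?)^message1_end$/sm);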

Piotr Maj 2010-08-24 10:08:30

Answer 4

+4 A:

use strict;
use warnings;

open my $fh, '<', 'filename' or die "can't open 'filename' for reading : $!";
while(<$fh>) {
    chomp;
    if(/^message1$/ .. /^message1_end$/) {
        print $_,"\n" unless($_ eq 'message1' or $_ eq 'message1_end');
    }
}
close $fh;

M42 2010-08-24 10:20:10

Congratulations for posting the only answer to use lexical file-handles, the three-argument open and to use proper error handling!

Ether 2010-08-24 16:18:55

@Ether: Thanks a lot

M42 2010-08-24 17:17:28

Answer 5

+1 A:

I don't think we'll get a perfect answer to this question since it's so vague, but here goes.

As perldoc explains, you can use capture buffers to simplify your job. In short, you can reference capture groups (the parts inside parentheses) from within the regular expression itself, in the same way as you do after the match. Inside the pattern you just reference them with a backslash (\1) instead of a dollar sign ($1).

This code assumes that the entire buffer to search is available in memory. If you want to work on a line-by-line basis instead, you'll need a tag counter (or a similar mechanism) to handle nested blocks (presuming a message block can itself contain message blocks).

#!/usr/bin/perl
use warnings;
use strict;

my $buf = 'this is a junk line
this is a junk line2
this is a junk line3
message1
this is first line of text
this is second line of text
this is third line of text
this is fourth line of text
this is fifth line of text
message1_end
the next line';

if($buf =~m/(message\d)(.*?)(\1_end)/sg) {
    my $message = $2;
    # ...
}

Here, \d matches a single digit (see the perldoc link) and \1 evaluates to the same text as $1 ("message1"). As the end marker differs from the beginning marker only by the "_end" suffix, we use the captured beginning marker to build the end marker we're looking for. By doing this, the code works for any of the messages ("message1", "message2", ...).
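
To collect every message rather than only the first, the same backreference pattern can be run in a loop with the /g flag. The following is only a sketch under a few assumptions: the markers sit alone on their own lines, blocks are not nested, and the helper name extract_messages and the sample buffer are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns a hash reference mapping each marker
# name ("message1", "message2", ...) to the text between its markers.
sub extract_messages {
    my ($text) = @_;
    my %messages;
    # /m: ^ and $ match at line boundaries; /s: . also matches newlines.
    # \1 repeats whatever (message\d) captured, so each start marker is
    # paired with its own "_end" marker.
    while ($text =~ /^(message\d)\n(.*?)^\1_end$/msg) {
        $messages{$1} = $2;
    }
    return \%messages;
}

my $sample = join "\n",
    'junk', 'message1', 'first body', 'message1_end',
    'more junk', 'message2', 'second body', 'another line', 'message2_end', 'tail';

my $found = extract_messages($sample);
print "$_:\n$found->{$_}" for sort keys %$found;
```

In scalar context each /g match resumes where the previous one ended, so the loop visits the blocks one after another.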

gamen 2010-08-24 11:01:28
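
The line-by-line variant with a tag counter that Answer 5 mentions is not shown in the thread. A rough sketch of what it might look like, assuming the markers sit alone on their lines and that only message1 blocks nest; the helper name outer_block and the sample lines are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines inside the outermost
# message1 ... message1_end block, keeping nested blocks intact.
sub outer_block {
    my @lines = @_;
    my $depth = 0;    # the "tag counter": number of currently open blocks
    my @body;
    for my $line (@lines) {
        if ($line =~ /^message1$/) {
            push @body, $line if $depth > 0;   # a nested start marker is content
            $depth++;
        }
        elsif ($line =~ /^message1_end$/) {
            next if $depth == 0;               # ignore a stray end marker
            $depth--;
            last if $depth == 0;               # the outermost block is closed
            push @body, $line;                 # a nested end marker is content
        }
        elsif ($depth > 0) {
            push @body, $line;                 # ordinary line inside the block
        }
    }
    return @body;
}

my @sample_lines = ('junk', 'message1', 'a', 'message1', 'b',
                    'message1_end', 'c', 'message1_end', 'tail');
print "$_\n" for outer_block(@sample_lines);
```

The counter goes up on every start marker and down on every end marker, so the block only ends when the end marker that brings it back to zero is reached.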
