Does anyone have a solution to the task of processing a multi-line string one line at a time, other than the string-as-a-filehandle solution shown below?

use Carp qw(croak);    # croak() below comes from Carp

my $multiline_string = "line one\nline two\nline three\nline four";
my $filehandle;
open( $filehandle, '<', \$multiline_string )
    or croak("Can't open multi-line string as a filehandle: $!");
while ( defined (my $single_line = <$filehandle>) ) {
    # do some processing of $single_line here ...
}
close( $filehandle );

My reason for not wanting to use a filehandle is pretty weak. Test::Perl::Critic whines when I have more than 10 source lines between my open command and my close command on any filehandle. I'm doing quite a bit of processing of $single_line so I actually have about 40 lines of code between my open call and my close call and I don't see any way to bring that down to 10.

And I don't really want to ignore the Perl::Critic test in my build because that's actually a decent test that I'd like to pass whenever I'm opening an actual disk file in my code.

+3  A: 

What about:

my $multiline_string = "line one\nline two\nline three\nline four";
my @lines = split(/\n/,$multiline_string);
foreach my $line (@lines) {
    #do stuff with string
}
Tom
+3  A: 

I might be missing something, but could you do:

my @lines = split(/\n/,$multiline_string);
foreach my $single_line (@lines) {
  ...
}
Salgar
Don't forget you can process a multiline string with regexps using the /m or /s option, as described in perldoc perlre -- this may be easier than splitting on \n, depending on what you're searching for.
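A minimal sketch of the /m idea, assuming all you need from each line is one captured field (the pattern and the @words name are illustrative only):

```perl
use strict;
use warnings;

my $multiline_string = "line one\nline two\nline three\nline four";

# With /m, ^ and $ match at the start and end of every line;
# in list context, /g returns every capture from the whole string at once.
my @words = $multiline_string =~ /^line (\w+)$/mg;

print "@words\n";
```

This avoids an explicit loop, but it only suits cases where each line can be handled by a single pattern.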
Ether
+3  A: 

Long before I even knew you could shoehorn a multiline string into a filehandle, there was split:

foreach my $single_line (split /\n/, $multiline_string) {
    # process $single_line here
    # although note that it doesn't end in a newline anymore
}

Insert disclaimer about using literal and non-portable \n here.
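One reading of that disclaimer is that the data may carry \r\n line endings. A hedged sketch, assuming Perl 5.10 or later so that \R is available (the mixed-ending sample string is made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical input mixing Unix (\n) and Windows (\r\n) line endings.
my $multiline_string = "line one\r\nline two\nline three\r\nline four";

# \R matches any generic line break and treats \r\n as a single unit.
my @lines = split /\R/, $multiline_string;

print scalar(@lines), " lines\n";
print "[$_]\n" for @lines;
```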

mobrule
+7  A: 

Make the Perl Critic happy, and make yourself even happier, by creating a subroutine, and calling it with each line of the file.

use strict; use warnings;

my $multiline_string = "line one\nline two\nline three\nline four";

sub do_something {
    my ($line) = @_;
    # do something with $line
}

open my $fh, '<', \$multiline_string
    or die "Cannot open scalar for reading: $!";

while(<$fh>) {
    chomp;
    do_something($_);
}

close $fh;
Jonathan Feinberg
This is definitely the right way to do it. However, you should **always** check if open succeeded and use the 3-arg form of open with lexical filehandles.
Sinan Ünür
Or at least **always** use 3-arg open. You can "use autodie" (along with strict and warnings) if you don't want to bother with checking whether your opens succeed.
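A short sketch of the autodie variant, assuming a Perl recent enough to ship autodie in core (the $count variable is only there to show that the loop ran):

```perl
use strict;
use warnings;
use autodie;    # open and close now throw an exception on failure

my $multiline_string = "line one\nline two\nline three\nline four";

open my $fh, '<', \$multiline_string;    # no "or die" needed under autodie

my $count = 0;
while ( my $line = <$fh> ) {
    chomp $line;
    $count++;    # real per-line processing would go here
}

close $fh;
print "$count lines\n";
```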
Dave Sherohman
I thank my editors. I haven't used Perl for 4 years, so...
Jonathan Feinberg
A: 

You could use a regex.

#!/usr/bin/perl

use strict;
use warnings;

my $s = "line one\nline two\nline three\nline four";

while ($s =~ m'^(.*)$'gmc) {    # /c keeps pos() intact after the final failed match
    print "'$1'\n";
}

die "Exited loop too early\n" unless pos($s) == length $s;

Or you could use split:

for my $line ( split m'\n', $s ){

  # ...

}
Brad Gilbert
The regular expression approach is best IMHO. You do not need `\G` and `/m`. Use: `while ( $s =~ /(.+?)\n/g ) {`. `split` is wasteful because it would mean keeping two copies of essentially the same data in memory.
Sinan Ünür
*, not + there, or you'd skip empty lines. And ? is useless. \n belongs in the capture to be more like the filehandle read way.
ysth
And while \G may be unneeded, I'd keep it; when you expect to consume all string piecemeal, it's best to enforce it (with m/\G.../gc and a pos() check after the loop) so you don't accidentally miswrite your regex and lose some of the data (like your + instead of *).
ysth
@ysth Note that the OP's string does not end with a `\n`. To process that string correctly `+` would be needed and `\n` would have to be optional.
Sinan Ünür
@Sinan Ünür: then you'd need /\G(?:.*\n|.+)/gc (or some variant; many ways to do it). But I wouldn't be surprised if the real data had a newline at the end.
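Putting the \G, /gc, and pos() suggestions together, a minimal sketch might look like the following. It assumes Perl 5.10+ for the // operator, and the sample string (with one empty line and no trailing newline) is made up to exercise both alternatives:

```perl
use strict;
use warnings;

# Hypothetical sample: contains an empty line and has no trailing newline.
my $s = "line one\nline two\n\nline four";

my @lines;

# \G anchors each match where the previous one ended;
# /c leaves pos() in place when the last attempt fails.
while ( $s =~ /\G(.*\n|.+)/gc ) {
    push @lines, $1;    # newline is kept, as with a filehandle read
}

# If the pattern stalled partway, pos() falls short of the string length.
die "Exited loop too early\n" unless ( pos($s) // 0 ) == length $s;

print scalar(@lines), " lines\n";
```

The first alternative handles newline-terminated lines, including empty ones; the second picks up a final line that lacks a newline.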
ysth
A: 

Personally I like using $/ to separate the lines in a multiline string.

my $multiline_string = "line one\nline two\nline three\nline four";
foreach (split($/, $multiline_string)) {
  process_file($_);
}
sub process_file {
  my $filename = shift;
  my $filehandle;
  open( $filehandle, '<', $filename )
      or croak("Can't open multi-line string as a filehandle: $!");
  while ( defined (my $single_line = <$filehandle>) ) {
      process_line($single_line);
  }
  close( $filehandle );
}
sub process_line {
  my $line = shift;
  ...
}
dlamblin
+5  A: 

Um, isn't the purpose of the whine to get you to have smaller blocks of code that do just one thing? Make a subroutine that does what's needed for each line.

Many people have suggested split /\n/. split /^/ is more like the filehandle way.
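A small sketch of the difference, relying on the documented behavior that split treats /^/ as /^/m (the @pieces name is illustrative):

```perl
use strict;
use warnings;

my $multiline_string = "line one\nline two\nline three\nline four";

# split /^/ breaks the string before the start of each line, so every
# piece keeps its trailing newline, just as a filehandle read would return it.
my @pieces = split /^/, $multiline_string;

for my $single_line (@pieces) {
    chomp( my $copy = $single_line );    # strip the newline only for display
    print "[$copy]\n";
}
```

With split /\n/ the newlines are discarded instead, so code that expects to chomp each line would see no difference, but code that re-emits the lines would.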

ysth
+2  A: 

Perl::Critic is nice, but when you start obsessing about some of its arbitrary requirements, it starts to waste your time rather than save it. I just let the filehandle go out of scope and don't worry about the close:

 my $multiline_string = "line one\nline two\nline three\nline four";

 {
     open( my $fh, '<', \$multiline_string )
         or croak("Can't open multi-line string as a filehandle: $!");
     while ( defined (my $single_line = <$fh>) ) {
         # do some processing of $single_line here ...
     }
 }

A lot of people reach for regexes or split, but I think that's sloppy. You don't need to create a new list and use up a lot more memory in your program.

brian d foy