Repost from Perlmonks for a coworker:

I wrote a Perl script to separate long lists of email names separated by semicolons. I would like to combine the split with the trimming of whitespace so that I don't need two arrays. Is there a way to trim while loading the first array? The output is a sorted list of names. Thanks.

#!/pw/prod/svr4/bin/perl
use warnings;
use strict;

my $file_data =
  'Builder, Bob  ;Stein, Franklin MSW; Boop, Elizabeth PHD   Cc: Bear, Izzy';
my @email_list;

$file_data =~ s/CC:/;/ig;
$file_data =~ s/PHD//ig;
$file_data =~ s/MSW//ig;

my @tmp_data = split( /;/, $file_data );

foreach my $entry (@tmp_data) {
    $entry =~ s/^[ \t]+|[ \t]+$//g;
    push( @email_list, $entry );
}

foreach my $name ( sort(@email_list) ) {
    print "$name \n";
}
+10  A: 

If you don't need to trim the first and final element, this will do the trick:

@email_list = split /\s*;\s*/, $file_data;

If you do need to trim the first and final element, trim $file_data first, then repeat as above. :-P

Chris Jester-Young
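
For what it's worth, here is a minimal sketch of the "trim first, then split" variant described above. The helper name `split_trimmed` and the sample string are assumptions made for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: trim the whole string once, then let the split
# pattern absorb the whitespace around every semicolon.
sub split_trimmed {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;                 # trim both ends of the string
    my @parts = split /\s*;\s*/, $text;      # whitespace around ';' is consumed
    return @parts;
}

my @names = split_trimmed('  Builder, Bob  ;Stein, Franklin ; Boop, Elizabeth  ');
print "[$_]\n" for sort @names;
# prints:
# [Boop, Elizabeth]
# [Builder, Bob]
# [Stein, Franklin]
```

Whitespace inside a name (after the comma) is left alone, because the pattern only matches whitespace that touches a semicolon.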
+2  A: 

Well, you can do what Chris suggested, but it doesn't handle leading and trailing spaces in $file_data.

You can add handling of these like this:

$file_data =~ s/\A\s+|\s+\z//g;

Also, please note that the second array is not necessary. Check this:

my $file_data = 'Builder, Bob  ;Stein, Franklin MSW; Boop, Elizabeth PHD   Cc: Bear, Izzy';
my @email_list;

$file_data =~ s/CC:/;/ig;
$file_data =~ s/PHD//ig;
$file_data =~ s/MSW//ig;

my @tmp_data = split( /;/, $file_data );

foreach my $entry (@tmp_data) {
    $entry =~ s/^[ \t]+|[ \t]+$//g;
}

foreach my $name ( sort(@tmp_data) ) {
    print "$name \n";
}
depesz
A: 

Barring some minor syntax error, this should do the whole job for you. Oh, list operations, how beautiful you are!

print join (" \n", sort { $a <=> $b } map { s/^[ \t]+|[ \t]+$//g } split (/;/, $file_data));
Kristoffon
Having map return the result of s/// is not very useful. Try map { s/...//g; $_ }
ysth
And you probably don't mean a numeric sort.
ysth
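
Putting the two comments together, a sketch of the one-liner might look like the following. The helper name `sorted_names` and the sample data are hypothetical, chosen only to make the example runnable.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: split, trim, then string-sort.
sub sorted_names {
    my ($text) = @_;
    # The map block ends with $_ so it returns the trimmed string rather
    # than the substitution count; plain sort compares as strings.
    my @sorted = sort map { s/^\s+|\s+$//g; $_ } split /;/, $text;
    return @sorted;
}

print join( "\n", sorted_names('Builder, Bob  ;Stein, Franklin ; Boop, Elizabeth   ; Bear, Izzy') ), "\n";
# prints:
# Bear, Izzy
# Boop, Elizabeth
# Builder, Bob
# Stein, Franklin
```

On Perl 5.14 or later, the `/r` modifier of `s///` returns the modified copy directly, so the trailing `$_` in the map block would not be needed there.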
+7  A: 

You don't have to do both operations in one go using the same function. Sometimes performing the actions separately is clearer. That is, split first, then strip the whitespace off each element (and then sort the result):

@email_list =
    sort(
        map {
                s/\s*(\S+)\s*/\1/; $_
            }
            split ';', $file_data
    );

EDIT: Stripping more than one part of a string at the same time can lead to pitfalls, e.g. Sinan's point below about leaving trailing spaces in the "Elizabeth" portion. I coded that snippet with the assumption that the name would not have internal whitespace, which is actually quite wrong and would have stood out as incorrect if I had consciously noticed it. The code is much improved (and also more readable) below:

@email_list =
    sort(
        map {   
                s/^\s+//;  # strip leading spaces
                s/\s+$//;  # strip trailing spaces
                $_         # return the modified string
            }
            split ';', $file_data
    );
Ether
First, you should use `$1` rather than `\1` in the replacement part of `s///`. Second, this leaves trailing spaces in the names: `'Bear,+ Izzy'`, `'Boop,Elizabeth '`, `'Builder,Bob '`, `'Stein,Franklin '`
Sinan Ünür
Apparently comments cannot contain multiple spaces, but there are four spaces after *Elizabeth*.
Sinan Ünür
> First... Second... very good points! Edited response above.
Ether
+1 Thanks for making the corrections.
Sinan Ünür
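
To make the pitfall discussed in this thread concrete, here is a small comparison sketch. Both helper names are hypothetical, and the one-pass version uses `$1` as suggested in the comments.

```perl
use strict;
use warnings;

# Hypothetical helper: the single-substitution approach from the first snippet.
sub strip_one_pass {
    my ($s) = @_;
    $s =~ s/\s*(\S+)\s*/$1/;    # only touches the first run of non-space characters
    return $s;
}

# Hypothetical helper: the two-substitution approach from the edited snippet.
sub strip_two_pass {
    my ($s) = @_;
    $s =~ s/^\s+//;             # strip leading whitespace
    $s =~ s/\s+$//;             # strip trailing whitespace
    return $s;
}

my $entry = ' Boop, Elizabeth    ';
printf "[%s]\n", strip_one_pass($entry);    # prints [Boop,Elizabeth    ]
printf "[%s]\n", strip_two_pass($entry);    # prints [Boop, Elizabeth]
```

The one-pass pattern stops after the first word, so it eats the space after the comma and never reaches the trailing spaces.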
+1  A: 

my @email_list = map { s/^[ \t]+|[ \t]+$//g; $_ } split /;/, $file_data;

or the more elegant:

use Algorithm::Loops "Filter";
my @email_list = Filter { s/^[ \t]+|[ \t]+$//g } split /;/, $file_data;
ysth
A: 

My turn:

my @fields = grep { $_ } split m/\s*(?:;|^|$)\s*/, $record;

It trims the first and last elements as well. If grep is overkill for getting rid of the empty first element:

my ( undef, @fields ) = split m/\s*(?:;|^|$)\s*/, $record;

works if you know that the record starts with whitespace, but you rarely know that, so

my @fields = split m/\s*(?:;|^|$)\s*/, $record;
shift @fields unless $fields[0];

is the surest way to do it.

Axeman
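
A sketch of how that pattern behaves with and without leading whitespace may help here. The helper name `fields_of` is hypothetical. It also tests the first field against the empty string, which is a swapped-in check and not the one used above: `unless $fields[0]` and `grep { $_ }` would also discard a field that is the string `0`, since that is false in Perl.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the split pattern from the answer above.
sub fields_of {
    my ($record) = @_;
    my @fields = split m/\s*(?:;|^|$)\s*/, $record;
    # A positive-width match at the start of the string yields an empty
    # leading field; drop it only if it really is empty.
    shift @fields if @fields && $fields[0] eq '';
    return @fields;
}

for my $record ( ' a ; b ', 'a ; b' ) {
    print join( '|', fields_of($record) ), "\n";    # prints a|b both times
}
```

Without leading whitespace the pattern can only match zero characters at the start of the string, and split never produces an empty field for that, which is why the first field needs checking rather than unconditional removal.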
+1  A: 

See "How do I strip blank space from the beginning/end of a string?" in the Perl FAQ (perlfaq4).

@email_list = sort map {
    s/^\s+//; s/\s+$//; $_
} split ';', $file_data;

Now, note also that a for loop aliases each element of an array, so

@email_list = split ';', $file_data;

for (@email_list) {
    s/^\s+//;
    s/\s+$//;
}

@email_list = sort @email_list;    # sort after trimming so leading spaces cannot affect the order

would also work.

Sinan Ünür
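
As a rough illustration of the aliasing point, the sketch below trims an array in place through a reference. The helper name `trim_in_place` and the sample data are assumptions, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: $_ aliases each element of the referenced array,
# so the substitutions change the caller's data and no copy is made.
sub trim_in_place {
    my ($list) = @_;    # array reference
    for (@$list) {
        s/^\s+//;
        s/\s+$//;
    }
    return;
}

my @email_list = split /;/, 'Builder, Bob  ;Stein, Franklin ; Bear, Izzy';
trim_in_place( \@email_list );
print "[$_]\n" for sort @email_list;
# prints:
# [Bear, Izzy]
# [Builder, Bob]
# [Stein, Franklin]
```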