How do I use map with the split function to trim the constituents of $line: $a, $b, $c and $d?

my ($a, $b, $c, $d, $e) = split(/\t/, $line);

# Perl trim function to remove whitespace from the start and end of the string
sub trim($)
{
    my $string = shift;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}
+2  A: 

This should work:

my ($a, $b, $c, $d, $e) = map {trim ($_)} (split(/\t/, $line));

By the way, it's a minor point, but you should not use $a and $b as variable names: they are the special package variables that sort uses in its comparison block.

Kinopiko
+2  A: 

map takes two inputs:

  • an expression or block: this would be the trim expression (you don't have to write your own -- it's on CPAN)
  • and a list to operate on: this should be split's output:
use String::Util 'trim';
my @values = map { trim($_) } split /\t/, $line;
Ether
I'm nervous about introducing a dependence on a module which says "Final version. As of this version String::Util is no longer under development or being supported."
Kinopiko
If we are going to install a CPAN module, we might as well use the one that does the job the best: [`String::Strip`](http://p3rl.org/String::Strip). See http://www.illusori.co.uk/perl/2010/03/05/advanced_benchmark_analysis_1.html
daxim
I haven't tried it, but the acid test for these kinds of modules is whether they strip out things like U+3000 (ideographic space) from the string. If not, then maybe it is not a good replacement. Glancing at the source code, String::Strip uses the C function `isspace` to strip spaces and has no awareness of Unicode, so it will behave differently from the above.
Kinopiko
+3  A: 

Don't use prototypes (the ($) on your function) unless you need them.

my ( $a, $b, $c, $d, $e ) =
  map { s/^\s+|\s+$//g; $_ }    ## Notice the `; $_` -- this is common
  split(/\t/, $line, 5)
;

Don't forget that in the above, s/// returns the replacement count -- not $_ -- so we return $_ explicitly as the last statement of the block.
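
To see what the trailing $_ changes, here is a minimal sketch. The sample $line and the names @counts and @trimmed are made up for illustration and are not part of the question:

```perl
use strict;
use warnings;

# Hypothetical sample input: three tab-separated fields with stray spaces.
my $line = "  x \t y\tz  ";

# Without a trailing $_, map collects the return value of s///,
# which is the number of substitutions made in each field.
my @counts = map { s/^\s+|\s+$//g } split /\t/, $line;

# With a trailing $_, map collects the modified string instead.
my @trimmed = map { s/^\s+|\s+$//g; $_ } split /\t/, $line;

print "counts:  @counts\n";
print "trimmed: ", join("|", @trimmed), "\n";
```

Every field in this sample has some whitespace to remove, so each substitution succeeds at least once.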

or more simply:

my @values = map { s/^\s+|\s+$//g; $_ } split(/\t/, $line, 5);
Evan Carroll
why the down vote?
Evan Carroll
I don't know why the downvote, but you've forgotten the g at the end in the final line.
Kinopiko
Why do you recommend against prototypes?
10rd_n3r0
Read this: http://stackoverflow.com/questions/297034/why-are-perl-function-prototypes-bad . Add to that the fact that no one else uses them, and that they add line noise. You only need them if you're trying to create a level of sugar that gives your code a different, non-Perlish look or feel. They don't really handle much else, and they don't work at all on methods.
Evan Carroll
+1  A: 

Just for variety:

my @trimmed = grep { s/^\s*|\s*$//g } split /\t/, $line;

grep acts as a filter on lists. This is why the \s+s need to be changed to \s*s inside the regex. Forcing matches on 0 or more spaces prevents grep from filtering out items in the list that have no leading or trailing spaces.
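
A small sketch of that difference, assuming a made-up $line in which the first field has no surrounding whitespace (the names @with_plus and @with_star are illustrative only):

```perl
use strict;
use warnings;

# Hypothetical input: "spoon" has nothing to trim.
my $line = "spoon\t fork \tknife ";

# With \s+, the substitution fails on "spoon", returns false,
# and grep drops that field.
my @with_plus = grep { s/^\s+|\s+$//g } split /\t/, $line;

# With \s*, the substitution always matches (possibly an empty string),
# so grep keeps every field.
my @with_star = grep { s/^\s*|\s*$//g } split /\t/, $line;

print "with \\s+: ", join("|", @with_plus), "\n";
print "with \\s*: ", join("|", @with_star), "\n";
```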

Zaid
But it wouldn't include segments that were surrounded by tabs with no spaces. `"\tspoon\t"` would be omitted.
Axeman
@Axeman : From [`perlretut`](http://perldoc.perl.org/perlretut.html): "`\s` matches a whitespace character, the set `[\ \t\r\n\f]` and others." Besides, aren't we splitting on `\t` here ;)?
Zaid
@Zaid, yes--but never mind, my eyes replaced `\s*` with my usual `\s+`. So the subst always matches and I don't know what I'm talking about. :D
Axeman
+2  A: 

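You can also use "foreach" here. The loop variable is an alias for each element of the list, so assigning to it updates the original variable.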

foreach my $i ($a, $b, $c, $d, $e) {
  $i=trim($i);
}
Alexandr Ciornii
A: 

When I trim a string, I don't often want to keep the original. It would be nice to have the abstraction of a sub but also not have to fuss with temporary values.

It turns out that we can do just this, as perlsub explains:

Any arguments passed in show up in the array @_. Therefore, if you called a function with two arguments, those would be stored in $_[0] and $_[1]. The array @_ is a local array, but its elements are aliases for the actual scalar parameters. In particular, if an element $_[0] is updated, the corresponding argument is updated (or an error occurs if it is not updatable).

In your case, trim becomes

sub trim {
  for (@_) {
    s/^ \s+  //x;
    s/  \s+ $//x;
  }
  wantarray ? @_ : $_[0];
}

Remember that map and for are cousins, so with the loop in trim, you no longer need map. For example

my $line = "1\t 2\t3 \t 4 \t  5  \n";    
my ($a, $b, $c, $d, $e) = split(/\t/, $line);    

print "BEFORE: [", join("] [" => $a, $b, $c, $d), "]\n";
trim $a, $b, $c, $d;
print "AFTER:  [", join("] [" => $a, $b, $c, $d), "]\n";

Output:

BEFORE: [1] [ 2] [3 ] [ 4 ]
AFTER:  [1] [2] [3] [4]
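
The aliasing that perlsub describes can also be seen in a smaller sketch. The sub below is the same idea as the trim above (without the /x spacing), and the variables $first, $second, $name, $clean and $ok are made up for illustration:

```perl
use strict;
use warnings;

# Same approach as above: modify the caller's variables through @_.
sub trim {
  for (@_) {
    s/^\s+//;
    s/\s+$//;
  }
  wantarray ? @_ : $_[0];
}

# Void context: the arguments themselves are trimmed in place.
my ($first, $second) = ("  left", "right  ");
trim($first, $second);

# Scalar context: the first argument is returned, and $name is trimmed too.
my $name  = "  padded  ";
my $clean = trim($name);

# A string literal is not updatable, so the substitution dies;
# eval traps the error and $ok stays undefined.
my $ok = eval { trim("  literal  "); 1 };

print "[$first] [$second] [$name] [$clean]\n";
print "literal call ", ($ok ? "succeeded" : "failed"), "\n";
```

If you need to trim a literal or keep the original value, copy it into a variable first and pass the copy.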
Greg Bacon
Could you explain the '=>' usage in join? I've never seen that before.
10rd_n3r0
@10rd_n3r0, I'll take a stab at it. The first parameter in join is a special one, and the others are all treated the same; the "fat comma" ( `=>` ) just gives better visual separation than `,`. I've used it myself for that reason, such as in this case: `keyword => qw<a list of words>`. Nothing differentiates them in the list created, but I'm showing the way I'm thinking of them. I use it for those times when I want the visual separation to portray the semantics.
Axeman
@10rd I used it for visual separation as @Axeman described. As I was writing it up, I originally had `join(", " => ...)`, and whenever the separator contains a comma, I like to use the fat comma for readability.
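
A quick sketch of the point made in these comments, assuming a made-up @fields list: with a quoted string on the left, `=>` and `,` build the same argument list, so both calls to join return the same string.

```perl
use strict;
use warnings;

# Hypothetical list of already-trimmed fields.
my @fields = (1, 2, 3, 4);

# Plain comma between the separator and the list.
my $with_comma = join("] [", @fields);

# Fat comma: same arguments, only the visual separation differs.
my $with_fat = join("] [" => @fields);

print "[$with_comma]\n";
print "[$with_fat]\n";
```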
Greg Bacon
More readable, perhaps: `print "BEFORE: ". join ' ', map { '['.$_.']' } ($a, $b, $c, $d);`
Zaid