I'm a Perl newbie. I have some code in which a variable is assigned a series of values inside a foreach loop. I want to perform an operation on that variable only if its value is in a given array. What is the most efficient way to do this in Perl, given that the data I am working with is very large?

A simple example of my question: say I have an array of fruits I want:

@fruits_i_like = qw (mango banana apple);

I also have a $fruit variable in a foreach loop that takes fruit names from a data file containing many different kinds of fruit. How would I pick out only those values of $fruit that are in my @fruits_i_like array?

+9  A: 

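With Perl 5.10 or higher, you can use the smart match operator (note that Perl 5.18 and later flag it as experimental and warn when it is used):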

use strict;
use warnings;
use 5.10.0;
my @fruits_i_like = qw/mango banana apple/;
my $this_fruit = 'banana';
if ( $this_fruit ~~ \@fruits_i_like ) {
     say "yummy, I like $this_fruit!";
}

Before 5.10:

use strict;
use warnings;
my @fruits_i_like = qw/mango banana apple/;
my $this_fruit = 'banana';
if ( scalar grep $this_fruit eq $_, @fruits_i_like ) {
     print "yummy, I like $this_fruit!\n";
}

The downside of grep is that it walks the whole array even after a match has been found. If that is too slow, you can use any() from List::MoreUtils, which returns true as soon as one value matches and does not look at the rest of the array.

use strict;
use warnings;
use List::MoreUtils qw/any/;
my @fruits_i_like = qw/mango banana apple/;
my $this_fruit = 'banana';
if ( any { $this_fruit eq $_ } @fruits_i_like ) {
     print "yummy, I like $this_fruit!\n";
}
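
If installing List::MoreUtils is not an option, the same short-circuiting check can be sketched with the core List::Util module, assuming a version of at least 1.33 (older versions do not export any()). The second fruit, kiwi, is only there to show the non-matching branch.

```perl
use strict;
use warnings;
use List::Util qw/any/;   # any() needs List::Util 1.33 or later

my @fruits_i_like = qw/mango banana apple/;

# 'kiwi' is a made-up non-matching value, used only for illustration
for my $this_fruit (qw/banana kiwi/) {
    if ( any { $this_fruit eq $_ } @fruits_i_like ) {
        print "yummy, I like $this_fruit!\n";
    }
    else {
        print "no thanks, $this_fruit is not on my list\n";
    }
}
```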

Happy hacking!

mfontani
+7  A: 

This is effectively a lookup problem. It would be faster to look up each value in a hash such as %wantedFruits than to search the @fruits_i_like array: a hash lookup is O(1) on average, versus O(n) for scanning an array.

Convert the array to a hash, then test each line of the data file against it:

open my $data, '<', 'someBigDataFile.dat' or die "Unable to open file: $!";

my %wantedFruits;
@wantedFruits{@fruits_i_like} = ();  # All fruits_i_like entries are now keys

while (my $fruit = <$data>) {        # Iterates over data file line-by-line

     chomp $fruit;                              # Strip the trailing newline so the key can match
     next unless exists $wantedFruits{$fruit};  # Go to next entry unless wanted

     # ... code will reach this point only if you have your wanted fruit
}
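
A self-contained way to try this loop, as a rough sketch: it assumes one fruit name per line and reads from an in-memory string ($sample, a made-up stand-in for someBigDataFile.dat) so that it runs without any data file. Lines read from a filehandle keep their trailing newline, hence the chomp before the lookup.

```perl
use strict;
use warnings;

my @fruits_i_like = qw/mango banana apple/;

# Build the lookup hash once; only the keys matter, the values stay undef.
my %wantedFruits;
@wantedFruits{@fruits_i_like} = ();

# Hypothetical sample data standing in for someBigDataFile.dat
my $sample = "kiwi\nbanana\npear\nmango\n";
open my $data, '<', \$sample or die "Unable to open in-memory file: $!";

my @matched;
while (my $fruit = <$data>) {
    chomp $fruit;                               # drop the newline before the lookup
    next unless exists $wantedFruits{$fruit};   # skip anything that is not wanted
    push @matched, $fruit;                      # reached only for wanted fruits
}
close $data;

print "matched: @matched\n";   # prints "matched: banana mango"
```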
Zaid
+9  A: 

You can use a hash like this:

my %h = map {$_ => 1 } @fruits_i_like;
if (exists $h{$this_fruit}) {
    # do stuff
}

Here is a benchmark that compares this approach with mfontani's grep solution:

#!/usr/bin/perl 
use warnings;
use strict;
use Benchmark qw(:all);

my @fruits_i_like = qw/mango banana apple/;
my $this_fruit = 'banana';
my %h = map {$_ => 1 } @fruits_i_like;
my $count = -3;
my $r = cmpthese($count, {
    'grep' => sub {
         if ( scalar grep $this_fruit eq $_, @fruits_i_like ) {
             # do stuff
         }
    },
    'hash' => sub {
        if (exists $h{$this_fruit}) {
             # do stuff
        }
    },
});

Output:

          Rate grep hash
grep 1074911/s   -- -76%
hash 4392945/s 309%   --
M42
Change the `sub{}` to `q{}` and run that benchmark again. The subroutine call overhead can change the numbers too much.
tchrist
If you create %h just for this purpose, shouldn't it be part of the benchmark?
Øyvind Skaar
@Øyvind Skaar: I don't think so, because OP wants to match fruits many times. %h is created only once and used many times. It's different from the grep solution where the grep is done for every different fruit.
M42
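
tchrist's suggestion to use code strings instead of `sub{}` could look roughly like the sketch below. It is only one way to set it up: Benchmark evaluates code strings inside its own scope, so they cannot see file-scoped `my` variables, and the data is therefore declared with `our` and referenced by its fully qualified `main::` name. The count of -1 (at least one CPU second per snippet) is an arbitrary choice, and the resulting rates will differ from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Package variables, because code strings cannot see file-scoped "my" variables.
our @fruits_i_like = qw/mango banana apple/;
our $this_fruit    = 'banana';
our %h             = map { $_ => 1 } @fruits_i_like;   # built once, outside the timed code

# Negative count: run each snippet for at least 1 CPU second.
cmpthese(-1, {
    'grep' => q{ my $found = grep { $main::this_fruit eq $_ } @main::fruits_i_like; },
    'hash' => q{ my $found = exists $main::h{$main::this_fruit}; },
});
```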