What I want to do is check an array of strings against my search string and get the corresponding key so I can store it. Is there a magical way of doing this with Perl, or am I doomed to using a loop? If so, what is the most efficient way to do this?

I'm relatively new to Perl (I've only written 2 other scripts), so I don't know a lot of the magic yet, just that Perl is magic =D

Reference Array: (1 = 'Canon', 2 = 'HP', 3 = 'Sony')
Search String: Sony's Cyber-shot DSC-S600
End Result: 3
+3  A: 

UPDATE:

Based on the discussion of this question, and depending on what you count as "not using a loop", the map-based solution below (see "Option #1") may be the most concise one, provided that you don't consider map a loop. (The short version of the answers: it's a loop as far as implementation and performance go, but it's not a loop from a language-theory point of view.)


Assuming you don't care whether you get "3" or "Sony" as the answer, you can do it without a loop in a simple case, by building a regular expression with "or" logic (|) from the array, like this:

my @strings = ("Canon", "HP", "Sony"); 
my $search_in = "Sony's Cyber-shot DSC-S600"; 
my $combined_search = join("|",@strings); 
my @which_found = ($search_in =~ /($combined_search)/); 
print "$which_found[0]\n";

Result from my test run: Sony

The regular expression will (once the variable $combined_search is interpolated by Perl) take the form /(Canon|HP|Sony)/ which is what you want.

This will NOT work as-is if any of the strings contain regex special characters (such as | or a parenthesis). In that case you need to escape them, for example with quotemeta or \Q...\E.
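As a minimal sketch of that escaping step, the variant below runs each string through quotemeta before joining. The third array entry and the search string are made up for illustration, to show a name containing regex metacharacters.

```perl
use strict;
use warnings;

# Hypothetical data: the last entry contains regex special characters.
my @strings   = ("Canon", "HP", "C++ (beta)");
my $search_in = "Written in C++ (beta) last year";

# quotemeta backslash-escapes every non-word character, so each entry
# is matched literally instead of being read as regex syntax.
my $combined_search = join "|", map { quotemeta } @strings;

# List assignment captures the first parenthesized group, or nothing
# if the match fails.
my ($which_found) = $search_in =~ /($combined_search)/;

print defined $which_found ? "$which_found\n" : "no match\n";
```

Without the quotemeta call, the "+" and the parentheses in the third entry would be read as a quantifier and a group, and the pattern would either fail to compile or match something other than the literal text.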

NOTE: I personally consider this somewhat cheating, because in order to implement join(), Perl itself must do a loop somewhere inside the interpreter. So this answer may not satisfy your desire to remain loop-less, depending on whether you wanted to avoid a loop for performance reasons or to have cleaner or shorter code.


P.S. To get "3" instead of "Sony", you will have to use a loop: either in the obvious way, by doing one match per array element inside a loop, or by using a library that saves you from writing the loop yourself but still runs a loop underneath the call.

I will provide 3 alternative solutions.

#1 option: My favorite. Uses "map", which I personally still consider a loop:

my @strings = ("Canon", "HP", "Sony"); 
my $search_in = "Sony's Cyber-shot DSC-S600"; 
my $combined_search = join("|",@strings); 
my @which_found = ($search_in =~ /($combined_search)/); 
die "Not found" unless @which_found;  # check before using $which_found[0]
print "$which_found[0]\n";
my $strings_index = 0;
my %strings_indexes = map {$_ => $strings_index++} @strings;
my $index = 1 + $strings_indexes{ $which_found[0] };
# Need to add 1 since arrays in Perl are zero-index-started and you want "3"

#2 option: Uses a loop hidden behind a nice CPAN library method:

use List::MoreUtils qw(firstidx);
my @strings = ("Canon", "HP", "Sony"); 
my $search_in = "Sony's Cyber-shot DSC-S600"; 
my $combined_search = join("|",@strings); 
my @which_found = ($search_in =~ /($combined_search)/); 
die "Not Found!" unless @which_found;
print "$which_found[0]\n";
my $index_of_found = 1 + firstidx { $_ eq $which_found[0] } @strings; 
# Need to add 1 since arrays in Perl are zero-index-started and you want "3"

#3 option: Here's the obvious loop way:

my $found_index = -1;
my @strings = ("Canon", "HP", "Sony"); 
my $search_in = "Sony's Cyber-shot DSC-S600"; 
foreach my $index (0..$#strings) {
    next if $search_in !~ /$strings[$index]/;
    $found_index = $index;
    last; # quit the loop early, which is why I didn't use "map" here
}
# Check $found_index against -1; and if you want "3" instead of "2" add 1.
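For comparison, the same early-exit search can be sketched with first from the core List::Util module, which stops at the first element for which the block is true. The helper name find_key is hypothetical, and this version uses index() for a literal substring test instead of a regex, so no escaping is needed.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Hypothetical helper: returns the 1-based position of the first entry
# of @$strings that occurs literally in $search_in, or undef if none does.
sub find_key {
    my ($search_in, $strings) = @_;
    # first() returns the first index for which the block is true,
    # and stops scanning as soon as it finds one.
    my $idx = first { index($search_in, $strings->[$_]) >= 0 } 0 .. $#{$strings};
    return defined $idx ? $idx + 1 : undef;
}

my @strings = ("Canon", "HP", "Sony");
my $key = find_key("Sony's Cyber-shot DSC-S600", \@strings);
print defined $key ? "$key\n" : "not found\n";   # prints 3
```

The defined check matters here: a match on the first array entry gives index 0, which is false in boolean context but still a valid result.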
DVK
@DVK Thanks for this detailed and informative answer :D This info is well written and useful.
Ben Dauphinee
@DVK I have one more question related to this. I would like to implement this with a 2 dimensional array of values to search through, but am not sure how to do that with any but option 3. Suggestions on this? (I edited the question to reflect the new array)
Ben Dauphinee
@Ben - you may want to create it as a new question... (link to this one), so people can benefit in terms of searchability.
DVK
@DVK Done: http://stackoverflow.com/questions/3032373/
Ben Dauphinee
+1  A: 

An easy way is just to use a hash and regex:

my $search = "your search string";
my %translation = (
    'canon' => 1,
    'hp'    => 2,
    'sony'  => 3
);

for my $key ( keys %translation ) {
    if ( $search =~ /$key/i ) {
        return $translation{$key};  # assumes this loop sits inside a sub
    }
}

Naturally the return can just as easily be a print. You can also surround the entire thing in a while loop with:

while(my $search = <>) {
    # $search now holds one line per iteration, read from STDIN or from files named on the command line
}
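Putting the two pieces together, a minimal sketch might wrap the lookup in a sub (so that return is legal) and call it once per input line. The name lookup_key is hypothetical. The keys are sorted only to make the search order predictable, since Perl does not guarantee hash key order, and \Q...\E is added in case a key ever contains regex special characters.

```perl
use strict;
use warnings;

my %translation = (
    'canon' => 1,
    'hp'    => 2,
    'sony'  => 3,
);

# Hypothetical helper: returns the value for the first key (in sorted
# order) found case-insensitively in $search, or undef if none matches.
sub lookup_key {
    my ($search) = @_;
    for my $key (sort keys %translation) {
        return $translation{$key} if $search =~ /\Q$key\E/i;
    }
    return;
}

# Read one search string per line from STDIN.
while (my $search = <STDIN>) {
    chomp $search;
    my $key = lookup_key($search);
    print defined $key ? "$search => $key\n" : "$search => no match\n";
}
```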

Please also take a look at Perl's regex features in perlre and at Perl's data structures in perlref.

EDIT

As was just pointed out to me, you were trying to steer away from using a loop. Another method would be to use Perl's map function. Take a look here.

stocherilac
The OP specifically indicated "or am I doomed to using a loop?" - which to me sounds like he knows he can do it in a loop and is looking for a non-loop answer. I could be mis-reading him
DVK
Thanks for pointing that out, completely missed it.
stocherilac
Heh... of course map can be considered a loop in disguise :)
DVK
+2  A: 

Here is a solution that builds a regular expression with embedded code to increment the index as perl moves through the regex:

my @brands = qw( Canon HP Sony );
my $string = "Sony's Cyber-shot DSC-S600";

use re 'eval';  # needed to use the (?{ code }) construct

my $index = -1;
my $regex = join '|' => map "(?{ \$index++ })\Q$_" => @brands;

print "index: $index\n" if $string =~ $regex;

# prints "index: 2" (since Perl's array indexing starts with 0)

The string that is prepended to each brand first increments the index, and then tries to match the brand (escaped with quotemeta (as \Q) to allow for regex special characters in the brand names).

When the match fails, the regex engine moves past the alternation | and then the pattern repeats.

If you have multiple strings to match against, be sure to reset $index before each. Or you can prepend (?{$index = -1}) to the regex string.

Eric Strom
A: 

You can also take a look at Regexp::Assemble, which will take a collection of sub-regexes and build a single super-regex from them that can then be used to test for all of them at once (and gives you the text which matched the regex, of course). I'm not sure that it's the best solution if you're only looking at three strings/regexes that you want to match, but it's definitely the way to go if you have a substantially larger target set - the project I initially used it on has a library of some 1500 terms that it's matching against and it performs very well.

Dave Sherohman