tags:

views:

204

answers:

3

I have two arrays. I need to check and see if the elements of one appear in the other one.

Is there a more efficient way to do it than nested loops? I have a few thousand elements in each and need to run the program frequently.

-Alex

+6  A: 

perlfaq4 to the rescue:

How do I compute the difference of two arrays? How do I compute the intersection of two arrays?

Use a hash. Here's code to do both and more. It assumes that each element is unique in a given array:

    @union = @intersection = @difference = ();
    %count = ();
    foreach $element (@array1, @array2) { $count{$element}++ }
    foreach $element (keys %count) {
            push @union, $element;
            push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element;
    }
mobrule
The code sample might not quite be the answer to your question, but the advice is right on: use a hash.
mobrule
Any particular reason why you've linked to the documentation on `CPAN` and not `perldoc`?
Zaid
1) http://search.cpan.org/perldoc/ covers all of CPAN, not just the core docs and modules. 2) I personally prefer the look and feel of the CPAN site to perldoc.perl.org. If you like perl.org better, then that's OK too.
mobrule
+2  A: 

You need to provide a lot more context. There are more efficient ways of doing that ranging from:

  • Go outside of Perl and use shell (sort + comm)

  • map one array into a Perl hash and then loop over the other one checking hash membership. This has linear complexity ("M+N" - basically loop over each array once) as opposed to nested loop which has "M*N" complexity)

    Example:

    my %second = map {$_=>1} @second;
    my @only_in_first = grep { !$second{$_} } @first; 
    # use a foreach loop with `last` instead of "grep" 
    # if you only want yes/no answer instead of full list
    
  • Use a Perl module that does the last bullet point for you (List::Compare was mentioned in comments)

  • Do it based on timestamps of when elements were added if the volume is very large and you need to re-compare often. A few thousand elements is not really big enough, but I recently had to diff 100k sized lists.

DVK
+1  A: 

Another way to do it is to use Array::Utils

    use Array::Utils qw(:all);

    my @a = qw( a b c d );
    my @b = qw( c d e f );

    # symmetric difference
    my @diff = array_diff(@a, @b);

    # intersection
    my @isect = intersect(@a, @b);

    # unique union
    my @unique = unique(@a, @b);

    # check if arrays contain same members
    if ( !array_diff(@a, @b) ) {
            # do something
    }

    # get items from array @a that are not in array @b
    my @minus = array_minus( @a, @b );
Nifle