I have an array in Perl:

@my_array = ("one","two","three","two","three");

How do I remove the duplicates from the array?

A: 

It's been a while since I used Perl...

But how about going through the array with a loop and checking for duplicates? I know it's not ideal: if the array is big, it could take some time.

Does it have to be an array? You could try a hash table instead.

Or combine the two: take the elements of the array one by one and try to insert each into a hash table. If the key is already there, it is a duplicate, and you can remove it from the array.
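
A minimal sketch of that combined idea, assuming the goal is a new list in first-seen order rather than deleting elements in place. In Perl, assigning to an existing hash key does not fail, so the duplicate check is done explicitly with `exists`. The names `%seen` and `@unique` are only illustrative.

```perl
use strict;
use warnings;

my @my_array = ("one", "two", "three", "two", "three");

my %seen;     # keys are the elements encountered so far
my @unique;   # elements in first-seen order

for my $item (@my_array) {
    next if exists $seen{$item};   # already in the hash: a duplicate
    $seen{$item} = 1;
    push @unique, $item;
}

print join(" ", @unique), "\n";    # one two three
```

Each hash lookup takes roughly constant time, so this avoids rescanning the array for every element.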

Biri
+18  A: 

You can do something like this:

sub uniq {
    return keys %{{ map { $_ => 1 } @_ }};
}

@my_array = ("one","two","three","two","three");
print join(" ", @my_array), "\n";
print join(" ", uniq(@my_array)), "\n";

That destroys the original order of the items in the array, though.

Update: Here's an easier to understand function that preserves the original order:

sub uniq2 {
    my %seen = ();
    my @r = ();
    foreach my $a (@_) {
        unless ($seen{$a}) {
            push @r, $a;
            $seen{$a} = 1;
        }
    }
    return @r;
}
Greg Hewgill
please don't use $a or $b in examples as they are the magic globals of sort()
szabgab
It's a `my` lexical in this scope, so it's fine. That being said, possibly a more descriptive variable name could be chosen.
ephemient
+4  A: 

My usual way of doing this is:

my %unique = ();
foreach my $item (@myarray)
{
    $unique{$item} ++;
}
my @myuniquearray = keys %unique;

If you add the items to a hash, you also get the bonus of knowing how many times each item appears in the list.
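
As a rough illustration of that bonus, the sketch below fills the same kind of counting hash and prints each item with its count. The sample data is taken from the question; the `sort` is only there because `keys` returns items in no particular order.

```perl
use strict;
use warnings;

my @myarray = ("one", "two", "three", "two", "three");

# Count how many times each item occurs
my %unique;
$unique{$_}++ for @myarray;

# keys come back in no particular order, so sort for stable output
for my $item (sort keys %unique) {
    print "$item appears $unique{$item} time(s)\n";
}
```

With this input, "one" is counted once and "two" and "three" twice each.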

Xetius
+30  A: 

The Perl documentation comes with a nice collection of FAQs. Your question is frequently asked:

% perldoc -q duplicate

The answer, copied and pasted from the output of the command above, appears below:

Found in /usr/local/lib/perl5/5.10.0/pods/perlfaq4.pod
 How can I remove duplicate elements from a list or array?
   (contributed by brian d foy)

   Use a hash. When you think the words "unique" or "duplicated", think
   "hash keys".

   If you don't care about the order of the elements, you could just
   create the hash then extract the keys. It's not important how you
   create that hash: just that you use "keys" to get the unique elements.

       my %hash   = map { $_, 1 } @array;
       # or a hash slice: @hash{ @array } = ();
       # or a foreach: $hash{$_} = 1 foreach ( @array );

       my @unique = keys %hash;

   If you want to use a module, try the "uniq" function from
   "List::MoreUtils". In list context it returns the unique elements,
   preserving their order in the list. In scalar context, it returns the
   number of unique elements.

       use List::MoreUtils qw(uniq);

       my @unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 1,2,3,4,5,6,7
       my $unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 7

   You can also go through each element and skip the ones you've seen
   before. Use a hash to keep track. The first time the loop sees an
   element, that element has no key in %Seen. The "next" statement creates
   the key and immediately uses its value, which is "undef", so the loop
   continues to the "push" and increments the value for that key. The next
   time the loop sees that same element, its key exists in the hash and
   the value for that key is true (since it's not 0 or "undef"), so the
   next skips that iteration and the loop goes to the next element.

       my @unique = ();
       my %seen   = ();

       foreach my $elem ( @array )
       {
         next if $seen{ $elem }++;
         push @unique, $elem;
       }

   You can write this more briefly using a grep, which does the same
   thing.

       my %seen = ();
       my @unique = grep { ! $seen{ $_ }++ } @array;
John Siracusa
http://perldoc.perl.org/perlfaq4.html#How-can-I-remove-duplicate-elements-from-a-list-or-array%3F
szabgab
John iz in mah anzers stealing mah rep!
brian d foy
I think you should get bonus points for actually looking the question up.
Brad Gilbert
+11  A: 

Install List::MoreUtils from CPAN

Then in your code:

use List::MoreUtils qw(uniq);

my @dup_list = qw(1 1 1 2 3 4 4);

my @uniq_list = uniq(@dup_list);
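
If installing from CPAN is not an option, recent versions of the core List::Util module (1.45 and later, which as far as I know ship with Perl 5.26 and up) export a `uniq` function as well. A sketch, assuming such a version is available on your system:

```perl
use strict;
use warnings;
use List::Util 1.45 qw(uniq);   # assumption: List::Util 1.45 or newer is installed

my @dup_list  = qw(1 1 1 2 3 4 4);
my @uniq_list = uniq(@dup_list);   # keeps the first occurrence of each value

print "@uniq_list\n";   # 1 2 3 4
```

The `use List::Util 1.45` line makes the script fail at compile time on older installations instead of silently lacking `uniq`.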
Ranguard
That's the answer! But I can only vote you up once.
Axeman
A: 

That last one was pretty good. I'd just tweak it a bit:

my @arr;
my @uniqarr;

foreach my $var ( @arr ){
   # compare with eq: /$var/ would also match substrings and treat $var as a regex
   if ( ! grep { $_ eq $var } @uniqarr ){
      push( @uniqarr, $var );
   }
}

I think this is probably the most readable way to do it.

A: 

@array is the list with duplicate elements

my %seen = ();
my @unique = grep { !$seen{$_}++ } @array;

Sreedhar