ansaurus

Question

How do I filter or retain duplicates in Perl?

Answer 1

+2 A:

The following is untested, but should print out only the duplicates.

my $line = "212 212 43 43 5689 6689 5689 71 81\n";
chomp $line;

my %seen;
my @order;
foreach my $elem (split /\s+/, $line) {
  ++$seen{$elem};
  push @order, $elem if $seen{$elem} == 2;
}

foreach my $elem (@order) {
  print "$elem " x $seen{$elem};
}
print "\n";

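To remove duplicates instead, print the keys of %seen once the first loop has run:

print "$_ " for keys %seen;

However, keys returns the hash keys in no particular order, so this does not retain the input order. To keep the order, record each element in an array the first time it is seen, much as @order does above for the duplicates, or use an ordered-hash module such as Tie::Hash::Indexed (thanks, daxim) or Tie::IxHash.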
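The order-preserving variant mentioned here can be sketched without any module by filtering with grep on a %seen hash. This is only a minimal sketch, assuming whitespace-separated input; uniq_in_order is a hypothetical helper name, not something from the answer.

```perl
use strict;
use warnings;

# Hypothetical helper: return each value once, in order of first appearance.
sub uniq_in_order {
    my %seen;
    # $seen{$_}++ is 0 (false) the first time a value appears, so the
    # negation keeps only first occurrences.
    return grep { !$seen{$_}++ } @_;
}

my $line = "212 212 43 43 5689 6689 5689 71 81";
print join(" ", uniq_in_order(split ' ', $line)), "\n";
# prints: 212 43 5689 6689 71 81
```

Because grep walks the list from left to right, the result keeps the input order, which a plain keys %seen does not guarantee.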

tsee 2010-05-07 10:04:36

Teaching an old dog a new trick: promote `Tie::Hash::Indexed` over `Tie::IxHash`.

daxim 2010-05-07 13:08:43

Hi, thanks for the help :) I modified it a bit, and the final code is below (I hope someone else will benefit from it too):

#!/usr/bin/perl
#
open (MYFILE, "FILENAME");
foreach $line (<MYFILE>) {
  chomp $line;
  my %seen;
  my @order;
  foreach my $elem (split /\s+/, $line) {
    ++$seen{$elem};
    push @order, $elem if $seen{$elem} == 2;
  }
  foreach my $elem (@order) {
    print "$elem " x $seen{$elem};
  }
  print "\n";
}
close (MYFILE);

Thank you all once again.

manu 2010-05-08 06:52:45
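For readers adapting the per-line script above, here is a hedged sketch of the same idea under strict and warnings, with a lexical filehandle and a three-argument open. The helper name dups_in_line and the in-memory sample input are assumptions made for illustration; a real script would open its own file in place of the \$input reference.

```perl
use strict;
use warnings;

# Hypothetical helper: return the repeated values of one line, grouped in
# the order in which each value was seen for the second time.
sub dups_in_line {
    my ($line) = @_;
    my (%seen, @order);
    for my $elem (split ' ', $line) {
        push @order, $elem if ++$seen{$elem} == 2;
    }
    # Repeat each duplicated value as many times as it occurred.
    return join ' ', map { ($_) x $seen{$_} } @order;
}

# Sample data held in memory so the sketch is self-contained (assumption);
# replace \$input with a file name to read a real file.
my $input = "212 212 43 43 5689 6689 5689 71 81\n"
          . "66 66 67 68 69 69 69 71 71 52\n";

open(my $fh, '<', \$input) or die "Cannot open input: $!";
while (my $line = <$fh>) {
    chomp $line;
    print dups_in_line($line), "\n";
}
close($fh);
```

Reading with while instead of foreach processes one line at a time rather than loading the whole file into a list first.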

Answer 2

A:

For the first part (removing duplicates while keeping the first occurrence of each value):

$ cat prog.pl
#! /usr/bin/perl -lp

my %seen;
$_ = join " " => map $seen{$_}++ ? () : $_ => split;

$ echo 212 212 43 43 5689 6689 5689 71 81 | ./prog.pl
212 43 5689 6689 71 81

For the second part (keeping only the values that occur more than once):

$ cat prog.pl
#! /usr/bin/perl -lp

my %dups;
my @nums = split;
++$dups{$_} for @nums;

$_ = join " " => grep $dups{$_} > 1 => @nums;

$ cat input
212 212 43 43 5689 6689 5689 71 81
66 66 67 68 69 69 69 71 71 52

$ ./prog.pl input
212 212 43 43 5689 5689
66 66 69 69 69 71 71

Greg Bacon 2010-05-07 14:23:13
