




I have a text string that contains some duplicate characters (FFGGHHJKL). These can be made unique by using a positive lookahead:

$ perl -pe 's/(.)(?=.*?\1)//g'

For example, with "FFEEDDCCGG", the output is "FEDCG".

My question is how to make it work on numbers. For example, for 212 212 43 43 5689 6689 5689 71 81 the output should be 212 43 5689 6689 71 81. Also, what if we want only the duplicate records as output, from a file having n rows? For example, given this input:

212 212 43 43 5689 6689 5689 71 81
66 66 67 68 69 69 69 71 71 52

the output should be:

212 212 43 43 5689 5689
66 66 69 69 69 71 71

How can I do this?

+2  A: 

The following is untested, but should print out only the duplicates.

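my $line = "212 212 43 43 5689 6689 5689 71 81\n";
chomp $line;

my %seen;
my @order;
foreach my $elem (split /\s+/, $line) {
  ++$seen{$elem};                           # count every occurrence
  push @order, $elem if $seen{$elem} == 2;  # remember a value once, when it first repeats
}

foreach my $elem (@order) {
  print "$elem " x $seen{$elem};            # print each duplicate as often as it occurred
}
print "\n";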

To remove duplicates instead, you can now print the keys of %seen:

print "$_ " for keys %seen;

But that doesn't retain the order. You can do something similar to what I did for printing out the dupes only, or use a module like Tie::Hash::Indexed (thanks, daxim) or Tie::IxHash.
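
As a rough sketch of the "something similar" idea, the same %seen/@order pattern can keep first-seen order while dropping duplicates: push an element the first time it is counted rather than the second. The sub name unique_in_order is made up for illustration and is not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: return the whitespace-separated fields of a line
# with duplicates removed, keeping each value where it first appeared.
sub unique_in_order {
    my ($line) = @_;
    my (%seen, @order);
    foreach my $elem (split ' ', $line) {
        # Record the element only on its first occurrence.
        push @order, $elem if ++$seen{$elem} == 1;
    }
    return @order;
}

print join(" ", unique_in_order("212 212 43 43 5689 6689 5689 71 81")), "\n";
# prints: 212 43 5689 6689 71 81
```

This avoids the extra module, at the cost of keeping a second array alongside the hash.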

Teaching an old dog a new trick: promote `Tie::Hash::Indexed` over `Tie::IxHash`.
Hi, thanks for the help :) I modified it a bit, and the final code is below (hopefully someone else will benefit from it too):

#!/usr/bin/perl
#
open (MYFILE, "FILENAME");
foreach $line (<MYFILE>) {
  chomp $line;
  my %seen;
  my @order;
  foreach my $elem (split /\s+/, $line) {
    ++$seen{$elem};
    push @order, $elem if $seen{$elem} == 2;
  }
  foreach my $elem (@order) {
    print "$elem " x $seen{$elem};
  }
  print "\n";
}
close (MYFILE);

Thank you all once again.

For the first part

$ cat prog.pl
#! /usr/bin/perl -lp

my %seen;
$_ = join " " => map $seen{$_}++ ? () : $_ => split;

$ echo 212 212 43 43 5689 6689 5689 71 81 | ./prog.pl
212 43 5689 6689 71 81
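
For readers less familiar with the -l and -p switches, the program above behaves roughly like the explicit loop sketched below. The helper name dedupe_line and the hard-coded sample lines are assumptions for illustration, and grep is used here in place of the map/ternary form; both select the same elements.

```perl
use strict;
use warnings;

# Hypothetical helper: what the body of the -lp program does to one line.
sub dedupe_line {
    my ($line) = @_;
    my %seen;   # declared inside the loop body, so it starts empty for every line
    # split ' ' stands in for the bare split: whitespace-separated fields.
    # grep keeps a field only if it has not been seen earlier on this line.
    return join " ", grep { !$seen{$_}++ } split ' ', $line;
}

# -p wraps the body in a read-and-print loop over the input lines;
# -l strips the newline on input and adds it back on output.
for my $line ("212 212 43 43 5689 6689 5689 71 81",
              "66 66 67 68 69 69 69 71 71 52") {
    print dedupe_line($line), "\n";
}
# prints:
# 212 43 5689 6689 71 81
# 66 67 68 69 71 52
```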

For the second part

$ cat prog.pl
#! /usr/bin/perl -lp

my %dups;
my @nums = split;
++$dups{$_} for @nums;

$_ = join " " => grep $dups{$_} > 1 => @nums;

$ cat input
212 212 43 43 5689 6689 5689 71 81
66 66 67 68 69 69 69 71 71 52

$ ./prog.pl input
212 212 43 43 5689 5689
66 66 69 69 69 71 71
Greg Bacon