I've already learnt how to remove duplicates in Perl using the following code:

my %seen = ();
my @unique = grep { ! $seen{ $_}++ } @array;

But what if I want to merge lines that share the same key (here, the author's name) into one line? Is there a similarly simple way to do that directly?

For example, part of the input file looks like this:

Anais Nin   :  People living deeply have no fear of death.
Pascal      :  Wisdome sends us back to our childhood.
Nietzsche   :  No one lies so boldly as the man who is indignant. 
Camus       :  Stupidity has a knack of getting its way. 
Plato      :  A good decision is based on knowledge and not on numbers. 
Anais Nin   :  We don't see things as they are, we see them as we are. 
Erich Fromm     :  Creativity requires the courage to let go of certainties. 
M. Scott Peck   :  Share our similarities, celebrate our differences.
Freud    :  The ego is not master in its own house. 
Camus    :  You cannot create experience. You must undergo it. 
Stendhal    :  Pleasure is often spoiled by describing it. 

The desired output looks like this:

Anais Nin   :  People living deeply have no fear of death. We don't see things as they are, we see them as we are. 
Pascal      :  Wisdome sends us back to our childhood.
Nietzsche   :  No one lies so boldly as the man who is indignant. 
Camus       :  Stupidity has a knack of getting its way.  You cannot create experience. You must undergo it. 
Plato      :  A good decision is based on knowledge and not on numbers. 
Erich Fromm     :  Creativity requires the courage to let go of certainties. 
M. Scott Peck   :  Share our similarities, celebrate our differences.
Freud    :  The ego is not master in its own house. 
Stendhal    :  Pleasure is often spoiled by describing it. 

Thanks, as always, for any guidance!

+7  A: 

This is a very simple application of regular expressions and hashes. I put your data into a file called "merge.txt". The script prints the merged result to standard output, sorted by name.

#! perl
use warnings;
use strict;
open my $input, "<", "merge.txt" or die $!;
my %name2quotes;
while (my $line = <$input>) {
    if ($line =~ /(.*?)\s*:\s*(.*?)\s*$/) {
        my $name = $1;
        my $quote = $2;
        if ($name2quotes{$name}) {
            $name2quotes{$name} .= " " . $quote;
        } else {
            $name2quotes{$name} = $quote;
        }
    } # You might want to put an "else" here to check for errors.
}
close $input or die $!;
for my $name (sort keys %name2quotes) {
    print "$name : $name2quotes{$name}\n";
}
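
The script above prints the names in sorted order, while the desired output in the question keeps each name where it first appeared. A minimal sketch of one way to keep that order, assuming the same line format; `merge_quotes` and the `@order` array are hypothetical names introduced here, not part of the answer:

```perl
use strict;
use warnings;

# Hypothetical helper: merges "name : quote" lines and keeps the
# names in the order in which they were first seen.
sub merge_quotes {
    my @lines = @_;
    my (%quotes, @order);
    for my $line (@lines) {
        # Same pattern as in the script above, anchored at the start.
        next unless $line =~ /^(.*?)\s*:\s*(.*?)\s*$/;
        my ($name, $quote) = ($1, $2);
        # Remember a name only the first time it shows up.
        push @order, $name unless exists $quotes{$name};
        $quotes{$name} = exists $quotes{$name} ? "$quotes{$name} $quote" : $quote;
    }
    return map { "$_ : $quotes{$_}" } @order;
}

print "$_\n" for merge_quotes(
    "Anais Nin   :  People living deeply have no fear of death.",
    "Pascal      :  Wisdome sends us back to our childhood.",
    "Anais Nin   :  We don't see things as they are, we see them as we are. ",
);
```

The extra array is needed because a Perl hash does not remember insertion order.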
Kinopiko
Tested okay! For me, it is not simple at all. Thanks for the lesson :)
Mike
You might also want to add an `else` after the `if` to check if there was an error parsing the line.
Kinopiko
+2  A: 
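while (<>) {
    # Split on the colon (and the trailing newline). Note that a colon
    # inside a quote would also split the line here.
    ($F1,$F2) = split(/[:\n]/, $_);
    # This strips ALL whitespace from the name, including the spaces
    # inside it, so "Erich Fromm" becomes "ErichFromm".
    $F1 =~ s/[[:space:]]+//g;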
    if (!(defined $a{$F1})) {
        $a{$F1} = $F2;
    }
    else {
        $a{$F1} = "$a{$F1} $F2";
    }
}
foreach $i (keys %a) {
    print $i, $a{$i} . "\n";
}

output

 $ perl test.pl file
    Freud  The ego is not master in its own house.
    ErichFromm  Creativity requires the courage to let go of certainties.
    Camus  Stupidity has a knack of getting its way.    You cannot create experience. You must undergo it.
    M.ScottPeck  Share our similarities, celebrate our differences.
    Plato  A good decision is based on knowledge and not on numbers.
    Pascal  Wisdome sends us back to our childhood.
    Nietzsche  No one lies so boldly as the man who is indignant.
    AnaisNin  People living deeply have no fear of death.   We don't see things as they are, we see them as we are.
    Stendhal  Pleasure is often spoiled by describing it.
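
In the output above, multi-word names lose their inner spaces ("ErichFromm") because the substitution removes all whitespace from the name. A small sketch of a possible alternative that trims only the ends and splits at the first colon only; `parse_line` is a hypothetical helper, not part of the answer:

```perl
use strict;
use warnings;

# Hypothetical helper: splits one line into (name, quote) at the first colon.
sub parse_line {
    my ($line) = @_;
    # Limit 2: any later colon stays inside the quote.
    my ($name, $quote) = split /:/, $line, 2;
    return unless defined $quote;   # no colon found, nothing to parse
    for ($name, $quote) {
        s/^\s+//;   # trim leading whitespace
        s/\s+$//;   # trim trailing whitespace, including the newline
    }
    return ($name, $quote);
}

my ($name, $quote) = parse_line("M. Scott Peck   :  Share our similarities, celebrate our differences.\n");
print "[$name] [$quote]\n";
```

The brackets in the `print` are only there to make the trimmed boundaries visible.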
ghostdog74
@ghostdog74, this also works. Thanks for sharing the code :) I'm not sure, but the line "$FS = ':';" does not seem to be useful.
Mike
+3  A: 

You can concatenate the quotations without testing for the existence of the hash element. Perl will create (auto-vivify) the hash element if it doesn't exist yet, and `.=` on an undefined value does not trigger a warning.

my %lib;
for (<DATA>){
    chomp;
    my ($au, $qu) = split /\s+:\s+/, $_, 2;
    $lib{$au} .= ' ' . $qu;
}

print $_, " : ", $lib{$_}, "\n" for sort keys %lib;

__DATA__
# Not shown.
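
One side effect of the code above is that every stored value starts with a space. A possible variation, sketched here with a few sample lines taken from the question in place of the `DATA` section, collects the quotes in a hash of arrays and joins them only when printing:

```perl
use strict;
use warnings;

my %lib;
my @lines = (
    "Camus : Stupidity has a knack of getting its way.",
    "Plato : A good decision is based on knowledge and not on numbers.",
    "Camus : You cannot create experience. You must undergo it.",
);
for my $line (@lines) {
    my ($au, $qu) = split /\s+:\s+/, $line, 2;
    # push creates the array reference the first time an author is seen.
    push @{ $lib{$au} }, $qu;
}
# join puts the separator only between quotes, so no value starts with a space.
print "$_ : ", join(' ', @{ $lib{$_} }), "\n" for sort keys %lib;
```

Keeping the quotes as a list also makes it easy to switch to a different separator later.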
FM
Wow, this code is really impressive. Thanks for sharing, FM!
Mike
+1  A: 

I've just browsed through other Perl-related threads on SO and found that Schwern's answer to a question titled "How do I load a file into a Perl hash?" actually solves my problem. It looks like different people may phrase the same question quite differently.

With a few modifications and some added code to print the hash, I came up with the following working code:

#!perl
use warnings;
use autodie;
use strict;

open my $quotes,'<','c:/quotes.txt';
my %hash;
while (<$quotes>)
{
   chomp;
   my ($au, $qu) = split /\s+:\s+/, $_, 2;
   # Prepend a space only when appending to an existing entry.
   $hash{$au} .= exists $hash{$au} ? " $qu" : $qu;

}
print map { "$_ : $hash{$_}\n" } keys %hash;
Mike