This simple Perl script translates stories from a database into XML, but one section is giving me problems. The function makeUrl is called for each story and needs to ensure that duplicate URLs aren't created.

    my @headlines = ();
    my $hlCount = 1;
    .
    .
    .

    sub makeUrl {
      my $headline;
      open( URLSOUT, '>>/var/mtkoan/harris/urls' );

      $url = $_[0];
      print URLSOUT "Before: $url\n";
      $url =~ s/\x{90}//g;
      $url =~ s/\s+$//g;
      $url =~ s/^\s+//g;
      $url =~ s/\s/_/g;
      $url =~ s/\W//g;

      push @headlines, $url;
      foreach $headline (@headlines) {
        if( $headline eq $url ) {
          $url .= "_$hlCount";
          $hlCount++;
        }
      }

      print URLSOUT "After: $url\n\n";
      print URLSOUT "Headline Array Dump:\n";
      print URLSOUT "@headlines\n";
      close URLSOUT;
    }

When the array is printed, only the last value is shown. Push doesn't seem to be appending to the end of the array, and I can't figure out why!

A: 

You can check for uniqueness (and remove duplicates from a list) in two main ways:

With a hash:

    my %urls;
    # construct your URL in the function...
    $urls{$url}++;

    # get all the (unique) URLs:
    my @urls = keys %urls;
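
As a rough sketch of how the hash approach might fit the makeUrl function from the question: the hash can act as a "seen" table, and a numeric suffix is appended only when the cleaned-up headline has already been used. The names `make_url` and `%seen` and the `_N` suffix scheme are assumptions for illustration, loosely modelled on the question's code, not a definitive implementation.

```perl
use strict;
use warnings;

my %seen;    # URLs that have already been handed out

# Hypothetical replacement for makeUrl: returns a unique URL per call.
sub make_url {
    my ($headline) = @_;

    # Same clean-up steps as in the question.
    my $base = $headline;
    $base =~ s/\x{90}//g;       # drop the stray control character
    $base =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
    $base =~ s/\s/_/g;          # whitespace becomes underscores
    $base =~ s/\W//g;           # remove anything that is not a word character

    # Append _1, _2, ... until the candidate has not been used before.
    my $url = $base;
    my $n   = 1;
    $url = $base . '_' . $n++ while $seen{$url};

    $seen{$url} = 1;
    return $url;
}

print make_url('Big News Today'),    "\n";    # Big_News_Today
print make_url(' Big News Today! '), "\n";    # Big_News_Today_1
```

The hash lookup replaces the loop over the array, so each call only needs to probe the candidates it actually generates.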

With a library call that returns the unique values in a list (see List::MoreUtils):

    use List::MoreUtils 'uniq';
    @urls = uniq @urls;
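
A small illustration of what `uniq` does to a list, in case it helps. This sketch imports `uniq` from the core module List::Util instead (an assumption: it needs List::Util 1.45 or later), so that it runs without installing anything; the sample URLs are made up. Unlike `keys %urls`, the result keeps the order in which the values first appeared.

```perl
use strict;
use warnings;
use List::Util 'uniq';    # assumption: List::Util 1.45+; swapped in for List::MoreUtils

# Made-up sample data with two repeated entries.
my @urls   = qw(Big_News Weather Big_News Sports Weather);
my @unique = uniq @urls;

print "@unique\n";    # Big_News Weather Sports
```

Note that `uniq` drops the repeated entries rather than renaming them, so on its own it would not give two stories with the same headline distinct URLs.
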
Ether