I am a novice Perl programmer and would like some help. I have an array whose elements I am trying to split on the pipe character into two scalars. From there I would like to pick out only the lines that have ‘PJ RER Apts to Share’ as the first element. Then I want to print each distinct second element only once, along with a count of how many times it appears. I wrote the code below but can’t figure out where I am going wrong. It might be something small that I am overlooking. Any help would be greatly appreciated.

## CODE ##

my @data = ('PJ RER Apts to Share|PROVIDENCE',  
        'PJ RER Apts to Share|JOHNSTON',  
        'PJ RER Apts to Share|JOHNSTON',  
        'PJ RER Apts to Share|JOHNSTON',  
        'PJ RER Condo|WEST WARWICK',  
        'PJ RER Condo|WARWICK');  

foreach my $line (@data) {  
    $count = @data;  
    chomp($line);  
    @fields = split(/\|/,$line);  
    if (($fields[0] =~ /PJ RER Apts to Share/g)){  
        @array2 = $fields[1];  
        my %seen;  
        my @uniq = grep { ! $seen{$_}++ } @array2;  
        my $count2 = scalar(@uniq);  
        print "$array2[0] ($count2)","\n"  
    }  
}  
print "$count","\n";  

## OUTPUT ##

PROVIDENCE (1)  
JOHNSTON (1)  
JOHNSTON (1)  
JOHNSTON (1)  
6  
+3  A: 

This is very crude, but I'd use Perl's awesome hashes to help with this task. I'd use the entire record as the hash key and increment its value.

foreach (@array) {
   $myHash{$_}++;
}

When it's done, cycle through your hash and you'll have every record, unique or duplicated, along with its count.
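
A minimal sketch of that counting-and-cycling idea, assuming the records live in an array named `@array` as in the snippet above. The sample records are borrowed from the question, and `@report` and the sorted output order are my own additions for illustration:

```perl
use strict;
use warnings;

# Sample records, borrowed from the question for illustration
my @array = (
    'PJ RER Apts to Share|PROVIDENCE',
    'PJ RER Apts to Share|JOHNSTON',
    'PJ RER Apts to Share|JOHNSTON',
    'PJ RER Condo|WARWICK',
);

# Use each whole record as a hash key and increment its counter
my %myHash;
$myHash{$_}++ foreach @array;

# Cycle through the hash; sorting the keys gives a stable order
my @report = map { "$_ ($myHash{$_})" } sort keys %myHash;
print "$_\n" for @report;
```

Each distinct record is reported once, followed by the number of times it occurred.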

Like I said, this is very crude and I'm sure there are many issues with the approach. All ye Perl gods, fire away.

gurun8
That's not crude at all, it is the correct and idiomatic way to do it.
friedo
+2  A: 

I used the following script:

my %elements = ( );

foreach (@data) {
   chomp;
   my ($f0, $f1) = split(/\|/);
   $elements{ $f0 }{ $f1 }++;
}

while ( my ( $k, $v ) = each( %elements ) )
{
   print "Key [$k] :\n";
   while ( my ( $field2, $count ) = each( %$v ) )
   {
      print "  Field [$field2] appeared $count times\n";
   }
}

And it yielded:

Key [PJ RER Condo] :
  Field [WARWICK] appeared 1 times
  Field [WEST WARWICK] appeared 1 times
Key [PJ RER Apts to Share] :
  Field [JOHNSTON] appeared 3 times
  Field [PROVIDENCE] appeared 1 times

Is this what you were looking for?

Phil
Does this sort the fields or does it spit back the data in a random order?
Luke
`each` hands back pairs in no particular order. If you want it sorted, loop over `sort keys %$v` instead of calling `each` in the last `while` loop. Hope it helps!
Phil
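
A sketch of sorted output for the nested hash. The counts here are built by hand to match the output shown in the answer, and `@lines` is a hypothetical helper array introduced only for this example:

```perl
use strict;
use warnings;

# Nested counts in the same shape the script above builds
my %elements = (
    'PJ RER Apts to Share' => { JOHNSTON => 3, PROVIDENCE => 1 },
    'PJ RER Condo'         => { WARWICK => 1, 'WEST WARWICK' => 1 },
);

my @lines;
foreach my $k ( sort keys %elements ) {
    push @lines, "Key [$k] :";
    my $v = $elements{$k};

    # sort keys yields the field names in alphabetical order
    foreach my $field2 ( sort keys %$v ) {
        push @lines, "  Field [$field2] appeared $v->{$field2} times";
    }
}
print "$_\n" for @lines;
```

Iterating over `sort keys` rather than `each` trades a little memory for a predictable order at both levels.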
+3  A: 

You can use the `uniq` function in List::MoreUtils to remove duplicate entries from a list. The number of elements in an array can be found by evaluating it in scalar context:

use strict; use warnings;
use List::MoreUtils 'uniq';
my @list = qw(1 1 2 3 5 8);

my @uniq = uniq @list;
print 'list with dupes removed: ', join(', ', @uniq), "\n";
print 'number of elements in this list: ', scalar(@uniq), "\n";

Output:

list with dupes removed: 1, 2, 3, 5, 8
number of elements in this list: 5
Ether
A: 

Accumulate the number of occurrences per city in a hash. The key is the city name and the value is the count. Then sort the keys and print each one with its count:

my @data = ('PJ RER Apts to Share|PROVIDENCE',  
    'PJ RER Apts to Share|JOHNSTON',  
    'PJ RER Apts to Share|JOHNSTON',  
    'PJ RER Apts to Share|JOHNSTON',  
    'PJ RER Condo|WEST WARWICK',  
    'PJ RER Condo|WARWICK');  

my %apts;
foreach my $line (@data) {
    chomp($line);
    my @fields = split(/\|/, $line);
    if ($fields[0] eq "PJ RER Apts to Share") {
        # Lowercase the city, then capitalize its first letter
        my $city = "\u\L$fields[1]";
        $apts{$city}++;
    }
}

# Print each city and its count, sorted by city name
print map { "$_ $apts{$_}\n" } sort keys %apts;
my $count = @data;    # total number of listings
print "$count\n";

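Also, did you want a count of all listings or just of those that match? If it is the latter, change the next-to-last line to the one below. Note that this counts the distinct cities among the matches, not the matching listings themselves: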

$count = keys %apts;
HerbN
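
A sketch of both counts side by side, assuming a `%apts` hash filled as in the loop above. The hash is hand-built here, and the names `$cities` and `$listings` are mine, not the answer's:

```perl
use strict;
use warnings;
use List::Util 'sum';

# City => number of matching listings, as the loop above would produce
my %apts = ( Providence => 1, Johnston => 3 );

# keys in scalar context gives the number of distinct cities
my $cities = keys %apts;

# Summing the values gives the number of matching listings;
# the leading 0 keeps the result defined for an empty hash
my $listings = sum( 0, values %apts );

print "$cities cities, $listings matching listings\n";
```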