+2  Q: 

Array of hashes

Hi, in Perl I have an array of hashes like this:

0  HASH(0x98335e0)
   'title' => 1177
   'author' => 'ABC'
   'quantity' => '-100'


1  HASH(0x832a9f0)
   'title' => 1177
   'author' => 'ABC'
   'quantity' => '100'

2  HASH(0x98335e0)
   'title' => 1127
   'author' => 'DEF'
   'quantity' => '5100'


3  HASH(0x832a9f0)
   'title' => 1277
   'author' => 'XYZ'
   'quantity' => '1030'

Now I need to accumulate the quantity wherever title and author are the same. In the structure above, the two hashes with title = 1177 and author = 'ABC' should be merged into one with the summed quantity, so that the entire structure looks like this:

0  HASH(0x98335e0)
   'title' => 1177
   'author' => 'ABC'
   'quantity' => 0

1  HASH(0x98335e0)
   'title' => 1127
   'author' => 'DEF'
   'quantity' => '5100'

2  HASH(0x832a9f0)
   'title' => 1277
   'author' => 'XYZ'
   'quantity' => '1030'

What is the best way to do this accumulation efficiently? The number of array elements can be very large. I don't mind adding an extra key to the hashes to help with this, but I don't want n lookups per element. Kindly advise.

+4  A: 
# Sum the quantities per author/title pair in a single pass over @a.
my %sum;
for (@a) {
  $sum{ $_->{author} }{ $_->{title} } += $_->{quantity};
}

# Flatten the nested hash back into an array of hashrefs.
my @accumulated;
foreach my $author (keys %sum) {
  foreach my $title (keys %{ $sum{$author} }) {
    push @accumulated => { title    => $title,
                           author   => $author,
                           quantity => $sum{$author}{$title},
                         };
  }
}

Not sure whether map makes it look nicer:

my @accumulated =
  map {
    my $author = $_;
    # The leading + marks the braces as an anonymous hash, not a block.
    map +{ author   => $author,
           title    => $_,
           quantity => $sum{$author}{$_},
         },
      keys %{ $sum{$author} };
  }
  keys %sum;
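
Both versions above return the records in whatever order keys happens to yield, so the original array order is lost. If that order matters (the expected output in the question keeps it), one option is a single pass that remembers the first hashref seen for each title/author pair. This is only a minimal sketch: the names accumulate and %first_seen and the sample data are made up for illustration, and it assumes the separator $; never occurs in a title or author.

```perl
use strict;
use warnings;

# Hypothetical helper: merges records that share title and author,
# keeping each merged record at the position of its first occurrence.
sub accumulate {
    my @records = @_;
    my (%first_seen, @accumulated);
    for my $rec (@records) {
        # Composite key; $; ("\034" by default) is assumed absent from the data.
        my $key = join($;, $rec->{title}, $rec->{author});
        if (my $hit = $first_seen{$key}) {
            $hit->{quantity} += $rec->{quantity};   # one hash lookup per element
        }
        else {
            my $copy = { %$rec };                   # shallow copy, input stays untouched
            $first_seen{$key} = $copy;
            push @accumulated, $copy;
        }
    }
    return @accumulated;
}

# Sample data taken from the question.
my @a = (
    { title => 1177, author => 'ABC', quantity => '-100' },
    { title => 1177, author => 'ABC', quantity => '100'  },
    { title => 1127, author => 'DEF', quantity => '5100' },
    { title => 1277, author => 'XYZ', quantity => '1030' },
);

my @result = accumulate(@a);
printf "%s %s %s\n", @{$_}{qw(title author quantity)} for @result;
```

This still does exactly one hash lookup per array element, so the total work grows linearly with the size of the array.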
Greg Bacon
This sample is just itching for some map/grep love
Daenyth
@Daenyth Usually yes, but it doesn't look so nice in this case.
Greg Bacon
+1  A: 
Axeman
A: 

I think it is important to step back and consider the source of the data. If the data are coming from a database, then you should write the SQL query so that it gives you one row for each author/title combination with the total quantity in the quantity field. If you are reading the data from a file, then you should either read it directly into a hash or use Tie::IxHash if order is important.
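
As a rough sketch of the "read it directly into a hash" idea: the file format here (one tab-separated title, author, quantity record per line) is purely an assumption for illustration, and the in-memory filehandle stands in for whatever file the data really comes from.

```perl
use strict;
use warnings;

# Assumed input format: "title<TAB>author<TAB>quantity", one record per line.
my $data = "1177\tABC\t-100\n"
         . "1177\tABC\t100\n"
         . "1127\tDEF\t5100\n"
         . "1277\tXYZ\t1030\n";

# In-memory filehandle used in place of a real file.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my %total;    # $total{$title}{$author} holds the running quantity
while (my $line = <$fh>) {
    chomp $line;
    my ($title, $author, $quantity) = split /\t/, $line;
    $total{$title}{$author} += $quantity;   # accumulate while reading
}
close $fh;

# Print the totals, sorted by title and then author.
for my $title (sort { $a <=> $b } keys %total) {
    for my $author (sort keys %{ $total{$title} }) {
        print "$title\t$author\t$total{$title}{$author}\n";
    }
}
```

With this approach the array of hashrefs never has to exist, so there is no second pass over the data.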

Once you have the data in an array of hashrefs like you do, you will have to create an auxiliary data structure and do a whole bunch of lookups. The cost of those lookups may well dominate the running time of your program (which hardly matters if it runs for 15 minutes once a day), and you might run into memory issues.

Sinan Ünür