I have an array of hashes, all with the same set of keys, e.g.:

my $aoa= [
 {NAME=>'Dave', AGE=>12, SEX=>'M', ID=>123456, NATIONALITY=>'Swedish'},
 {NAME=>'Susan', AGE=>36, SEX=>'F', ID=>543210, NATIONALITY=>'Swedish'},
 {NAME=>'Bart', AGE=>120, SEX=>'M', ID=>987654, NATIONALITY=>'British'},
];

I would like to write a subroutine that will convert this into a hash of hashes using a given key hierarchy:

my $key_hierarchy_a = ['SEX', 'NATIONALITY'];
sub aoh_to_hoh {
 my ($aoa, $key_hierarchy_a) = @_;
 ...
}

will return

{M=>
  {Swedish=>{{NAME=>'Dave', AGE=>12, ID=>123456}},
   British=>{{NAME=>'Bart', AGE=>120, ID=>987654}}}, 
 F=>
  {Swedish=>{{NAME=>'Susan', AGE=>36,  ID=>543210}}}
}

Note that this not only creates the correct key hierarchy but also removes the now-redundant keys.

I'm getting stuck at the point where I need to create the new, innermost hash in its correct hierarchical location.

The problem is that I don't know the "depth" (i.e. the number of keys) in advance. If I had a constant number, I could do something like:

$h{$inner_hash->{$PRIMARY_KEY}}{$inner_hash->{$SECONDARY_KEY}}{...} = filter_copy($inner_hash,[$PRIMARY_KEY,$SECONDARY_KEY])

So perhaps I can write a loop that adds one level at a time, removes that key from the hash, and then adds the remaining hash at the "current" location. But that's a bit cumbersome, and I'm also not sure how to keep track of a "location" in a hash of hashes...

+6  A: 
use Data::Dumper;

my $aoa= [
 {NAME=>'Dave', AGE=>12, SEX=>'M', ID=>123456, NATIONALITY=>'Swedish'},
 {NAME=>'Susan', AGE=>36, SEX=>'F', ID=>543210, NATIONALITY=>'Swedish'},
 {NAME=>'Bart', AGE=>120, SEX=>'M', ID=>987654, NATIONALITY=>'British'},
];

sub aoh_to_hoh {
  my ($aoa, $key_hierarchy_a) = @_;
  my $result = {};
  my $last_key = $key_hierarchy_a->[-1];
  foreach my $orig_element (@$aoa) {
    my $cur = $result;
    # shallow copy, so that deleting keys below leaves the caller's data intact
    my %element = %$orig_element;
    foreach my $key (@$key_hierarchy_a) {
      my $value = delete $element{$key};
      if ($key eq $last_key) {
        $cur->{$value} ||= [];
        push @{$cur->{$value}}, \%element;
      } else {
        $cur->{$value} ||= {};
        $cur = $cur->{$value};
      }
    }
  }
  return $result;
}

my $key_hierarchy_a = ['SEX', 'NATIONALITY'];
print Dumper(aoh_to_hoh($aoa, $key_hierarchy_a));

As per @FM's comment, you really want an extra array level in there, since several records can share the same values for all the keys in the hierarchy.

The output:

$VAR1 = {
          'F' => {
                   'Swedish' => [
                                  {
                                    'ID' => 543210,
                                    'NAME' => 'Susan',
                                    'AGE' => 36
                                  }
                                ]
                 },
          'M' => {
                   'British' => [
                                  {
                                    'ID' => 987654,
                                    'NAME' => 'Bart',
                                    'AGE' => 120
                                  }
                                ],
                   'Swedish' => [
                                  {
                                    'ID' => 123456,
                                    'NAME' => 'Dave',
                                    'AGE' => 12
                                  }
                                ]
                 }
        };

EDIT: Oh, BTW - if anyone knows how to elegantly clone contents of a reference, please teach. Thanks!

EDIT EDIT: @FM helped. All better now :D

Amadan
Doh, I'm an idiot. I'll go fix it now... :)
Amadan
Storable::dclone can be used for generically copying the contents of a deep data structure.
Ether
@Ether: Thanks for that!
Amadan
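A minimal sketch of the difference between the shallow copy used in the answer and the Storable::dclone suggestion from the comments. The record with a nested TAGS array is hypothetical, added only to make the difference visible:

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Hypothetical record with a nested value (not part of the original data).
my $orig = { NAME => 'Dave', TAGS => ['a', 'b'] };

my %shallow = %$orig;          # copies top-level keys only; the TAGS array is shared
my $deep    = dclone($orig);   # recursively copies the whole structure

delete $shallow{NAME};             # does not touch $orig: the top level was copied
push @{ $shallow{TAGS} }, 'c';     # also visible through $orig->{TAGS}
push @{ $deep->{TAGS} },  'd';     # affects only the deep copy

print "orig NAME: $orig->{NAME}\n";
print "orig TAGS: @{ $orig->{TAGS} }\n";
print "deep TAGS: @{ $deep->{TAGS} }\n";
```

For the flat records in the question the shallow copy is enough, because only top-level keys are deleted. dclone would matter if the records held nested references that get modified later.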
+2  A: 

As you've experienced, writing code to create hash structures of arbitrary depth is a bit tricky. And the code to access such structures is equally tricky. Which makes one wonder: Do you really want to do this?
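To illustrate the access side, here is a minimal sketch of a lookup that follows a list of key values down a nested hash of unknown depth. The helper name hoh_lookup is hypothetical, and the data is a hand-written subset of the structure from the question:

```perl
use strict;
use warnings;

# Hypothetical helper: walk down a nested hash, one key value per level.
# Returns nothing (undef in scalar context) as soon as a level is missing,
# and checks with exists() first so that no intermediate hashes are autovivified.
sub hoh_lookup {
    my ($node, @path) = @_;
    for my $value (@path) {
        return unless ref($node) eq 'HASH' && exists $node->{$value};
        $node = $node->{$value};
    }
    return $node;
}

my $hoh = {
    M => { Swedish => [ { NAME => 'Dave',  AGE => 12, ID => 123456 } ] },
    F => { Swedish => [ { NAME => 'Susan', AGE => 36, ID => 543210 } ] },
};

my $hits = hoh_lookup($hoh, 'F', 'Swedish');
print "$_->{NAME}\n" for @{ $hits || [] };

my $none = hoh_lookup($hoh, 'M', 'British');   # undef: no such branch in this subset
```

The caller still has to know how many levels there are and in which order, which is part of what makes this kind of structure awkward to work with.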

A simpler approach might be to put the original information in a database. As long as the keys you care about are indexed, the DB engine will be able to retrieve rows of interest very quickly: Give me all persons where SEX = female and NATIONALITY = Swedish. Now that sounds promising!
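As a rough sketch of what that could look like, assuming the DBI and DBD::SQLite modules are installed. The table name persons and the index name are made up for illustration, and an in-memory database is used only to keep the example self-contained:

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; ':memory:' avoids creating a file.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '', { RaiseError => 1 });

# Hypothetical table mirroring the keys of the hashes in the question.
$dbh->do('CREATE TABLE persons (NAME TEXT, AGE INTEGER, SEX TEXT, ID INTEGER, NATIONALITY TEXT)');
# Index the columns that will be used for lookups.
$dbh->do('CREATE INDEX idx_sex_nationality ON persons (SEX, NATIONALITY)');

my $insert = $dbh->prepare(
    'INSERT INTO persons (NAME, AGE, SEX, ID, NATIONALITY) VALUES (?, ?, ?, ?, ?)'
);
$insert->execute(@$_) for (
    [ 'Dave',  12,  'M', 123456, 'Swedish' ],
    [ 'Susan', 36,  'F', 543210, 'Swedish' ],
    [ 'Bart',  120, 'M', 987654, 'British' ],
);

# "Give me all persons where SEX = female and NATIONALITY = Swedish."
# Slice => {} asks DBI to return each row as a hash reference.
my $rows = $dbh->selectall_arrayref(
    'SELECT NAME, AGE, ID FROM persons WHERE SEX = ? AND NATIONALITY = ?',
    { Slice => {} },
    'F', 'Swedish',
);
print "$_->{NAME} ($_->{AGE})\n" for @$rows;
```

Changing the key hierarchy then only means changing the WHERE clause, rather than rebuilding a nested structure.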

You might also find this loosely related question of interest.

FM
Perhaps you're right. I should take a look into databases in Perl sometime.
David B