views:

7733

answers:

8

If you have a hash (or reference to a hash) in perl with many dimensions and you want to iterate across all values, what's the best way to do it. In other words, if we have $f->{$x}{$y}, I want something like

foreach ($x, $y) (deep_keys %{$f})
{
}

instead of

foreach $x (keys %f) 
    {
    foreach $y (keys %{$f->{$x}) 
    {
    }
}
+1  A: 

It's easy enough if all you want to do is operate on values, but if you want to operate on keys, you need specifications of how levels will be recoverable.

a. For instance, you could specify keys as "$level1_key.$level2_key.$level3_key"--or any separator, representing the levels.

b. Or you could have a list of keys.

I recommend the latter.

  • Level can be understood by @$key_stack

  • and the most local key is $key_stack->[-1].

  • The path can be reconstructed by: join( '.', @$key\_stack )

Code:

use constant EMPTY_ARRAY => [];
use strict;    
use Scalar::Util qw<reftype>;

sub deep_keys (\%) { 
    sub deeper_keys { 
        my ( $key_ref, $hash_ref ) = @_;
        return [ $key_ref, $hash_ref ] if reftype( $hash_ref ) ne 'HASH';
        my @results;

        while ( my ( $key, $value ) = each %$hash_ref ) { 
            my $k = [ @{ $key_ref || EMPTY_ARRAY }, $key ];
            push @results, deeper_keys( $k, $value );
        }
        return @results;
    }

    return deeper_keys( undef, shift );
}

foreach my $kv_pair ( deep_keys %$f ) { 
    my ( $key_stack, $value ) = @_;
    ...
}

This has been tested in Perl 5.10.

Axeman
+1  A: 

Keep in mind that Perl lists and hashes do not have dimensions and so cannot be multidimensional. What you can have is a hash item that is set to reference another hash or list. This can be used to create fake multidimensional structures.

Once you realize this, things become easy. For example:

sub f($) {
  my $x = shift;
  if( ref $x eq 'HASH' ) {
    foreach( values %$x ) {
      f($_);
    }
  } elsif( ref $x eq 'ARRAY' ) {
    foreach( @$x ) {
      f($_);
    }
  }
}

Add whatever else needs to be done besides traversing the structure, of course.

One nifty way to do what you need is to pass a code reference to be called from inside f. By using sub prototyping you could even make the calls look like Perl's grep and map functions.

Zan Lynx
+6  A: 

Here's an option. This works for arbitrarily deep hashes:

sub deep_keys_foreach
{
    my ($hashref, $code, $args) = @_;

    while (my ($k, $v) = each(%$hashref)) {
        my @newargs = defined($args) ? @$args : ();
        push(@newargs, $k);
        if (ref($v) eq 'HASH') {
            deep_keys_foreach($v, $code, \@newargs);
        }
        else {
            $code->(@newargs);
        }
    }
}

deep_keys_foreach($f, sub {
    my ($k1, $k2) = @_;
    print "inside deep_keys, k1=$k1, k2=$k2\n";
});
bmdhacks
I love this! Thanks for the concise and workable answer that works out of the box.
David Nehme
A: 

If you are working with tree data going more than two levels deep, and you find yourself wanting to walk that tree, you should first consider that you are going to make a lot of extra work for yourself if you plan on reimplementing everything you need to do manually on hashes of hashes of hashes when there are a lot of good alternatives available (search CPAN for "Tree").

Not knowing what your data requirements actually are, I'm going to blindly point you at a tutorial for Tree::DAG_Node to get you started.

That said, Axeman is correct, a hashwalk is most easily done with recursion. Here's an example to get you started if you feel you absolutely must solve your problem with hashes of hashes of hashes:

#!/usr/bin/perl
use strict;
use warnings;

my %hash = (
    "toplevel-1" => 
    { 
        "sublevel1a"  => "value-1a",
        "sublevel1b"  => "value-1b"
    },
    "toplevel-2" =>
    {
        "sublevel1c" => 
        {
            "value-1c.1" => "replacement-1c.1",
            "value-1c.2" => "replacement-1c.2"
        },
        "sublevel1d" => "value-1d"
    }
);

hashwalk( \%hash );

sub hashwalk
{
    my ($element) = @_;
    if( ref($element) =~ /HASH/ )
    {
        foreach my $key (keys %$element)
        {
            print $key," => \n";
            hashwalk($$element{$key});
        }
    }
    else
    {
        print $element,"\n";
    }
}

It will output:

toplevel-2 => 
sublevel1d => 
value-1d
sublevel1c => 
value-1c.2 => 
replacement-1c.2
value-1c.1 => 
replacement-1c.1
toplevel-1 => 
sublevel1a => 
value-1a
sublevel1b => 
value-1b

Note that you CAN NOT predict in what order the hash elements will be traversed unless you tie the hash via Tie::IxHash or similar — again, if you're going to go through that much work, I recommend a tree module.

Zed
+1  A: 

There's no way to get the semantics you describe because foreach iterates over a list one element at a time. You'd have to have deep_keys return a LoL (list of lists) instead. Even that doesn't work in the general case of an arbitrary data structure. There could be varying levels of sub-hashes, some of the levels could be ARRAY refs, etc.

The Perlish way of doing this would be to write a function that can walk an arbitrary data structure and apply a callback at each "leaf" (that is, non-reference value). bmdhacks' answer is a starting point. The exact function would vary depending one what you wanted to do at each level. It's pretty straightforward if all you care about is the leaf values. Things get more complicated if you care about the keys, indices, etc. that got you to the leaf.

Michael Carman
+2  A: 

You can also fudge multi-dimensional arrays if you always have all of the key values, or you just don't need to access the individual levels as separate arrays:

$arr{"foo",1} = "one";
$arr{"bar",2} = "two";

while(($key, $value) = each(%arr))
{
    @keyValues = split($;, $key);
    print "key = [", join(",", @keyValues), "] : value = [", $value, "]\n";
}

This uses the subscript separator "$;" as the separator for multiple values in the key.

Greg Cottman
+10  A: 

Stage one: don't reinvent the wheel :)

A quick search on CPAN throws up the incredibly useful Data::Walk. Define a subroutine to process each node, and you're sorted

use Data::Walk;

my $data = { # some complex hash/array mess };

sub process {
   print "current node $_\n";
}

walk \&process, $data;

And Bob's your uncle. Note that if you want to pass it a hash to walk, you'll need to pass a reference to it (see perldoc perlref), as follows (otherwise it'll try and process your hash keys as well!):

walk \&process, \%hash;

For a more comprehensive solution (but harder to find at first glance in CPAN), use Data::Visitor::Callback or its parent module - this has the advantage of giving you finer control of what you do, and (just for extra street cred) is written using Moose.

Penfold
Axeman
You can filter the list of things to process with the preprocess => sub {} option arg - see the docs for more details.
Penfold
+5  A: 

This sounds to me as if Data::Diver or Data::Visitor are good approaches for you.

Corion