ansaurus

Question

How can I cleanly turn a nested Perl hash into a non-nested one?

Answer 1

+3 A:

First, I would use perl -c to make sure it compiles cleanly, which it does not. So, I'd add a trailing } to make it compile.

Then, I'd run it through perltidy to improve the code layout (indentation, etc.).

Then, I'd run perlcritic (in "harsh" mode) to automatically tell me what it thinks are bad practices. It complains that:

Subroutine does not end with "return"

Update: the OP essentially changed every line of code after I posted my Answer above, but I believe it still applies. It's not easy shooting at a moving target :)

toolic 2010-03-22 21:02:23

Answer 2

+9 A:

Your method is not best practices because it doesn't scale. What if the nested hash is six, ten levels deep? The repetition should tell you that a recursive routine is probably what you need.

sub flatten {
    my ($in, $out) = @_;
    for my $key (keys %$in) {
        my $value = $in->{$key};
        if ( defined $value && ref $value eq 'HASH' ) {
            flatten($value, $out);
        }
        else {
            $out->{$key} = $value;
        }
    }
}

Alternatively, good modern Perl style is to use CPAN wherever possible. Data::Traverse would do what you need:

use Data::Traverse;
sub flatten {
    my %hash = @_;
    my %flattened;
    traverse { $flattened{$a} = $b } \%hash;
    return %flattened;
}

As a final note, it is usually more efficient to pass hashes by reference to avoid them being expanded out into lists and then turned into hashes again.

rjh 2010-03-22 21:09:03

I'm a big fan of Data::Traverse.

friedo 2010-03-22 21:36:16

No suprise here, you are its principal author ;)

willert 2010-03-22 21:54:29

-1 your first block of code contains a number of typos, `$values` is never declared and you have mixed hash and reference to hash syntax

Eric Strom 2010-03-22 21:55:40

Data::Traverse was exactly what I was looking for. Thanks a lot!

knorv 2010-03-22 21:55:43

@Eric - Thanks, I've fixed those. Sometimes I rush writing code that explains the general idea of something, as I don't expect it to be used wholesale without some analysis first. But I was definitely sloppy, thanks for pointing it out.

rjh 2010-03-22 23:06:17

+1 for the general idea behind the second example. But this won't work: `my %hash = shift`. I assume the intent was to have the function receive a hash ref rather than a hash.

FM 2010-03-22 23:54:24

@FM should be `my %hash = @_;`. Obviously, my brain was not functioning that evening.

rjh 2010-03-23 09:57:19

Answer 3

+1 A:

Is it your intent to end up with a copy of the original hash or just a reordered result?

Your code starts with one hash (the original hash that is used by reference) and makes two copies %i and %hash.

The statement my %i=%{hashref} is not necessary. You are copying the entire hash to a new hash. In either case (whether you want a copy of not) you can use references to the original hash.

You are also losing data if your hash in the hash has the same value as the parent hash. Is this intended?

drewk 2010-03-22 21:11:06

Answer 4

+1 A:

There are a few problems with your approach that you need to figure out. First off, what happens in the event that there are two leaf nodes with the same key? Does the second clobber the first, is the second ignored, should the output contain a list of them? Here is one approach. First we construct a flat list of key value pairs using a recursive function to deal with other hash depths:

my %data = (
    foo  => {bar  => {baz  => 'hello'}},
    fizz => {buzz => {bing => 'world'}},
    fad  => {bad  => {baz  => 'clobber'}},
);


sub flatten {
    my $hash = shift;
    map {
        my  $value = $$hash{$_};
        ref $value eq 'HASH' 
            ? flatten($value) 
            : ($_ =>  $value)
    } keys %$hash
}

print join( ", " => flatten \%data), "\n";
# baz, clobber, bing, world, baz, hello

my %flat = flatten \%data;

print join( ", " => %flat ), "\n";
# baz, hello, bing, world          # lost (baz => clobber)

A fix could be something like this, which will create a hash of array refs containing all the values:

sub merge {
    my %out;
    while (@_) {
        my ($key, $value) = splice @_, 0, 2;
        push @{ $out{$key} }, $value
    }
    %out
}

my %better_flat = merge flatten \%data;

In production code, it would be faster to pass references between the functions, but I have omitted that here for clarity.

Eric Strom 2010-03-22 21:50:52

ansaurus

tags:

views:

answers:

How can I cleanly turn a nested Perl hash into a non-nested one?

related questions