Let's say I have the following Perl hash:

%hash = (
    'A' => {
        'B' => ['C', 'D', 'E'],
        'F' => { 'G' => [], 'H' => [] },
        'I' => []
    }
);

and I'd like to get rid of the empty []s to get the hash below:

%hash = (
    'A' => [
        'B' => ['C', 'D', 'E'],
        'F' => [ 'G', 'H', 'I' ]
    ]
);

(I hope I got my {} and [] balanced; my apologies if not.) Essentially, I'd like to make sure that no empty array refs remain. I'm sure this is possible, and probably simple, but I'm not sure whether delete() will work, or whether there's a better method or a Perl module for this. Can someone steer me in the right direction?

+4  A: 

It looks like your data might be nested arbitrarily, and you want to walk through it recursively, rewriting certain patterns into others. For that, I'd use Data::Visitor.

use Data::Visitor::Callback;
use List::MoreUtils 'all';

my $visitor = Data::Visitor::Callback->new(
    hash => sub {
        my ($self, $href) = @_;

        # fold hashrefs with only empty arrayrefs as values into arrayrefs
        if (all { ref $_ eq 'ARRAY' && !@{ $_ } } values %{ $href }) {
            return [ keys %{ $href } ];
        }

        # strip k/v pairs with an empty arrayref as a value
        return {
            map {
                $_ => $href->{$_}
            } grep {
                ref $href->{$_} ne 'ARRAY' || @{ $href->{$_} }
            } keys %{ $href }
        };
    },
);

my %new_hash = %{ $visitor->visit(\%hash) };

This just illustrates the basic approach I'd use, and it happens to work for the example input you gave. It might need various tweaks depending on what you want to do in the corner cases pointed out in the other comments.
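If pulling in a module is not an option, roughly the same two rules can be sketched in plain Perl with a recursive function. This is only a sketch under some assumptions: the names fold_empty and is_empty_array are made up for illustration, the function returns a rewritten copy of the hashes instead of modifying the input, it only descends into hash refs (not into arrays), and it sorts the keys so the folded list has a stable order. Like the callback above, it does not move the I key into the list for F.

```perl
use strict;
use warnings;

# Hypothetical helper: true for an array ref with no elements.
sub is_empty_array {
    my ($value) = @_;
    return ref($value) eq 'ARRAY' && !@$value;
}

# Hypothetical helper: returns a rewritten copy of a nested hash.
sub fold_empty {
    my ($data) = @_;
    return $data unless ref($data) eq 'HASH';

    my @keys = sort keys %$data;

    # Rule 1: a non-empty hash whose values are all empty array refs
    # is folded into an array ref of its keys.
    my @kept = grep { !is_empty_array( $data->{$_} ) } @keys;
    return [@keys] if @keys && !@kept;

    # Rule 2: otherwise drop the keys holding empty array refs
    # and recurse into whatever is left.
    my %out;
    $out{$_} = fold_empty( $data->{$_} ) for @kept;
    return \%out;
}

my %hash = (
    'A' => {
        'B' => ['C', 'D', 'E'],
        'F' => { 'G' => [], 'H' => [] },
        'I' => [],
    },
);

my $folded = fold_empty( \%hash );
# $folded->{A}{F} is now ['G', 'H'], and A no longer has an 'I' key.
```

Returning a copy keeps the original %hash around for comparison; if that is not needed, assign the result back with %hash = %{ fold_empty(\%hash) }.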

rafl
Thanks rafl! This was exactly what I needed!
Nick
+1  A: 

[This should be a comment, but I need the formatting.]

Your question is puzzling. (1) By what principle does the I key (from the original hash) end up inside the list for the F key (in the expected hash)? (2) And what should happen if F were to contain stuff besides the empty array refs (see my addition to the original hash)?

my %hash_orig = (
    'A' => {
        'B' => ['C', 'D', 'E'],
        'F' => {
            'G' => [],
            'H' => [],
            'Z' => ['FOO', 'BAR'],  # Not in the OP's original.
        },
        'I' => [],
    },
);

my %hash_expected = (
    'A' => [
        'B' => ['C', 'D', 'E'],
        'F' => [ 'G', 'H', 'I'],    # Where should the Z info go?
    ],
);
FM
+1  A: 

Walking a hash (tree, whatever) is a technique that any programmer should know. rafl uses a visitor module, but in some cases I think the cure is almost worse than the disease.

Is your expected output what you intended? It seems different than what you said in the text, as FM says. I use his hash in my example.

It's pretty easy if you use a queue. You start with the top-level hash. Every time you run into a hash ref, you add it to the queue. When you run into an array ref, you check that it has values and delete that key if it doesn't. Everything else you leave alone:

#!perl
use strict;
use warnings;
use 5.010;

my %hash = ( # From FM
    'A' => {
        'B' => ['C', 'D', 'E'],
        'F' => {
            'G' => [],
            'H' => [],
            'Z' => ['FOO', 'BAR'],  # Not in the OP's original.
        },
        'I' => [],
    },
);

my @queue = ( \%hash );

while( my $ref = shift @queue ) {
    next unless ref $ref eq ref {};

    KEY: foreach my $key ( keys %$ref ) {
        if( ref $ref->{$key} eq ref {} ) {
            # Nested hash: queue it so it gets walked later.
            push @queue, $ref->{$key};
            next KEY;
        }
        elsif( ref $ref->{$key} eq ref [] ) {
            # Array ref: delete the key only if the array is empty.
            delete $ref->{$key} if @{$ref->{$key}} == 0;
        }
    }
}

use Data::Dumper;
print Dumper( \%hash );

My output is:

$VAR1 = {
          'A' => {
                   'F' => {
                            'Z' => [
                                     'FOO',
                                     'BAR'
                                   ]
                          },
                   'B' => [
                            'C',
                            'D',
                            'E'
                          ]
                 }
        };

That output sounds more like what you are asking for, rather than the reorganization that you specify. Can you clarify the output?
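For comparison, the same pruning can also be written recursively instead of with a queue. This is a sketch of an alternative, not a replacement for the loop above; the name prune_empty_arrays is made up for illustration, and like the queue version it modifies the hash in place and leaves anything that is not a hash ref or an empty array ref alone.

```perl
use strict;
use warnings;

# Hypothetical helper: deletes keys holding empty array refs, in place.
sub prune_empty_arrays {
    my ($ref) = @_;
    return unless ref($ref) eq 'HASH';

    # keys() builds the whole key list up front, so deleting
    # entries inside the loop is safe.
    for my $key ( keys %$ref ) {
        my $value = $ref->{$key};
        if ( ref($value) eq 'HASH' ) {
            prune_empty_arrays($value);    # descend into nested hashes
        }
        elsif ( ref($value) eq 'ARRAY' && !@$value ) {
            delete $ref->{$key};           # drop the empty array ref
        }
    }
    return;
}

my %hash = (    # From FM
    'A' => {
        'B' => ['C', 'D', 'E'],
        'F' => {
            'G' => [],
            'H' => [],
            'Z' => ['FOO', 'BAR'],
        },
        'I' => [],
    },
);

prune_empty_arrays( \%hash );
# A now holds only the keys B and F, and F holds only Z.
```

The queue avoids deep recursion on very deeply nested data; the recursive form is shorter and is easier to extend if something should happen after a sub-hash has been pruned.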

brian d foy
+1. That line `my @queue=(\%hash)` was very clever. Learned something...
drewk
sorry, it didn't preserve G, H, and I under A and F, but thanks for your answer
Nick
Can you expound on the rules for preserving those elements? It's not hard to do, but I always worry about working from a solitary example.
brian d foy