ansaurus

Question

What's the best Perl practice for returning hashes from functions?

Answer 1

A:

Uh... "passing hashes can only be done by reference"?

sub foo(%) {
    my %hash = @_;
    do_stuff_with(%hash);
}

my %hash = (a => 1, b => 2);
foo(%hash);

What am I missing?

I would say that if the issue is that you need to have multiple outputs from a function, it's better as a general practice to output a data structure, probably a hash, that holds everything you need to send out rather than taking modifiable references as arguments.
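As a rough sketch of that idea (the `summarize` sub and its keys are made up for illustration, not taken from the question), a function with several outputs can bundle them into one returned hash reference:

```perl
use strict;
use warnings;

# Hypothetical example: one call produces several results, bundled
# into a single hash reference instead of being written into
# references supplied by the caller.
sub summarize {
    my @numbers = @_;
    my $total = 0;
    $total += $_ for @numbers;
    return {
        count => scalar @numbers,
        total => $total,
        items => [@numbers],    # copy, so the caller owns this array
    };
}

my $summary = summarize(3, 4, 5);
print "count=$summary->{count} total=$summary->{total}\n";   # prints "count=3 total=12"
```

The caller picks out what it needs by name, and adding another output later does not change the call signature.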

chaos 2009-07-07 21:23:43

I'm pretty sure when you call it like this, it eats any arguments after the hash and adds them to the hash itself.

Eric 2009-07-07 21:25:27

How does this answer the question? It may be a comment, not the answer.

Igor Krivokon 2009-07-07 21:25:33

Prototypes are bad practice anyway.

Dan Littlejohn 2009-07-07 21:27:11

@Eric: Nope, the `(%)` prototype gives it special powers. If you changed it to `sub foo(\%\@)`, you could call `foo(%hash, @array)` and pull out a hashref and arrayref within the subroutine.
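A minimal sketch of the `\%\@` form, assuming a made-up sub name `show_types`; it only demonstrates that the sub receives one hashref and one arrayref:

```perl
use strict;
use warnings;

# The prototype must be visible before the call is compiled,
# so the sub is defined ahead of its use.
sub show_types(\%\@) {
    my ($hash_ref, $array_ref) = @_;
    return (ref $hash_ref, ref $array_ref);
}

my %hash  = (a => 1, b => 2);
my @array = (10, 20, 30);

# Called with a plain hash and array; Perl passes references.
my @types = show_types(%hash, @array);
print "@types\n";    # prints "HASH ARRAY"
```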

ephemient 2009-07-07 21:27:55

No, they're not.

jrockway 2009-07-07 21:28:07

Answer 2

+14 A:

Just return the reference. There is no need to dereference the whole hash like you are doing in your examples:

my $result = some_function_that_returns_a_hashref;
say "Foo is ", $result->{foo};
say $_, " => ", $result->{$_} for keys %$result;

etc.

I have never seen anyone pass in empty references to hold the result. This is Perl, not C.

jrockway 2009-07-07 21:26:44

Answer 3

+4 A:

The first one is better:

my ($ref_array,$ref_hash) = $this->getData('input');

The reasons are:

in the second case, getData() needs to check the data structures to make sure they are empty
you have freedom to return undef as a special value
it looks more Perl-idiomatic.

Note: the lines

@array = @{$ref_array};
%hash = %{$ref_hash};

are questionable, since they shallow-copy the whole data structures. You can use the references anywhere you need the array or hash, using the -> operator for convenience.
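For illustration, here is a small sketch of working through the references directly. The `get_data` sub is a hypothetical stand-in for `getData()`, whose real return values are not shown in this thread:

```perl
use strict;
use warnings;

# Hypothetical stand-in for getData(): returns an array ref and a hash ref.
sub get_data {
    return ([1, 2, 3], { a => 'x', b => 'y' });
}

my ($ref_array, $ref_hash) = get_data();

my $second = $ref_array->[1];         # element access through the ref
my $value  = $ref_hash->{a};          # key lookup through the ref
my $count  = scalar @{$ref_array};    # number of elements
my @keys   = sort keys %{$ref_hash};  # iterate keys without copying the hash

print "$second $value $count @keys\n";    # prints "2 x 3 a b"
```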

Igor Krivokon 2009-07-07 21:31:20

The reason I have done the dereferencing in the past is to make it more readable so that you don't have to try and figure out what kind of ref it is returning.

Dan Littlejohn 2009-07-07 21:41:56

In that case, I recommend you do `local (*array, *hash) = $this->getData('input')` instead. That way, you avoid the copy and get the `@array` and `%hash` variables as aliases to the returned refs. But beware, you _will_ lose the sigil on member access anyway, because you'll get `$array[1]` and `$hash{a}`. _I_ would stick with `$array_ref->[1]` and `$hash_ref->{a}`...

Massa 2009-07-08 00:20:04

Answer 4

+8 A:

Trying to create copies by saying

my %hash = %{$ref_hash};

is even more dangerous than using the hashref, because it only creates a shallow copy. That can lead you to think it is okay to modify the hash, but if it contains references, changes made through them still modify the original data structure. I find it better to just pass references and be careful, but if you really want to make sure you have an independent copy of the data structure passed in, you can say:

use Storable qw/dclone/;

my %hash = %{dclone $ref_hash};
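To make the difference concrete, here is a small sketch with made-up data that modifies a nested array through a shallow copy and then through a `dclone` copy:

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Illustrative data: one plain scalar and one nested array reference.
my $ref_hash = { name => 'orig', tags => ['a', 'b'] };

# Shallow copy: top-level values are copied, nested refs are shared.
my %shallow = %{$ref_hash};
$shallow{name} = 'changed';            # does not touch the original
push @{ $shallow{tags} }, 'c';         # DOES touch the original's array

my $after_shallow = scalar @{ $ref_hash->{tags} };    # 3

# Deep copy: nested structures are duplicated as well.
my %deep = %{ dclone($ref_hash) };
push @{ $deep{tags} }, 'd';            # only the copy grows

my $after_deep = scalar @{ $ref_hash->{tags} };       # still 3

print "$ref_hash->{name} $after_shallow $after_deep\n";   # prints "orig 3 3"
```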

Chas. Owens 2009-07-07 21:33:33

Answer 5

+1 A:

My personal preference for sub interfaces:

If the routine has 0-3 arguments, they may be passed in list form: foo( 'a', 12, [1,2,3] );
Otherwise pass a list of name value pairs. foo( one => 'a', two => 12, three => [1,2,3] );
If the routine has, or may later have, more than one argument, seriously consider using name/value pairs.

Passing in references increases the risk of inadvertent data modification.

On returns I generally prefer to return a list of results rather than an array or hash reference.

I return hash or array refs when it will make a noticeable improvement in speed or memory consumption (i.e. BIG structures), or when a complex data structure is involved.

Returning references when not needed deprives one of the ability to take advantage of Perl's nice list handling features and exposes one to the dangers of inadvertent modification of data.

In particular, I find it useful to assign a list of results into an array and return the array, which provides the contextual return behaviors of an array to my subs.

For the case of passing in two hashes I would do something like:

my $foo = foo( hash1 => \%hash1, hash2 => \%hash2 ); # gets number of items returned
my @foo = foo( hash1 => \%hash1, hash2 => \%hash2 ); # gets items returned

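sub foo {
   my %arg = @_;    # named arguments: hash1 => \%hash1, hash2 => \%hash2
   my @results;

   # do stuff that fills @results

   return @results; # list in list context, item count in scalar context
}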
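
A self-contained sketch of the same pattern, assuming a made-up sub `evens` and argument name `numbers`, shows the two calling contexts side by side:

```perl
use strict;
use warnings;

# Hypothetical sub taking name/value pairs and returning an array.
sub evens {
    my %arg = @_;
    my @results = grep { $_ % 2 == 0 } @{ $arg{numbers} };
    return @results;    # array: elements in list context, count in scalar context
}

my @list  = evens(numbers => [1, 2, 3, 4, 6]);    # (2, 4, 6)
my $count = evens(numbers => [1, 2, 3, 4, 6]);    # 3

print "@list / $count\n";    # prints "2 4 6 / 3"
```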

daotoad 2009-07-07 22:13:42

Answer 6

+3 A:

If it's getting complicated enough that both the callsite and the called function are paying for it (because you have to think/write more every time you use it), why not just use an object?

my $results = $this->getData('input');

$results->key_value_thingies;
$results->listy_thingies;

If making an object is "too complicated" then start using Moose so that it no longer is.
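
As a dependency-free sketch of what such a result object might look like, the following uses a plain blessed hash rather than Moose. The package name `My::Results` and the constructor arguments are assumptions for illustration; only the two accessor names come from the snippet above:

```perl
use strict;
use warnings;

package My::Results;

# Constructor: stores a hash ref and an array ref (both optional).
sub new {
    my ($class, %args) = @_;
    my $self = {
        hash => $args{hash} || {},
        list => $args{list} || [],
    };
    return bless $self, $class;
}

# Accessors return flattened copies, so callers cannot
# alter the top level of the stored data by accident.
sub key_value_thingies { my ($self) = @_; return %{ $self->{hash} } }
sub listy_thingies     { my ($self) = @_; return @{ $self->{list} } }

package main;

my $results = My::Results->new(
    hash => { a => 1 },
    list => [10, 20],
);

my %kv   = $results->key_value_thingies;
my @list = $results->listy_thingies;

print "$kv{a} @list\n";    # prints "1 10 20"
```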

nothingmuch 2009-07-08 21:53:47

Definitely good advice.

jrockway 2009-07-08 21:59:11

Meh, objects in Perl have always been kind of half-assed. The idea is sound, but trying to use them on a large dataset where speed matters does not work.

Dan Littlejohn 2009-07-11 05:24:34

If you really think that is slow, use C++ instead. Perl is slow for many other reasons; instantiating an object is not one of them.

nothingmuch 2009-07-13 22:55:47

Answer 7

A:

I originally posted this to another question, and then someone pointed to this as a "related post", so I'll post it here too for my take on the subject, assuming people will encounter it in the future.

I'm going to contradict the Accepted Answer and say that I prefer to have my data returned as a plain hash (well, as an even-sized list which is likely to be interpreted as a hash). I work in an environment where we tend to do things like the following code snippet, and it's much easier to combine and sort and slice and dice when you don't have to dereference every other line. (It's also nice to know that someone can't damage your hashref because you passed the entire thing by value -- though someone pointed out that if your hash contains more than simple scalars it's not so simple.)

my %filtered_config_slice = 
   hashgrep { $a !~ /^apparent_/ && defined $b } (
   map { $_->build_config_slice(%some_params, some_other => 'param') } 
   ($self->partial_config_strategies, $other_config_strategy)
);

This approximates something that my code might do: building a configuration for an object based on various configuration strategy objects (some of which the object knows about inherently, plus some extra guy) and then filters out some of them as irrelevant.

(Yes, we have nice tools like hashgrep and hashmap and lkeys that do useful things to hashes. $a and $b get set to the key and the value of each item in the list, respectively). (Yes, we have people who can program at this level. Hiring is obnoxious, but we have a quality product.)

If you don't intend to do anything resembling functional programming like this, or if you need more performance (have you profiled?) then sure, use hashrefs.
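
For readers without an in-house `hashgrep`, `pairgrep` from the core List::Util module behaves much the same way: it walks an even-sized list with `$a` set to each key and `$b` to each value. The sketch below uses it as a stand-in, with made-up config data; it is not the author's own tool:

```perl
use strict;
use warnings;
use List::Util qw(pairgrep);

# Illustrative config returned as a plain hash.
my %config = (
    apparent_size => 10,
    real_size     => 12,
    colour        => undef,
    name          => 'widget',
);

# Keep pairs whose key does not start with "apparent_" and whose value is defined.
my %filtered = pairgrep { $a !~ /^apparent_/ && defined $b } %config;

print join(',', sort keys %filtered), "\n";    # prints "name,real_size"
```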

fennec 2010-01-06 04:35:21
