ansaurus

Question

What is better in Perl: an array of hash references or a list of "flat" hashes?

Answer 1

+4 A:

I vastly prefer the former. It keeps one "packet" of data (size, name, volume) together and makes for much more readable code.
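As a rough illustration of why the "packet" approach reads better, here is a minimal sketch. The sample values and the parallel-array layout used for comparison are assumptions for the example, not taken from the question.

```perl
use strict;
use warnings;

# Array of hash references: each record carries its own fields.
my @items = (
    { name => 'Foo', size => 10, volume => 100 },
    { name => 'Bar', size => 5,  volume => 40  },
);

# Sorting moves whole records, so the fields cannot drift apart.
my @by_size       = sort { $a->{size} <=> $b->{size} } @items;
my @names_by_size = map  { $_->{name} } @by_size;

# Parallel arrays (assumed alternative): fields are linked only by index,
# so any reordering has to be applied to every array via an index list.
my @names = ( 'Foo', 'Bar' );
my @sizes = ( 10, 5 );
my @order = sort { $sizes[$a] <=> $sizes[$b] } 0 .. $#sizes;
my @names_by_size_parallel = @names[@order];

print "@names_by_size\n";
print "@names_by_size_parallel\n";
```

Both versions give the same order, but the second one needs the extra index bookkeeping for every field you want to keep in sync.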

Thomas 2009-07-24 19:36:47

Answer 2

+9 A:

Instead of thinking in meaningless terms such as "something", think about and phrase the issue in concrete terms. In this case, you seem to be returning a list of objects that have name, size and volume attributes. When you think of it that way, there is no reason to even consider the second method.

You can think of optimizations later if you run into problems, but even if you do, you would probably gain more from Memoize than by exploding data structures.
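For reference, a minimal sketch of what using Memoize could look like. The volume_for subroutine and its body are hypothetical stand-ins for whatever expensive computation you actually have; only the memoize call itself comes from the module.

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;

# Hypothetical expensive lookup; the body is a stand-in for real work.
sub volume_for {
    my ($name) = @_;
    $calls++;
    return length($name) * 100;
}

# Wrap the subroutine so results are cached per argument list.
memoize('volume_for');

# The second call with the same argument is answered from the cache,
# so the original subroutine body runs only once.
my $first  = volume_for('Foo');
my $second = volume_for('Foo');

print "results: $first $second, real calls: $calls\n";
```

Note that this only helps when the subroutine returns the same result for the same arguments and has no side effects you rely on.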

One efficiency improvement I will recommend is to return a reference from this subroutine:

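sub get_objects {
    my @ret;

    # 'some condition' is a placeholder: substitute a real test, because
    # a non-empty string is always true and the loop would never end.
    while ( 'some condition' ) {
        # Decide here whether this object belongs in the result.
        push @ret, {
            name   => 'Foo',
            size   => 10,
            volume => 100,
        };
    }

    # Return a reference so the caller does not get a copy of the whole list.
    return \@ret;
}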
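On the calling side, the returned reference has to be dereferenced before you loop over it. A minimal sketch, assuming a stand-in get_objects that returns two hard-coded records (the data is made up for the example):

```perl
use strict;
use warnings;

# Stand-in for the subroutine above, with fixed sample data.
sub get_objects {
    my @ret = (
        { name => 'Foo', size => 10, volume => 100 },
        { name => 'Bar', size => 5,  volume => 40  },
    );
    return \@ret;
}

my $objects = get_objects();

# @{$objects} (or @$objects) dereferences the array reference.
for my $obj ( @{$objects} ) {
    printf "%s: size %d, volume %d\n", $obj->{name}, $obj->{size}, $obj->{volume};
}

# Single elements and the element count work through the reference too.
my $first_name = $objects->[0]{name};
my $count      = scalar @{$objects};
```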

Sinan Ünür 2009-07-24 19:38:53

Thank you. I thought the concrete terms would just complicate a quite simple question. By the way, I want to return a list of HTML pages that have a title, link, HTML source and keywords.

Karel Bílek 2009-07-24 19:43:09

Answer 3

+1 A:

Keep your related data together. The only reason to create big parallel arrays is that you are forced to.

If you are concerned about speed and memory usage, you can use constant array indexes to access your named fields:

use constant { SIZE => 0, NAME => 1, VOLUME => 2 };

sub something {
  # ... (elided; $ref and @references are assumed to be declared here)

  $ref->[SIZE]   = 10;
  $ref->[NAME]   = "Foo";
  $ref->[VOLUME] = 100;

  push @references, $ref;

  # ...
  return @references;
}

I've also added some whitespace to make the code easier to read.
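Reading the records back works the same way as writing them. A minimal sketch of the constant-index idea, with made-up sample data:

```perl
use strict;
use warnings;
use constant { SIZE => 0, NAME => 1, VOLUME => 2 };

# Each record is an array reference laid out as [ size, name, volume ].
my @references = (
    [ 10, 'Foo', 100 ],
    [ 5,  'Bar', 40  ],
);

# The constants give the positions readable names.
for my $ref (@references) {
    printf "%s: size %d, volume %d\n", $ref->[NAME], $ref->[SIZE], $ref->[VOLUME];
}

# Aggregate over one field without touching the others.
my $total_volume = 0;
$total_volume += $_->[VOLUME] for @references;
```

The trade-off is that the layout is only a convention: nothing stops code from using a bare number or the wrong constant as an index.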

If I have a lot of parameters with validation rules and/or deep data structures, I tend to look to objects to simplify my code by tying the logic about the data to the data itself. Of course, OOP exacts a speed penalty, but I have only rarely seen that become a problem.

For quick and dirty OOP, I use Class::Struct, which has many flaws. For situations where I need type checking, I use Moose or Mouse (when memory or startup speed is a big concern).
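A minimal sketch of the quick and dirty Class::Struct route. The class name Item and the field values are hypothetical, chosen only to match the name, size and volume attributes discussed here:

```perl
use strict;
use warnings;
use Class::Struct;

# Declares a class named Item (hypothetical name) with three scalar fields.
struct( Item => { name => '$', size => '$', volume => '$' } );

my $item = Item->new( name => 'Foo', size => 10, volume => 100 );

# Each field gets an accessor: no argument reads, one argument writes.
$item->size(12);

printf "%s: size %d, volume %d\n", $item->name, $item->size, $item->volume;
```

There is no type checking or validation here, which is where Moose or Mouse would come in.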

daotoad 2009-07-24 22:52:37

Answer 4

A:

Both ways might be useful for different problems. If you are always going to access all of the information together, just keep it together. For instance, in your case you want to track the name, title, and size of a web page. You're probably working with all three of those things at the same time, so keep them together as an array of hash references.

Other times, you might break data into different things that you use separately and want to look up independently of the other columns. In those cases, separate hashes might make sense.
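As a minimal sketch of the separate-hashes case, assuming the name is unique and can serve as the key (the hash names and values are made up for the example):

```perl
use strict;
use warnings;

# One hash per attribute, all keyed by name (assumed to be unique).
my %size_of   = ( Foo => 10,  Bar => 5  );
my %volume_of = ( Foo => 100, Bar => 40 );

# Looking up one attribute never touches the others.
my $foo_size = $size_of{Foo};

# Each attribute can also be processed on its own.
my @large = grep { $volume_of{$_} > 50 } sort keys %volume_of;

print "size of Foo: $foo_size, large volumes: @large\n";
```

This is convenient when different parts of the program care about different columns, but you have to keep the key sets in sync yourself.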

brian d foy 2009-07-25 19:33:16
