I would like to properly understand hashes in Perl. I've had to use Perl intermittently for quite some time, and whenever I do, it's mostly for text processing.

And every time I have to deal with hashes, things get messed up. I find the syntax for hashes very cryptic.

A good explanation of hashes and hash references, their differences, when they are required, etc. would be much appreciated.

+6  A: 

A hash is a basic data type in Perl. It uses keys to access its contents.

A hash ref is short for a reference to a hash. References are scalars, that is, simple values: a hash ref is a scalar value that essentially contains a pointer to the actual hash.
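To make the distinction concrete, here is a minimal sketch. The names %colour_of and $ref are made up for illustration. The backslash takes a reference, ref reports what kind of thing it points to, and both names reach the same underlying data:

```perl
use strict;
use warnings;

# Hypothetical data, just for illustration.
my %colour_of = (sky => 'blue', grass => 'green');

# Taking a reference yields a single scalar value.
my $ref = \%colour_of;

print ref($ref), "\n";            # prints "HASH"
print $ref->{sky}, "\n";          # prints "blue"

# The reference points at the same hash, not at a copy.
$ref->{sky} = 'grey';
print $colour_of{sky}, "\n";      # prints "grey"

# %$ref dereferences the whole hash, e.g. to count its keys.
print scalar(keys %$ref), "\n";   # prints 2
```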

Link: difference between hash and hash ref in perl - Ubuntu Forums

The syntax for deleting an entry also differs. For hashes:

delete $hash{$key};

and for hash references:

delete $hash_ref->{$key};
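A short sketch of both forms side by side, assuming a made-up %stock hash. Deleting through the reference removes the entry from the same hash, and delete returns the value it removed:

```perl
use strict;
use warnings;

my %stock = (apples => 3, pears => 0);   # hypothetical data
my $stock_ref = \%stock;

# delete returns the value that was removed.
my $removed = delete $stock{apples};
print "removed $removed\n";              # prints "removed 3"

# Deleting through the reference affects the same hash.
delete $stock_ref->{pears};
my $status = exists $stock{pears} ? 'still there' : 'gone';
print "pears: $status\n";                # prints "pears: gone"
print scalar(keys %stock), "\n";         # prints 0
```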

The Perl Hash Howto is a great resource for understanding hashes versus hash references.

There is also another link here that has more information on Perl and references.

0A0D
+6  A: 

See perldoc perlreftut, which is also available from your own computer's command line.

Sinan Ünür
+6  A: 

The following demonstrates how you can use a hash and a hash reference:

my %hash = (
    toy    => 'aeroplane',
    colour => 'blue',
);
print "I have an ", $hash{toy}, " which is coloured ", $hash{colour}, "\n";

my $hashref = \%hash;
print "I have an ", $hashref->{toy}, " which is coloured ", $hashref->{colour}, "\n";

Also see perldoc perldsc.

Alan Haggai Alavi
Personally, I found it confusing to see the term `%hash` to describe a hash. It might be a good idea to relabel your `%hash` as `%favorite` just to drive the point home. The print statement would then be something like `print "My favorite toy is the $favorite{toy}";`
Zaid
+11  A: 

A simple hash is close to an array. Their initializations even look similar. First the array:

my @last_name = (
  "Ward",   "Cleaver",
  "Fred",   "Flintstone",
  "Archie", "Bunker"
);

Now let's represent the same information with a hash (aka associative array):

my %last_name = (
  "Ward",   "Cleaver",
  "Fred",   "Flintstone",
  "Archie", "Bunker"
);

Although they have the same name, the array @last_name and the hash %last_name are completely independent.

With the array, if we want to know Archie's last name, we have to perform a linear search:

my $lname;
for (my $i = 0; $i < @last_name; $i += 2) {
  $lname = $last_name[$i+1] if $last_name[$i] eq "Archie";
}
print "Archie $lname\n";

With the hash, it's much more direct syntactically:

print "Archie $last_name{Archie}\n";

Say we want to represent information with only slightly richer structure:

  • Cleaver (last name)
    • Ward (first name)
    • June (spouse's first name)
  • Flintstone
    • Fred
    • Wilma
  • Bunker
    • Archie
    • Edith

Before references came along, flat key-value hashes were about the best we could do, but references allow nested structures:

my %personal_info = (
    "Cleaver", {
        "FIRST",  "Ward",
        "SPOUSE", "June",
    },
    "Flintstone", {
        "FIRST",  "Fred",
        "SPOUSE", "Wilma",
    },
    "Bunker", {
        "FIRST",  "Archie",
        "SPOUSE", "Edith",
    },
);

Internally, the keys and values of %personal_info are all scalars, but the values are a special kind of scalar: hash references, created with {}. The references allow us to simulate "multi-dimensional" hashes. For example, we can get to Wilma via

$personal_info{Flintstone}->{SPOUSE}

Note that Perl allows us to omit arrows between subscripts, so the above is equivalent to

$personal_info{Flintstone}{SPOUSE}
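As a sketch of how this nested layout is typically walked, the loop below visits each last name and pulls out the inner hash reference. It uses a cut-down copy of the data, and @lines is just an illustrative variable for collecting the output:

```perl
use strict;
use warnings;

# A cut-down copy of the structure above.
my %personal_info = (
    Cleaver    => { FIRST => 'Ward', SPOUSE => 'June'  },
    Flintstone => { FIRST => 'Fred', SPOUSE => 'Wilma' },
);

my @lines;
for my $last (sort keys %personal_info) {
    my $person = $personal_info{$last};   # each value is a hash reference
    push @lines, "$person->{FIRST} $last is married to $person->{SPOUSE}";
}

print "$_\n" for @lines;
# prints:
#   Ward Cleaver is married to June
#   Fred Flintstone is married to Wilma
```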

That's a lot of typing if you want to know more about Fred, so you might grab a reference as sort of a cursor:

my $fred = $personal_info{Flintstone};
print "Fred's wife is $fred->{SPOUSE}\n";

Because $fred in the snippet above is a hashref, the arrow is necessary. If you leave it out and write $fred{SPOUSE}, Perl looks for an element of a hash named %fred instead. If you wisely enabled use strict to help you catch these sorts of errors, the compiler will complain:

Global symbol "%fred" requires explicit package name at ...

Perl references are similar to pointers in C and C++, but a reference always refers to something: there is no null reference (a scalar that holds no reference is simply undef). Pointers in C and C++ require dereferencing and so do references in Perl.

C and C++ function parameters have pass-by-value semantics: they're just copies, so modifications don't get back to the caller. If you want to see the changes, you have to pass a pointer. You can get this effect with references in Perl:

sub add_barney {
    my($personal_info) = @_;

    $personal_info->{Rubble} = {
        FIRST  => "Barney",
        SPOUSE => "Betty",
    };
}

add_barney \%personal_info;

Without the backslash, the hash would be flattened into a list of keys and values, and any hash add_barney rebuilt from that list would be a copy that's thrown away as soon as the sub returns.
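Here is a rough sketch of that contrast. Both helper subs, add_to_copy and add_via_ref, and the %first_name hash are hypothetical names made up for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: receives a flattened key/value list and
# rebuilds a private copy of the hash.
sub add_to_copy {
    my (%copy) = @_;
    $copy{Rubble} = 'Barney';
    return;
}

# Hypothetical helper: receives a reference, so it edits the caller's hash.
sub add_via_ref {
    my ($names) = @_;
    $names->{Rubble} = 'Barney';
    return;
}

my %first_name = (Flintstone => 'Fred');

add_to_copy(%first_name);
my $after_copy = exists $first_name{Rubble} ? 'changed' : 'unchanged';
print "after add_to_copy: $after_copy\n";   # prints "after add_to_copy: unchanged"

add_via_ref(\%first_name);
my $after_ref = exists $first_name{Rubble} ? 'changed' : 'unchanged';
print "after add_via_ref: $after_ref\n";    # prints "after add_via_ref: changed"
```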

Note also the use of the "fat comma" (=>) above. It autoquotes a bareword on its left and makes hash initializations less syntactically noisy.
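A tiny sketch showing that the two spellings build the same hash. The names %with_commas and %with_arrows are only for illustration:

```perl
use strict;
use warnings;

# Plain commas: every key has to be quoted by hand.
my %with_commas = ('FIRST', 'Fred', 'SPOUSE', 'Wilma');

# Fat commas: the bareword on the left is quoted automatically.
my %with_arrows = (FIRST => 'Fred', SPOUSE => 'Wilma');

for my $key (sort keys %with_arrows) {
    print "$key $with_commas{$key} $with_arrows{$key}\n";
}
# prints:
#   FIRST Fred Fred
#   SPOUSE Wilma Wilma
```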

Greg Bacon