views:

876

answers:

4

I have Perl code similar to the following:

# -- start --

my $res;

# run query to fetch IPv6 resources
while( my $row = $org_ip6_res->fetchrow_arrayref )
{
    if( $row->[4] =~ /PA/ ) {
        $res->{ipv6}{pa}{$row->[2]}++;
    } elsif( $row->[4] eq 'PI' ) {
        $res->{ipv6}{pi}{$row->[2]}++;
    }
}

# -- stop --

At no point is $res ever set prior to iterating over the query results yet the code runs just fine.

When I put print statements before each value I get blanks in both cases but if the print statements come after the increment has been applied I get a value of >= 1 depending on how many IPv6 resources the organization has.

My question is, do I take this to mean an uninitialized hash key in Perl automatically has a value of zero?

Sorry if it comes across as a newbie question but I'm just not familiar with such a construct i.e. $hashref->{foo}->{bar}++ where a value has yet to be explicitly assigned to $hashref->{foo}->{bar}. Thanks in advance!

+13  A: 

The value is not automatically zero. The value is undefined initially. However, if you treat it like a number (e.g., apply ++ to it), then Perl treats it like zero. If you treat it like a string (e.g., apply . to it), then Perl treats it like an empty string.

From perldoc perlsyn, under 'Declarations':

The only things you need to declare in Perl are report formats and subroutines (and sometimes not even subroutines). A variable holds the undefined value ("undef") until it has been assigned a defined value, which is anything other than "undef". When used as a number, "undef" is treated as 0; when used as a string, it is treated as the empty string, ""; and when used as a reference that isn’t being assigned to, it is treated as an error.
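As a rough illustration of the passage above, here is a minimal sketch (the %count and %label hashes and the 'apples' key are made up for the example, not taken from the question) showing a missing hash value starting out as undef and then being treated as 0 or as the empty string depending on the operator:

```perl
use strict;
use warnings;

my %count;    # empty hash: no keys, no values yet

# Looking up a missing key yields undef; it is not 0.
my $before = defined $count{apples} ? 'defined' : 'undef';

# Numeric use: ++ treats the undef as 0, so the value becomes 1.
$count{apples}++;

# String use: .= treats the undef as the empty string.
my %label;
$label{apples} .= 'red';

print "before: $before\n";
print "count:  $count{apples}\n";
print "label:  $label{apples}\n";
```

Both ++ and .= are among the operators that accept undef without complaint, so this should run cleanly even with warnings enabled.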

Telemachus
+3  A: 

It's basically undefined, but treated as if it was zero when you increment it.

The term in Perl parlance is 'autovivified'.

What you probably want to do is use the exists keyword:

$res->{ipv6}{pa}{$row->[2]}++ if exists($res->{ipv6}{pa}{$row->[2]});
Robert S. Barnes
The exists keyword tests to see if the key is in the hash, not whether the value is undef. use Test::More tests => 4; my %h = ( a => undef ); ok( exists $h{a} ); ok( !defined $h{a} ); ok( !exists $h{b} ); ok( !defined $h{b} );
Axeman
I know. The point is that he needs to understand autovivification and the difference between a key that exists but isn't defined and an undefined key and how to deal with each - all of which he'll understand if he follows the link I provided and reads the documentation on the exists keyword.
Robert S. Barnes
+4  A: 

To elaborate on Telemachus' post, the uninitialized values will be undefined. The deep parts of the structure are autovivified. This is a handy feature where data structures are created for you automatically. Autovivification is great when you want it, but it can be a pain when you want to prevent it. There are many tutorials, articles and posts around the net on understanding autovivification.

So given an undefined $ref and $ref->{ipv6}{pa}{'foo'}++, $ref will be assigned a value of:

$ref = { 
     ipv6 => { 
          pa => { 
              foo => undef
          }
     }
};

Then the undef is incremented: since undef numifies to 0, the ++ yields 1, for a final result of $ref->{ipv6}{pa}{'foo'} == 1.
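The steps above can be checked with a short sketch. It reuses $ref and the 'foo' key from the example; the helper variables $type, @top and $value are only there to inspect the result:

```perl
use strict;
use warnings;

my $ref;    # starts out undef, not a hash reference

# Each dereference of an undefined value on the way down creates the
# missing anonymous hash, then ++ turns the final undef into 1.
$ref->{ipv6}{pa}{foo}++;

my $type  = ref $ref;                 # kind of reference $ref now holds
my @top   = keys %$ref;               # keys created at the top level
my $value = $ref->{ipv6}{pa}{foo};    # the incremented leaf value

print "type: $type, top-level keys: @top, value: $value\n";
```

This should report that $ref has become a HASH reference with a single top-level key, ipv6, and a leaf value of 1.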

If you have warnings enabled (you do use warnings;, don't you?), you will get an "uninitialized value" warning when you operate on these undefined values. If incrementing the uninitialized value is the desired behavior, you can turn the relevant group of warnings off over a limited part of your code:

use strict;
use warnings;
my $res;

# run query to fetch IPv6 resources
while( my $row = $org_ip6_res->fetchrow_arrayref )
{   no warnings 'uninitialized';
    if( $row->[4] =~ /PA/ ) {
        $res->{ipv6}{pa}{$row->[2]}++;
    } elsif( $row->[4] eq 'PI' ) {
        $res->{ipv6}{pi}{$row->[2]}++;
    }
}

You can find the warnings hierarchy in perllexwarn (in newer Perl releases it lives in the documentation for the warnings pragma).

daotoad
The ++ and -- operators do not warn about the use of uninitialized values. Instead they silently convert undef to 0.
Michael Carman
Actually I recall seeing the same behaviour daotoad describes, of getting "uninitialised value" warnings on autoincrement. Do either of you know what was the latest Perl version where this happened?
j_random_hacker
@j_random - I just tested with Perl 5.8, and Michael Carman is correct. `perl -w -e '$foo{bar}++'` does not warn re uninitialized values. Neither does `perl -w -e '$foo{bar} += 2'`. `.=`, `-=` and `--` are fine as well. However, the other X= operators generate warnings.
daotoad
Thanks! It might have even been something weird like `$foo{bar}++` worked fine but not `++$foo{bar}`. I wish I could recall what the exact circumstances were...
j_random_hacker
+2  A: 

There's no such thing as an uninitialized hash key. The thing that can be uninitialized is the value for a particular key. A hash value is just a scalar value; it's no different than a variable like $foo.

There are a couple of different Perl features interacting in your example.

Initially $res is undefined (i.e. it has the value undef). When you use an uninitialized value as a hash reference (as in $res->{ipv6}...) Perl "autovivifies" it as one. That is, Perl creates an anonymous hash and replaces the value of undef with a reference to the new hash. This process repeats (silently) each time you use the resulting value as a reference.

Eventually, you autovivify your way to $res->{ipv6}{pa}{$row->[2]}, which is undefined. Remember that this is just a scalar value like $foo. The behavior is the same as saying

my $foo;
$foo++;

Perl does special things when you use undefined values. If you use them as a number, Perl converts them to 0. If you use them as a string, Perl converts them to '' (the empty string). Under most circumstances you'll get a "Use of uninitialized value..." warning if you have warnings enabled (which you should). The auto-increment operator (++) is a special case, though. For convenience, it silently converts the value from undef to 0 before incrementing it.
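To see the difference between the exempt ++ and an ordinary numeric operator, here is a small sketch. It assumes a $SIG{__WARN__} handler is an acceptable way to collect warnings for inspection; the names @warnings, $x, $y and $sum are illustrative only:

```perl
use strict;
use warnings;

my @warnings;
# Collect warnings in an array instead of printing them to STDERR.
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my ($x, $y);    # both start out undef

$x++;                # exempt: undef is silently treated as 0
my $sum = $y + 1;    # not exempt: triggers an uninitialized-value warning

print "x = $x, sum = $sum, warnings caught: ", scalar(@warnings), "\n";
print $warnings[0] if @warnings;
```

Both results come out as 1, but only the addition should leave a "Use of uninitialized value" message in @warnings.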

Michael Carman
Quite right, what I was actually referring to was an uninitialized value for a particular key. :) Thanks also for the explanation.
freakwincy