Here is some simple Perl to count the number of times a value occurs in an array. This runs without any warnings.

use warnings;
use strict;

my @data = qw(1 1 2 3 4 5 5 5 9);
my %histogram;
foreach (@data)
{
    $histogram{$_}++;
}

When the loop body is changed to

$histogram{$_} = $histogram{$_} + 1;

Perl warns "Use of uninitialized value in addition".
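The difference can be reproduced with a minimal sketch that captures warnings through a `$SIG{__WARN__}` handler instead of letting them go to STDERR. The variable names (`%with_increment`, `%with_addition`, and the counters) are made up for illustration and are not part of the original script.

```perl
use strict;
use warnings;

my @data = qw(1 1 2 3 4 5 5 5 9);

# Collect warnings in an array instead of printing them to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Loop body 1: post-increment on a hash slot that starts out undef.
my %with_increment;
$with_increment{$_}++ foreach @data;
my $increment_warnings = scalar @warnings;

# Loop body 2: explicit addition reading the same undef slot.
@warnings = ();
my %with_addition;
$with_addition{$_} = $with_addition{$_} + 1 foreach @data;
my $addition_warnings = scalar @warnings;

print "++ warnings: $increment_warnings\n";
print "+  warnings: $addition_warnings\n";
```

Both hashes end up with the same counts. Only the addition form warns, once for each key the first time it is seen.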

What is going on under the hood? Why is the value initialized when supplied as an operand to the ++ operator and uninitialized with the + operator?

+13  A: 

The + operator evaluates both its left and right operands, then returns their sum. The hash lookup is evaluated like any other operand and gets no special treatment.

The ++ operator has some special magic built in. Quoting from the perlop manpage, regarding the ++ operator:

"undef" is always treated as numeric, and in particular is changed to 0 before incrementing (so that a post-increment of an undef value will return 0 rather than "undef").
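As a small illustration of that quoted behaviour, the sketch below post-increments an undefined scalar and records any warnings. The names `$counter` and `$returned` are illustrative only.

```perl
use strict;
use warnings;

# Collect warnings so we can check that none were issued.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $counter;                  # starts out undef
my $returned = $counter++;    # post-increment yields the old value as 0, not undef

print "returned: ", (defined $returned ? $returned : 'undef'), "\n";
print "counter:  $counter\n";
print "warnings: ", scalar @warnings, "\n";
```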

edit: To elaborate on the difference, ++ changes the value in place, while + just takes its arguments as input. When + sees an undefined value, typically something has gone wrong, but for ++, your hash manipulation example is very typical -- the user wants to treat undef as 0, instead of having to check and initialize every time. So it makes sense to treat these operators differently.

Svante
+7  A: 

It's not that Perl necessarily initializes values, but that it doesn't always warn about them. Don't try to think about a rule for this because you'll always find exceptions, and just when you think you have it figured out, the next version of Perl will change the warnings on you.

In this case, as Harleqin said, the auto-increment operators have a special case.

brian d foy
A: 

As Brian mentioned: it still does it, it just warns you. Warnings tell you about certain manipulations with effects you might not have intended.

You are specifically asking for the value of $histogram{$_}, adding 1 to it and then assigning it to the same slot. It's the same way that I wouldn't expect autovivification to work here:

my $hash_ref = $hash_for{$key_level_1};
$hash_ref->{$key_level_2} = $value;

as it does here:

$hash_for{$key_level_1}{$key_level_2} = $value;
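A runnable sketch of that contrast, with made-up keys and a made-up value ('outer', 'inner', 42) standing in for the placeholders above:

```perl
use strict;
use warnings;

my %hash_for;
my ($key_level_1, $key_level_2, $value) = ('outer', 'inner', 42);

# Two-step version: $hash_ref receives a copy of undef, so the anonymous
# hash that gets autovivified lives in $hash_ref only, not in %hash_for.
my $hash_ref = $hash_for{$key_level_1};
$hash_ref->{$key_level_2} = $value;
my $exists_after_copy = exists $hash_for{$key_level_1} ? 1 : 0;

# One-step version: the hash slot itself is used as a reference in
# lvalue context, so the inner hash is autovivified inside %hash_for.
$hash_for{$key_level_1}{$key_level_2} = $value;
my $exists_after_direct = exists $hash_for{$key_level_1} ? 1 : 0;

print "after two-step version: $exists_after_copy\n";
print "after one-step version: $exists_after_direct\n";
```

In the two-step version the value is stored, but in a hash that %hash_for never learns about.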

Magic probably does not work like optimization. An optimizing compiler would notice that a = a + 1 is the same thing as a++, so if the assembly language had an increment instruction, it could use that instead of pretending that it needed to preserve the first value and then overwriting it because it isn't actually needed.

Optimization costs extra scrutiny and overhead once, in exchange for improved performance on every run. But in a dynamic language there is no guarantee that you aren't adding overhead at the same rate you are trying to reduce it.

Axeman
While warnings do not directly break the build, they should be heeded nevertheless.
Svante
+7  A: 

Certain operators deliberately omit the "uninitialized" warning for your convenience because they are commonly used in situations where a 0 or "" default value for the left or only operand makes sense.

These are: ++ and -- (either pre or post), +=, -=, .=, |=, ^=, &&=, ||=.
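A quick way to check a few of these is to apply them to undefined scalars and count the warnings, then compare with plain addition. This is only a sketch covering a subset of the list, and the variable names are illustrative.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Each of these scalars starts out undef.
my ($inc, $dec, $plus, $minus, $cat, $default);
$inc++;
$dec--;
$plus    += 5;
$minus   -= 2;
$cat     .= 'abc';
$default ||= 'fallback';
my $exempt_warnings = scalar @warnings;

# Plain addition reading an undef operand is not exempt.
@warnings = ();
my $undef;
my $sum = $undef + 5;
my $plain_warnings = scalar @warnings;

print "exempt operators: $exempt_warnings warning(s)\n";
print "plain addition:   $plain_warnings warning(s)\n";
```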

Note that some of these erroneously give the warning when used on a tied variable: see the tests marked TODO in http://perl5.git.perl.org/perl.git/blob/HEAD:/t/op/assignwarn.t.

ysth
