I don't have an answer, but I do have a different way of framing the issue, using simpler and perhaps more realistic regular expressions.
The first two examples behave exactly as I expect: .* consumes the entire string and the regular expression returns a list with only one element. But the third regular expression returns a list with 2 elements.
use strict;
use warnings;
use Data::Dumper;
$_ = "foo";
print Dumper( [ /^(.*)/g ] ); # ('foo') As expected.
print Dumper( [ /.(.*)/g ] ); # ('oo') As expected.
print Dumper( [ /(.*)/g ] ); # ('foo', '') Why?
Many of the answers so far have emphasized that .* will match anything. While true, this response does not go to the heart of the matter, which is this: Why is the regular expression engine still hunting after .* has consumed the entire string? Under other circumstances (such as the first two examples), .* does not throw in an extra empty string for good measure.
Update after the useful comments from Chas. Owens. The first evaluation of any of the three examples results in .* matching the entire string. If we could intervene and call pos() at that moment, the engine would indeed be at the end of the string (at least as we perceive the string; see the comments from Chas. for more insight on this). However, the /g option tells Perl to try to match the entire regex again, starting from that position. That second attempt will fail for examples #1 and #2 (in #1 the ^ anchor cannot match at the end of the string, and in #2 the leading . has no character left to consume), and that failure will cause the engine to stop hunting. However, with regex #3, the engine will get another match: an empty string. Then the /g option tells the engine to try the entire pattern yet again. Now there really is nothing left to match (neither regular characters nor the trailing empty string, which Perl will not match a second time at the same position), so the process stops.
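One way to watch this happen is to run each pattern in scalar context with /g and record pos() after every successful match. This is only a sketch: trace_matches is a hypothetical helper written for this illustration, not something from the question or from Perl itself.

use strict;
use warnings;

# Hypothetical helper: run $re against $str repeatedly with /g and
# record what $1 captured and where pos() sat after each success.
sub trace_matches {
    my ($str, $re) = @_;
    my @trace;
    while ( $str =~ /$re/g ) {
        push @trace, [ $1, pos($str) ];
    }
    return @trace;
}

for my $re ( qr/^(.*)/, qr/.(.*)/, qr/(.*)/ ) {
    print "$re\n";
    printf "  captured '%s', pos() = %d\n", @$_ for trace_matches("foo", $re);
}

For the first two patterns the helper should report a single match ending at position 3. For the third it should report two: 'foo' ending at position 3, then an empty string that also ends at position 3, after which the loop stops.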