Why does this snippet:
'He said "Hello"' =~ /(\w)\1/
match "ll"? I thought that the \w part matches "H", and hence \1 refers to "H", so nothing should be matched. Why do I get this result?
And you're right, nothing was matched at that position. The regex engine then moved further along the string and found a match, which it returned to you. \w, of course, matches any word character, not just 'H'.
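To make the "went further" part concrete, here is a rough Ruby sketch that imitates the scan by hand: it tries the pattern pinned to each starting offset in turn and reports the first offset that succeeds. The variable names (text, first_offset, match) are only illustrative, and the manual loop is a simplification of what the engine does internally, not its actual implementation.

```ruby
text = 'He said "Hello"'

# Imitate the engine: try the pattern pinned (\A) to each starting offset.
# Offsets 0..10 all fail, because no character there is followed by itself.
first_offset = (0...text.length).find do |i|
  text[i..-1] =~ /\A(\w)\1/
end

# The real, unpinned match reports the same position.
match = text.match(/(\w)\1/)

puts first_offset    # 11
puts match.begin(0)  # 11
puts match[0]        # ll
puts match[1]        # l
```

So \1 is not fixed to "H": at every new starting position the group captures again, and \1 refers to whatever was captured on that attempt.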
I thought that the \w part matches "H"
\w matches any alphanumeric character (and underscore). It also happens to match H, but that’s not terribly interesting, since the regular expression then goes on to say that the same character has to be matched twice in a row, which H can’t be in your text (since it doesn’t appear twice consecutively). Neither can any of the other characters, except l. So the regular expression matches ll.
You're thinking of /^(\w)\1/. The caret symbol specifies that the match must start at the beginning of the line. Without that, the match can start anywhere in the string (it will find the first match).
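A small Ruby comparison of the two patterns, as a sketch. The string 'aab' is a made-up input added only to show a case where the anchored pattern does succeed. Note that in Ruby ^ anchors at the start of any line; \A is the anchor for the start of the whole string, which makes no difference for this single-line input.

```ruby
text = 'He said "Hello"'

# Anchored: the only allowed starting point is offset 0, where "H" is
# followed by "e", so the match fails and =~ returns nil.
anchored = (text =~ /^(\w)\1/)

# Unanchored: the engine may start anywhere and finds "ll" at offset 11.
unanchored = (text =~ /(\w)\1/)

# Hypothetical input that does begin with a doubled word character.
doubled_start = ('aab' =~ /^(\w)\1/)

p anchored       # nil
p unanchored     # 11
p doubled_start  # 0
```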