Can anyone please explain the meaning of the Perl regex below?

$_ =~ s/^([^,]+,)ab.([^,]+,)(?:[^,]+,)/$1$2/;

What is the role of $1 and $2, and how are these variables defined?

+6  A: 

The $1 and $2 variables are match variables.

In a regex, when part of the pattern is surrounded by parentheses, the text matched by that part (if it matches) gets stored in a $n variable.

So for example, /(\d+),(\w+)/ would match something like 123,xyz, with 123 being assigned to $1 and xyz being assigned to $2. They are assigned in the order in which the opening parentheses appear in the regex.
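A minimal sketch of that example, assuming the sample string 123,xyz from above. The variable names $input, $digits and $word are only illustrative; copying $1 and $2 into named variables right after the match is a common precaution, since a later successful match overwrites them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $input = "123,xyz";    # sample string from the example above

my ($digits, $word);
if ($input =~ /(\d+),(\w+)/) {
    # Copy the match variables immediately; the next successful match resets them.
    ($digits, $word) = ($1, $2);
    print "\$1 = $digits\n";    # $1 = 123
    print "\$2 = $word\n";      # $2 = xyz
}
```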

Note that a "?:" right after the opening parenthesis, i.e. (?: ...), tells the regex engine not to save that matched value in a $n variable.
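To see how a non-capturing group shifts the numbering, here is a small sketch on a made-up string "a,b,c" (the string and variable names are assumptions for illustration). In list context a match returns the captured values, which makes the difference easy to print.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $input = "a,b,c";    # hypothetical sample string

# Three capturing groups: they would populate $1, $2 and $3.
my @all = $input =~ /(\w),(\w),(\w)/;

# The middle group is non-capturing, so the last group becomes $2.
my @some = $input =~ /(\w),(?:\w),(\w)/;

print "capturing:     @all\n";     # capturing:     a b c
print "non-capturing: @some\n";    # non-capturing: a c
```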

Here is example code that demonstrates what your regex is doing using Perl. Note that it appears you're missing the leading s (for substitution) on the regex within your question.

File: test.pl

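#!/usr/bin/perl
use strict;
use warnings;

# Work on a copy of the first command-line argument.
my $str = $ARGV[0];

print "$str\n";                                   # original string
$str =~ s/^([^,]+,)ab.([^,]+,)(?:[^,]+,)/$1$2/;   # drop "ab." and the third field
print "\$1 = $1\n";                               # first captured group
print "\$2 = $2\n";                               # second captured group
print "$str\n";                                   # string after the substitution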

Execute:

%> ./test.pl str1,ab.str2,str3,str4,
str1,ab.str2,str3,str4,
$1 = str1,
$2 = str2,
str1,str2,str4,
RC
Note that there doesn’t need to be a literal dot in the subject since the `.` represents any arbitrary character (except line break characters).
Gumbo
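A quick way to check this is to run the same substitution on subjects that have something other than a dot after "ab". The strings below are made-up variations on the test input, and $re and @results are just illustrative names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same pattern as in the question, precompiled for reuse.
my $re = qr/^([^,]+,)ab.([^,]+,)(?:[^,]+,)/;

my @results;
for my $subject ("str1,ab.str2,str3,str4,",
                 "str1,abXstr2,str3,str4,",
                 "str1,ab-str2,str3,str4,") {
    # Substitute on a copy so the original subject can still be printed.
    (my $copy = $subject) =~ s/$re/$1$2/;
    push @results, $copy;
    print "$subject -> $copy\n";
}
```

All three subjects should come out as str1,str2,str4, because the unescaped `.` accepts the X and the - just as it accepts a literal dot. To require a literal dot, the pattern would need `\.` instead.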
+9  A: 

The regular expression means:

^          Start of string
(          Start of capturing group -> $1
 [^,]+     Any character except ",", one or more times
 ,         A literal ","
)          End of capturing group
ab         "ab"
.          Any character
([^,]+,)   Same as previously -> $2
(?:        Start a non-capturing group
 [^,]+,    Same as previously
)          End group

The text matched by each capturing group is placed into a numbered variable, in the order the groups appear in the regex.
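The breakdown above can also be written directly into the code with the /x modifier, which lets the pattern contain whitespace and comments. This is only a sketch of the same substitution, using the test string from the other answer; the variable name $line is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "str1,ab.str2,str3,str4,";

$line =~ s/
    ^               # start of string
    ( [^,]+ , )     # first capture: first field plus its comma
    ab              # literal "ab"
    .               # any one character except a newline
    ( [^,]+ , )     # second capture: rest of the second field plus its comma
    (?: [^,]+ , )   # third field: matched but not captured, so it is dropped
/$1$2/x;

print "$line\n";    # str1,str2,str4,
```

The comments deliberately avoid the characters $, @ and /, since Perl still interpolates variables and looks for the closing delimiter inside an /x pattern.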

Greg
Since the *dot all* modifier `s` is not set, `.` matches any character except line break characters.
Gumbo
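A small sketch of the difference, using a made-up two-line string. Note that this trailing `s` modifier is unrelated to the leading `s` of the substitution operator.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $subject = "ab\ncd";    # hypothetical string with a newline after "ab"

# Without the modifier, "." refuses to match the newline.
my $without_s = ($subject =~ /ab.cd/)  ? "match" : "no match";

# With the modifier, "." matches the newline as well.
my $with_s    = ($subject =~ /ab.cd/s) ? "match" : "no match";

print "without /s: $without_s\n";    # without /s: no match
print "with /s:    $with_s\n";       # with /s:    match
```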