I am reviewing some ClearCase triggers written in Perl. I have noticed that in some regular expressions, variables are written either plainly or with their names in curly brackets.

For example, I have the following line of code in a trigger:

if ($baseline !~ /^${component}_(|.*_)$phase\.\d+(|[a-z]|-\d+|${automateddigit})$/ &&
  $baseline !~ /^${project_root}_$phase\.\d+(|[a-z]|-\d+|${automateddigit})$/)

$component, $phase, $automateddigit, $project_root are all variables.

Why are some written as $variable and others as ${variable} in the regular expression?

Does it come from how they are initialised?

Here are the lines of code that initialise them:

($project = $ENV{CLEARCASE_PROJECT}) =~ s/\@.*$//;
($component = $ENV{CLEARCASE_COMPONENT}) =~ s/\@.*$//;

($project_root, $phase) = ($project =~ /^(.*)_(R\d+.*)$/);

exit(0) if (! $phase);

$phase .= ".0" if ($phase =~ /^R\d+$/);

$automateddigit = '';

$istream = `cleartool desc -fmt "%[istream]p" project:$ENV{CLEARCASE_PROJECT}`;

$componentlist = `cleartool desc -fmt "%[components]Cp" stream:$ENV{CLEARCASE_STREAM}`;
$componentsnbr = split(',', $componentlist);

if ($componentsnbr > 1) {
    $automateddigit .= '\\.\\d+';
}
+7  A: 

Writing the variable as ${name} explicitly marks where the variable name ends and where the rest of the quoted string begins. For example, in your code:

if ($baseline !~ /^${component}_(|.*_)$phase\.\d+(|[a-z]|-\d+|${automateddigit})$/ &&

Without the {} delimiters:

if ($baseline !~ /^$component_(|.*_)$phase\.\d+(|[a-z]|-\d+|${automateddigit})$/ &&

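Here Perl reads the variable name as $component_ instead of $component, because an underscore is a valid character in a variable name. You can refer to the variable as either $component or ${component}; the braces only matter when the next character could be taken as part of the name.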
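
A minimal sketch of the difference, using hypothetical values for $component and $phase (they are not taken from the trigger). It assumes "use strict", under which the unbraced form refers to an undeclared variable and fails to compile, so the failure is captured with a string eval:

```perl
use strict;
use warnings;

# Hypothetical values, chosen only for illustration.
my $component = 'core';
my $phase     = 'R2.0';

# Braces end the variable name before the underscore.
my $with_braces = "${component}_$phase";

# A backslash cannot be part of a variable name, so $phase needs no braces here.
my $pattern = qr/^${component}_$phase\.\d+$/;

print "$with_braces\n";
print "match\n" if 'core_R2.0.5' =~ $pattern;

# Without braces Perl looks for a variable named $component_.
# Under "use strict" that is a compile-time error, captured here in $@.
my $ok = eval q{ my $s = "$component_$phase"; 1 };
print "compile error: $@" unless $ok;
```

Without "use strict", the unbraced form would compile and silently interpolate an empty value, which is harder to spot.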

1800 INFORMATION
OK, so I had better use ${variable} all the time, then, no?
Thomas Corriol
I suggest using it only where necessary. You should know the names of your variables, so add the braces wherever the following text would otherwise be read as part of the name. Doing it everywhere can be noisy.
1800 INFORMATION
OK, thanks! I got my answers. :-)
Thomas Corriol
+1  A: 

First, this is called string interpolation. The braces are useful in this case because they prevent $project_root from being interpreted as $project_root_ (note the trailing underscore). They make the variable name explicit, instead of leaving it to Perl's more complicated interpolation rules.

See perldata for more on interpolation, and perlre and perlop on peculiarities of interpolation within regular expression operators.
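
One of those peculiarities matters for the question's $automateddigit: text interpolated into a regular expression is treated as part of the pattern, not as a literal string. The sketch below illustrates this with a value that mirrors the one in the question; the sample strings and the use of \Q...\E for comparison are assumptions added for illustration, not something the trigger does:

```perl
use strict;
use warnings;

# Mirrors the question's $automateddigit: the four-character-plus string \.\d+
my $automateddigit = '\\.\\d+';

# Interpolated as a pattern: a literal dot followed by one or more digits.
my $as_pattern = qr/^R1$automateddigit$/;

# \Q...\E quotes metacharacters, so the same text must appear literally.
my $as_literal = qr/^R1\Q$automateddigit\E$/;

print "pattern matches R1.42\n"       if 'R1.42'     =~ $as_pattern;
print "literal rejects R1.42\n"       if 'R1.42'     !~ $as_literal;
print "literal matches the raw text\n" if 'R1\\.\\d+' =~ $as_literal;
```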

Anonymous
+1  A: 

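As mentioned above, the braces are there to delimit variable names. Too many curly braces make already difficult regular expressions even harder to read, and curly braces also have their own regexp use as quantifiers (to limit the number of times a pattern matches). With the /x modifier, whitespace in the pattern is ignored, so a space or newline can end each variable name and the braces are no longer needed. I would recommend using the regexp /x modifier and rewriting your regexp as: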

if ($baseline !~ /^$component    # Start with $component
                   _             # then an underscore
                   (|.*_)        # Then nothing, or anything followed by an underscore
                   $phase        # ...
                   \.\d+         # ...
                   (|            # Then optionally:
                      [a-z]|       # lower alpha
                      -\d+|        # or ...
                      $automateddigit
                   )
                   $/x &&
    $baseline !~ /^$project_root
                   _
                   $phase
                   \.\d+
                   (|
                     [a-z]|
                     -\d+|
                     $automateddigit
                   )$/x)
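
To try the /x form outside the trigger, a self-contained sketch might look like the following. The values of $component, $phase and $automateddigit and the sample baseline names are hypothetical, and precompiling the pattern with qr// is a choice made here for readability, not part of the original trigger. Note that comments inside a /x pattern must not contain the pattern delimiter.

```perl
use strict;
use warnings;

# Hypothetical values; in the trigger they are derived from ClearCase data.
my $component      = 'core';
my $phase          = 'R2.0';
my $automateddigit = '\\.\\d+';

my $baseline_re = qr/^ $component      # component name
                     _                 # literal underscore
                     (|.*_)            # nothing, or anything ending in an underscore
                     $phase            # phase
                     \.\d+             # dot and a number
                     (|[a-z]|-\d+|$automateddigit)
                     $/x;

# Hypothetical baseline names used only to exercise the pattern.
for my $baseline (qw(core_R2.0.1 core_extra_R2.0.1a core_R2.0.1.7 other_R2.0.1)) {
    printf "%-20s %s\n", $baseline, $baseline =~ $baseline_re ? 'ok' : 'rejected';
}
```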
Craig Lewis