What does this line of Perl mean?

if (/ile.*= (\d*)/ || /ile.*=(\d*)/ ) {

I am particularly interested in what the "/ile" means, and why both sides of the || are identical.

+4  A: 

It's probably a really crude way of looking for a string that looks like one of these:

fileXXX=1234657
fileYYY= 123648

The 'ile' matches those three characters literally, and the two sides of the || aren't quite identical: the first has a space after the = and the second does not.
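As a rough illustration of that reading, the sketch below runs the original condition over the two sample shapes above. The array names and the loop are assumptions made for the example, not part of the code in the question.

```perl
use strict;
use warnings;

# Hypothetical sample lines, modelled on the two shapes shown above.
my @lines = ('fileXXX=1234657', 'fileYYY= 123648');

my @captured;
for (@lines) {                          # each line is aliased to $_ in turn
    # Left side requires a space after '='; right side does not.
    if (/ile.*= (\d*)/ || /ile.*=(\d*)/) {
        push @captured, $1;             # digits from whichever side matched
    }
}

print "$_\n" for @captured;             # prints 1234657, then 123648
```

The first line only matches the right-hand pattern, because it has no space after the =. The second line already matches the left-hand pattern, so the right-hand one is never tried.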

Adam Bellaire
+9  A: 

The syntax /.../ contains a regular expression. The two sides of the || are subtly different - the second one has no space after the equals sign.

The first /.../ decodes as: match the letters 'i', 'l', 'e', then any character except a newline (.) any number of times (*), then an equals sign (=), then a space, then a capture group (the parentheses) that grabs zero or more digits (\d*).

The match is not bound to a variable with =~, so it is applied to the default scalar $_.
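A minimal sketch of the difference between an explicit and an implicit match target follows. The variable names and the input string are made up for the example.

```perl
use strict;
use warnings;

my $line = 'file1= 42';                 # hypothetical input

# Explicit: bind the pattern to $line with =~.
# In list context a successful match returns the captured groups.
my ($explicit) = $line =~ /ile.*= (\d*)/;

# Implicit: no =~, so the pattern is matched against $_.
my $implicit;
for ($line) {                           # aliases $_ to $line
    $implicit = $1 if /ile.*= (\d*)/;   # $1 is the first capture group
}

print "explicit=$explicit implicit=$implicit\n";   # explicit=42 implicit=42
```

Both forms run the same pattern. The only difference is where the string being searched comes from.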

martin clayton
+1. You mention the capture and the `$_` variable.
pilcrow
the '$_' was the last piece of the puzzle. I couldn't figure out how it could be a regex because it had nothing to search, but that cleared it up. (and explained the rest of it too)
Thomas
+7  A: 

You can rewrite this as

if (/ile.*= ?(\d*)/) {

Use YAPE::Regex::Explain to understand what a given pattern matches.

#!/usr/bin/perl

use strict;
use warnings;

use YAPE::Regex::Explain;

print YAPE::Regex::Explain->new(qr/ile.*= ?(\d*)/)->explain;

Output:

The regular expression:

(?-imsx:ile.*= ?(\d*))

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ile                      'ile'
----------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  =                        '='
----------------------------------------------------------------------
   ?                       ' ' (optional (matching the most amount
                           possible))
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    \d*                      digits (0-9) (0 or more times (matching
                             the most amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
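To check how the single pattern compares with the original pair, a small side-by-side test along these lines may help. The sub names and the inputs are hypothetical, chosen only for this sketch.

```perl
use strict;
use warnings;

# The original two-pattern test; returns the captured digits, or nothing.
sub original {
    my ($s) = @_;
    return $1 if $s =~ /ile.*= (\d*)/ || $s =~ /ile.*=(\d*)/;
    return;
}

# The single-pattern rewrite with an optional space after '='.
sub combined {
    my ($s) = @_;
    return $1 if $s =~ /ile.*= ?(\d*)/;
    return;
}

# Hypothetical inputs; the last one contains two '=' signs.
for my $s ('fileXXX=1234657', 'fileYYY= 123648', 'file= 5=7') {
    my $o = original($s);
    my $c = combined($s);
    print "$s: original=$o combined=$c\n";
}
```

For the first two inputs both versions capture the same digits. For the last input the original captures 5, because its first pattern insists on "= " and the greedy .* has to back off to the first equals sign, while the rewrite captures 7, because the greedy .* can stop at the last equals sign. For lines with a single = the two forms capture the same thing.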
Sinan Ünür
+1 for pedagogy :) The only thing missing was the instruction for the student to meditate on this.
Adam Bellaire
+2  A: 

In this context, the "/" character is not acting as a mathematical division operator or as some kind of prefix (like for Windows command-line options). Rather, "/" is the usual quoting character for enclosing regular expressions.

Everything between the pair of slashes forms a regular expression, not executable code. I suspect that was another source of confusion: the "=" in there is not an assignment or equality operator. Inside a regular expression it's just an ordinary character, as is the space character. Spaces are significant, so the presence or absence of one means that those two regular expressions are not identical. They can be consolidated into a single regular expression, as demonstrated in Sinan's answer, using the "?" regular-expression operator.

Rob Kennedy