tags:

views:

3690

answers:

8

I need to pass a regex substitution as a variable:

sub proc {
    my $pattern = shift;
    my $txt = "foo baz";

    $txt =~ $pattern;
}

my $pattern = 's/foo/bar/';
proc($pattern);

This, of course, doesn't work. I tried eval'ing the substitution:

eval("$txt =~ $pattern;");

but that didn't work either. What horribly obvious thing am I missing here?

+5  A: 

Well, you can precompile the RE using the qr// operator. But you can't pass an operator (s///).

$pattern = qr/foo/;

print "match!\n" if $text =~ $pattern;

But if you have to pass the substitution operator, you are down to passing either code or strings:

proc('$text =~ s/foo/bar');

sub proc { 
   my $code = shift;

   ...

   eval $code;
}

or, code:

proc(sub {my $text = shift;  $text =~ s/foo/bar});

sub proc {
   my $code = shift;

   ...

   $code->("some text");
}
zigdon
+6  A: 
sub proc {
    my($match, $subst) = @_;
    my $txt = "foo baz";
    $txt =~ s/$match/$subst/;
    print "$txt\n";
}

my $matcher = qr/foo/;
my $sub_str = "bar";

proc($matcher, $sub_str);

This rather directly answers your question. You can do more - but when I used a qr// term instead of the $sub_str as a simple literal, then the expanded regex was substituted.

I recently needed to create a parser (test parser) for statements with some peculiar (dialect of) SQL types, recognizing lines such as this, splitting it into three type names:

input: datetime year to second,decimal(16,6), integer

The script I used to demo this used quoted regexes.

#!/bin/perl -w
use strict;
while (<>)
{
    chomp;
    print "Read: <$_>\n";
    my($r1) = qr%^input\s*:\s*%i;
    if ($_ =~ $r1)
    {
        print "Found input:\n";
        s%$r1%%;
        print "Residue: <$_>\n";
        my($r3) = qr%(?:year|month|day|hour|minute|second|fraction(?:\([1-5]\))?)%;
        my($r2) = qr%
                        (?:\s*,?\s*)?   # Commas and spaces
                        (
                            (?:money|numeric|decimal)(?:\(\d+(?:,\d+)?\))?   |
                            int(?:eger)?  |
                            smallint      |
                            datetime\s+$r3\s+to\s+$r3
                        )
                    %ix;
        while ($_ =~ m/$r2/)
        {
            print "Got type: <$1>\n";
            s/$r2//;
        }
        print "Residue 2: <$_>\n";
    }
    else
    {
        print "No match:\n";
    }
    print "Next?\n";
}

We can argue about the use of names like $r1, etc. But it did the job...it was not, and is not, production code.

Jonathan Leffler
+15  A: 

I need to pass a regex substitution as a variable

Do you? Why not pass a code reference? Example:

sub modify
{
  my($text, $code) = @_;
  $code->($text);
  return $text;
}

my $new_text = modify('foo baz', sub { $_[0] =~ s/foo/bar/ });

In general, when you want to pass "something that does something" to a subroutine ("a regex substitution" in the case of your question) the answer is to pass a reference to a piece of code. Higher Order Perl is a good book on the topic.

John Siracusa
This worked, and is closest to what I had in mind. However, the resulting code is a bit funky and convoluted for my tastes, which I generally take as a hint that it's time to rethink my overall approach.
ceo
+5  A: 
ephemient
+2  A: 

Perhaps you might re-think your approach.

You want to pass in to a function a regex substitution, probably because the function will be deriving the text to be operated upon from some other source (reading from a file, socket, etc.). But you're conflating regular expression with regular expression substitution.

In the expression, s/foo/bar/, you actually have a regular expression ("/foo/") and a substitution ("bar") that should replace what is matched by the expression. In the approaches you've tried thus far, you ran into problems trying to use eval, mainly because of the likelihood of special characters in the expression that either interfere with eval or get interpolated (i.e. gobbled up) in the process of evaluation.

So instead, try passing your routine two arguments: the expression and the substitution:

sub apply_regex {
    my $regex = shift;
    my $subst = shift || ''; # No subst string will mean matches are "deleted"

    # some setup and processing happens...

    # time to make use of the regex that was passed in:
    while (defined($_ = <$some_filehandle>)) {
        s/$regex/$subst/g; # You can decide if you want to use /g etc.
    }

    # rest of processing...
}

This approach has an added benefit: if your regex pattern doesn't have any special characters in it, you can just pass it in directly:

apply_regex('foo', 'bar');

Or, if it does, you can use the qr// quoting-operator to create a regex object and pass that as the first parameter:

apply_regex(qr{(foo|bar)}, 'baz');
apply_regex(qr/[ab]+/, '(one or more of "a" or "b")');
apply_regex(qr|\d+|); # Delete any sequences of digits

Most of all, you really don't need eval or the use of code-references/closures for this task. That will only add complexity that may make debugging harder than it needs to be.

Randy

rjray
+2  A: 

s/// is not a regex. Thus, you can't pass it as a regex.

I don't like eval for this, it's very fragile, with a lot of bordercases.

I think it's best to take an approach similar to the one Javascript takes: pass both a regex (in Perl, that is qr//) and a code reference for the substitution. For example, to pass parameters to get the same effect as

s/(\w+)/\u\L$1/g;

You can call

replace($string, qr/(\w+)/, sub { "\u\L$1" }, 'g');

Note that the 'g' modifier is not actually a flag for the regex (I think attaching it to the regex is a design mistake in Javascript), so I chose to pass it in a 3rd parameter.

Once the API has been decided on, implementation can be done next:

sub replace {
    my($string, $find, $replace, $global) = @_;
    unless($global) {
        $string =~ s($find){ $replace->() }e;
    } else {
        $string =~ s($find){ $replace->() }ge;
    }
    return $string;
}

Let's try it:

print replace('content-TYPE', qr/(\w+)/, sub { "\u\L$1" }, 'g');

Result:

Content-Type

That looks good to me.

bart
Data::Munge on CPAN has a 'replace' function similar to this, though it passes substring matches to the fuction, or parses the replacement as a string.
MkV
A: 

The last one is great - thanks! It's so simple that it makes you wonder why perl doesn't have a regex flag just for this case

A: 

You're right - you were very close:

eval('$txt =~ ' . "$pattern;");

pevgeniev