Is there a built-in way to escape a string that will be used within (or as) a regular expression? For example, given:

www.abc.com

The escaped version would be:

www\.abc\.com

I was going to use:

$string =~ s/[.*+?|()\[\]{}\\]/\\$&/g; # Escapes special regex chars

But I just wanted to make sure that there's not a cleaner built-in operation that I'm missing?

+15  A: 

Use quotemeta or \Q...\E.

Consider the following test program that matches against $str as-is, with quotemeta, and with \Q...\E:

#! /usr/bin/perl

use warnings;
use strict;

my $str = "www.abc.com";

my @test = (
  "www.abc.com",
  "www/abc!com",
);

sub ismatch($) { $_[0] ? "MATCH" : "NO MATCH" }

my @match = (
  [ as_is => sub { ismatch /$str/ } ],
  [ qmeta => sub { my $qm = quotemeta $str; ismatch /$qm/ } ],
  [ qe    => sub { ismatch /\Q$str\E/ } ],
);

for (@test) {
  print "\$_ = '$_':\n";

  foreach my $method (@match) {
    my($name,$match) = @$method;

    print "  - $name: ", $match->(), "\n";
  }
}

Notice in the output that using the string as-is produces a spurious match, because an unescaped . matches any character:

$ ./try
$_ = 'www.abc.com':
  - as_is: MATCH
  - qmeta: MATCH
  - qe: MATCH
$_ = 'www/abc!com':
  - as_is: MATCH
  - qmeta: NO MATCH
  - qe: NO MATCH

For programs that accept untrustworthy inputs, be extremely careful about using such potentially nasty bits as regular expressions: doing so could create unexpected runtime errors, denial-of-service vulnerabilities, and security holes.
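To illustrate the runtime-error case, here is a minimal sketch. The input string with an unbalanced parenthesis and the variable names are made up for illustration and are not part of the test program above.

```perl
use strict;
use warnings;

# Hypothetical untrusted input containing an unbalanced parenthesis.
my $input = "foo(bar";
my $text  = "xx foo(bar yy";

# Interpolated as-is, the pattern fails to compile and dies at runtime;
# eval catches the exception so the script can report it.
my $raw_ok = eval { my $m = $text =~ /$input/; 1 };
print "raw pattern: ", ($raw_ok ? "compiled" : "died"), "\n";

# Quoted with \Q...\E, the same input is matched as a literal string.
my $quoted_match = ($text =~ /\Q$input\E/) ? 1 : 0;
print "quoted pattern: ", ($quoted_match ? "MATCH" : "NO MATCH"), "\n";
```

Quoting removes the compile failure here, but it only helps when the input is meant to be a literal string rather than a pattern.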

Greg Bacon
Perfect! Thanks gbacon!
J-P
You're welcome! I'm glad it helps.
Greg Bacon
+7  A: 

The best way to do this is to use \Q to begin a quoted string and \E to end it.

my $foo = 'www.abc.com';
$bar =~ /blah\Q$foo\Eblah/;

You can also use quotemeta on the variable first. E.g.

my $quoted_foo = quotemeta($foo);

The \Q trick is documented in perlre under "Escape Sequences."
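One case where quotemeta is handier than \Q...\E is building a pattern from several literal strings. The following is only a sketch: the host names and variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical list of literal host names to match exactly.
my @hosts = ("www.abc.com", "a+b.example.org");

# Escape each literal, join the results into one alternation, compile once.
my $alternation = join "|", map { quotemeta($_) } @hosts;
my $re = qr/^(?:$alternation)$/;

for my $candidate ("www.abc.com", "wwwxabcxcom", "a+b.example.org") {
    printf "%-16s %s\n", $candidate, ($candidate =~ $re ? "MATCH" : "NO MATCH");
}
```

Wrapping the whole joined string in \Q...\E would also escape the | separators, which is why each element is quoted individually here.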

friedo
+4  A: 

Of course everybody knows quotemeta, but from the description of the problem, I can't help but notice that a simple index() would be just as well (or better) suited to solve it.
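A rough sketch of the index() approach, assuming the goal is only to test whether one literal string occurs inside another (the sample strings are made up):

```perl
use strict;
use warnings;

my $needle   = "www.abc.com";
my @haystack = ("visit www.abc.com today", "visit www/abc!com today");

for my $text (@haystack) {
    # index returns the zero-based position of the substring, or -1 if absent.
    my $pos = index($text, $needle);
    print "'$text': ", ($pos >= 0 ? "found at $pos" : "not found"), "\n";
}
```

No escaping is needed because index() never treats its arguments as a pattern.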

depesz