How can I convert a string to a regular expression that matches itself in Perl?

I have a set of strings like these:

Enter your selection:
Enter Code (Navigate, Abandon, Copy, Exit, ?):

and I want to convert them to regular expressions so I can match something else against them. In most cases the string is the same as the regular expression, but not in the second example above, because the ( and ? have meaning in regular expressions. So that second string needs to become an expression like:

Enter Code \(Navigate, Abandon, Copy, Exit, \?\):

I don't need the matching to be too strict, so something like this would be fine:

Enter Code .Navigate, Abandon, Copy, Exit, ..:

My current thinking is that I could use something like:

s/[\?\(\)]/./g;

but I don't really know what characters will be in the list of strings and if I miss a special char then I might never notice the program is not behaving as expected. And I feel that there should exist a general solution.

Thanks.

+3  A: 

From http://www.regular-expressions.info/characters.html :

there are 11 characters with special meanings: the opening square bracket [, the backslash \, the caret ^, the dollar sign $, the period or dot ., the vertical bar or pipe symbol |, the question mark ?, the asterisk or star *, the plus sign +, the opening round bracket ( and the closing round bracket )

In Perl (and PHP) there is a special function quotemeta that will escape all these for you.
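
A minimal sketch of what quotemeta does to the second string from the question. The variable names are only illustrative. Note that it backslashes every non-word character (spaces, commas and the colon too), not just the 11 listed above; the extra backslashes are harmless in a pattern.

```perl
use strict;
use warnings;

# One of the prompt strings from the question.
my $string = 'Enter Code (Navigate, Abandon, Copy, Exit, ?):';

# quotemeta puts a backslash in front of every character that is
# not a letter, digit or underscore.
my $quoted = quotemeta $string;

print "$quoted\n";
# prints: Enter\ Code\ \(Navigate\,\ Abandon\,\ Copy\,\ Exit\,\ \?\)\:
```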

Espo
Here they are all right next to each other in case you want to make a character class out of it or something: [\^$.|?*+()
Tim
What about `{}`? `$foo =~ /a{1,4}/;`
daotoad
In general, `{` and `}` are not special. They have to appear in a valid pattern matching construct to be special. In any case, `quotemeta` will take care of those as well.
Sinan Ünür
+10  A: 

As Brad Gilbert commented, use quotemeta:

my $regex = qr/^\Q$string\E$/;

or

my $quoted = quotemeta $string;
my $regex2 = qr/^$quoted$/;
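
A small sketch of how the anchors affect matching, assuming the text being checked may contain more than the prompt itself. The names $exact, $anywhere and $input are made up for the illustration; drop the ^ and $ if the prompt only needs to appear somewhere in the input.

```perl
use strict;
use warnings;

my $string = 'Enter Code (Navigate, Abandon, Copy, Exit, ?):';

my $exact    = qr/^\Q$string\E$/;   # the whole input must be the prompt
my $anywhere = qr/\Q$string\E/;     # the prompt may appear anywhere

# Hypothetical input: the prompt with some text in front of it.
my $input = "> $string";

print "exact on prompt:   ", ( $string =~ $exact    ? "match" : "no match" ), "\n";
print "exact on input:    ", ( $input  =~ $exact    ? "match" : "no match" ), "\n";
print "anywhere on input: ", ( $input  =~ $anywhere ? "match" : "no match" ), "\n";
```

Only the second check should report "no match", because of the leading "> ".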
daotoad
Awesome, thanks, and thanks to Brad.
Anon Guy
+3  A: 

To put Brad Gilbert's suggestion into an answer instead of a comment: you can use the quotemeta function. All credit to him.

DVK
All hail Br... Who?
Brad Gilbert
+3  A: 

Why use a regular expression at all? Since you aren't doing any capturing and it seems you won't allow for any variations, why not simply use the index builtin?

my $s1 = 'hello, (world)?!';
my $s2 = 'he said "hello, (world)?!" and nothing else.';

# index returns -1 when $s1 does not occur in $s2
if ( index( $s2, $s1 ) != -1 ) {
    print "we've got a match\n";
}
else {
    print "sorry, no match.\n";
}
innaM
I'm passing the regex to a function from a CPAN module that is expecting a regex, sadly.
Anon Guy
Now that's a very good reason to use a regular expression.
innaM
+4  A: 

There is a function for that: quotemeta.

quotemeta EXPR
Returns the value of EXPR with all non-"word" characters backslashed. (That is, all characters not matching /[A-Za-z_0-9]/ will be preceded by a backslash in the returned string, regardless of any locale settings.) This is the internal function implementing the \Q escape in double-quoted strings.

If EXPR is omitted, uses $_.
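
As a rough sketch of the $_ default, here is one way to turn a whole list of prompts into anchored patterns with map. The @prompts and @regexes names are only for illustration.

```perl
use strict;
use warnings;

# The two prompt strings from the question.
my @prompts = (
    'Enter your selection:',
    'Enter Code (Navigate, Abandon, Copy, Exit, ?):',
);

# With no argument, quotemeta works on $_, which map sets to each element.
my @regexes = map { my $q = quotemeta; qr/^$q$/ } @prompts;

# Each prompt should match the pattern built from it.
for my $i ( 0 .. $#prompts ) {
    print "prompt $i matches its own pattern\n"
        if $prompts[$i] =~ $regexes[$i];
}
```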

Brad Gilbert