Possible Duplicate:
How can I escape meta-characters when I interpolate a variable in Perl's match operator?

I am using the following regex to search for a string $word in the larger string $referenceLine:

$wordRefMatchCount =()= $referenceLine =~ /(?=\b$word\b)/g

The problem occurs when $word contains regex metacharacters such as (. Perl treats them as part of the regex rather than as literal text to match, and gives the following error:

Unmatched ( in regex; marked by <-- HERE in 
m/( <-- HERE ?=\b( darsheel safary\b)/ 
at ./bleu.pl line 119, <REFERENCE> line 1.

Can someone please tell me a solution to this? I think if I could somehow get Perl to understand that we want to look for the whole $word as it is, without evaluating it as a pattern, it might work out.

+2  A: 

Use

$wordRefMatchCount =()= $referenceLine =~ /(?=\b\Q$word\E\b)/g

to tell the regex engine to treat every character in $word as a literal character.

\Q marks the start, \E marks the end of a literal string in Perl regex.

Alternatively, you could do

$quote_word = quotemeta($word);

and then use

$wordRefMatchCount =()= $referenceLine =~ /(?=\b$quote_word\b)/g
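To see the effect of quoting, here is a minimal sketch that counts matches three ways: unquoted, with \Q...\E, and with quotemeta. The sample strings and the variable names $unquoted_count, $inline_count and $quotemeta_count are made up for illustration and do not come from the question.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen so that quoting changes the result.
my $referenceLine = 'a.c abc a.c';
my $word          = 'a.c';

# Unquoted: the "." in $word is a metacharacter matching any character,
# so "abc" is counted as well as the two literal "a.c" occurrences.
my $unquoted_count = () = $referenceLine =~ /(?=\b$word\b)/g;

# \Q...\E quotes the interpolated value, so "." matches only a literal dot.
my $inline_count = () = $referenceLine =~ /(?=\b\Q$word\E\b)/g;

# quotemeta() does the same quoting ahead of time.
my $quote_word      = quotemeta($word);
my $quotemeta_count = () = $referenceLine =~ /(?=\b$quote_word\b)/g;

print "unquoted: $unquoted_count\n";
print "\\Q..\\E:   $inline_count\n";
print "quotemeta: $quotemeta_count\n";
```

The two quoted forms should always agree; which one to use is mostly a matter of whether the quoted string is needed in more than one pattern.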

One more thing (taken up here from the comments, where it's harder to find):

Your regex fails in your example case because of the word boundary anchor \b. This anchor matches between a word character and a non-word character. It only makes sense if placed around actual words, i.e. \bbar\b to ensure that only bar is matched, not foobar or barbaric. If you put it around non-words, as in \b( darsheel safary\b, the match will fail unless there is a letter, digit or underscore right before the (.
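A small sketch of this failure, using the start of the reference line quoted in the comments. The lookaround variant is only one possible workaround and rests on an assumption that is not stated in the question: that the text is tokenized with whitespace, so a match should simply not touch a non-space character on either side. The names $with_b and $with_lookaround are illustrative.

```perl
use strict;
use warnings;

# Beginning of the reference line from the comments, shortened.
my $referenceLine = '( darsheel safary ) ! eight - year - old school !';
my $word          = '( darsheel safary )';

# \b needs a word character on exactly one side. "(" at the start of the
# string has a non-word character to its right and nothing to its left,
# so the leading \b can never match here.
my $with_b = () = $referenceLine =~ /(?=\b\Q$word\E\b)/g;

# Assumed alternative: instead of \b, require that the match is not
# directly preceded or followed by a non-whitespace character.
my $with_lookaround = () = $referenceLine =~ /(?=(?<!\S)\Q$word\E(?!\S))/g;

print "with \\b:         $with_b\n";
print "with lookaround: $with_lookaround\n";
```

If the input is not whitespace-tokenized, the lookaround conditions would need to be adapted, for example to (?<!\w) and (?!\w).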

Tim Pietzcker
But from [perlop](http://perldoc.perl.org/perlop.html) "Because the result of "\Q STRING \E" has all metacharacters quoted, there is no way to insert a literal $ or @ inside a \Q\E pair. If protected by \ , $ will be quoted to became "\\\$" ; if not, it is interpreted as the start of an interpolated scalar." Now I'm confused as they seem in disagreement :(
pst
Thanks!!!!!! It works!!!!!!!
Radz
@pst: Are you sure about that? http://perldoc.perl.org/functions/quotemeta.html suggests otherwise. I don't know Perl any more than I can read the docs, so I can't try it here.
Tim Pietzcker
I used \Q....\E...so now I got rid of the error but the matching does not seem to work now. It does not match the substring in the bigger string.
Radz
If your substring really is `( darsheel safary` then it can't match because the `\b` before it will not match. This has nothing to do with the interpolation.
Tim Pietzcker
Hi Tim, I tried that too. It still doesn't match. This is the output I am printing, just FYI: Ref Line After : ( darsheel safary ) ! eight - year - old school ! . animals : ishaan ' s " failures " We could not find the word in REFERENCE : ( darsheel safary )
Radz
To explain: `\b` matches between a word character (letter, digit or underscore) and a non-word character, so it won't match before a `(` unless there is a letter, digit or underscore right before it.
Tim Pietzcker
Ohhhh... I didn't consider that case. Before, I was matching only words after scrubbing off special chars. Now our prof changed the assignment to have special chars in it. Thanks so much!
Radz
`$variable` interpolation still works; it is just that special characters aren't treated as special by the regex engine. And indeed if `$variable` could possibly contain special characters, `/\Q$variable\E/` is the best way to go (and is good defensive programming even if you think the variable is clean).
Ether