Hi,

I'm using a Perl program to extract text from a file. I have an array of strings which I use as delimiters for the text, e.g.:

$pat = $arr[1] . '(.*?)' . $arr[2];

if( $src =~ /$pat/ ) {
   print $1;
   }

However, two of the strings in the array are '$450' and '(Buy now)'. The problem is that the symbols in these strings are regex metacharacters in Perl ($ is the end-of-string anchor and the parentheses form a group), which is why the text won't parse properly.

Can anyone suggest a way around this?

Many Thanks, Joe.

+5  A: 

Try Perl's quotemeta function. Alternatively, use \Q and \E in your regex to escape the metacharacters in the enclosed text, including any interpolated values. See perlretut for more on \Q and \E - they may not be what you're looking for.
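A minimal sketch of the quotemeta approach applied to the question's delimiters. The contents of $src and the unused first array element are made up for illustration; only '$450' and '(Buy now)' come from the question.

```perl
use strict;
use warnings;

# Hypothetical sample data; only '$450' and '(Buy now)' are from the question.
my @arr = ('unused', '$450', '(Buy now)');
my $src = 'Price: $450 for the deluxe set (Buy now) while stocks last';

# quotemeta backslash-escapes every ASCII non-word character.
my $start = quotemeta $arr[1];    # \$450
my $end   = quotemeta $arr[2];    # \(Buy\ now\)

my $pat = $start . '(.*?)' . $end;

# In list context the match returns the captured group(s).
my ($between) = $src =~ /$pat/;

print "[$between]\n" if defined $between;    # prints "[ for the deluxe set ]"
```

Without the quotemeta calls, the $ would act as an end-of-string anchor and the parentheses as a group, so the same pattern would not match this text.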

Chris Lutz
Specifically, \Q won't protect against backslash-escaped characters. quotemeta is by far the more general solution.
Ben Blank
+2  A: 

Use quotemeta:

$pat = quotemeta($arr[1]).'(.*?)'.quotemeta($arr[2]);
if ($src =~ $pat) { print $1 }
VirtualBlackFox
+6  A: 

quotemeta escapes meta-characters so they are interpreted as literals. As a shortcut, you can use \Q...\E in double-quotish context to surround stuff that should be quoted:

$pat = quotemeta($arr[1]).'(.*?)'.quotemeta($arr[2]);
if($src=~$pat) { print $1 }

or

$pat = "\Q$arr[1]\E(.*?)\Q$arr[2]";  # \E not necessary at the end
if($src=~$pat) { print $1 }

or just

if ( $src =~ /\Q$arr[1]\E(.*?)\Q$arr[2]/ ) { print $1 }

Note that this isn't limited to interpolated variables; literal characters are affected too:

perl -wle'print "\Q.+?"'
\.\+\?

though obviously it happens after variable interpolation, so "\Q$foo" doesn't become '\$foo'.
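To illustrate the ordering described above, here is a small sketch, assuming a hypothetical variable $foo holding 'a.b'. It shows that "\Q$foo" quotes the interpolated value rather than the variable name, and that this gives the same result as calling quotemeta directly.

```perl
use strict;
use warnings;

my $foo = 'a.b';    # hypothetical value, not from the original post

my $quoted_var     = "\Q$foo";    # interpolated first, then quoted: a\.b
my $quoted_literal = "\Q.+?";     # literal characters are quoted too: \.\+\?
my $same_as_func   = quotemeta($foo) eq $quoted_var ? 1 : 0;

# With \Q...\E the dot only matches a literal dot, not any character.
my $hits_literal = 'a.b' =~ /\A\Q$foo\E\z/ ? 1 : 0;
my $hits_other   = 'axb' =~ /\A\Q$foo\E\z/ ? 1 : 0;

print "$quoted_var\n";        # a\.b
print "$quoted_literal\n";    # \.\+\?
print "same=$same_as_func literal=$hits_literal other=$hits_other\n";
```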

ysth