ansaurus

Question

Avoiding processing special preg characters in replacement string

Answer 1

A:

$subject = array('1', 'a', '2', 'b', '3', 'A', 'B', '4');
$pattern = array('/\d/', '/[a-z]/', '/[1a]/');
$replace = array('A:$0', 'B:$0', 'C:$0');

echo "preg_filter returns\n";
print_r(preg_filter($pattern, $replace, $subject));

echo "preg_replace returns\n";
print_r(preg_replace($pattern, $replace, $subject));

Output:

preg_filter returns
Array
(
    [0] => A:C:1
    [1] => B:C:a
    [2] => A:2
    [3] => B:b
    [4] => A:3
    [7] => A:4
)
preg_replace returns
Array
(
    [0] => A:C:1
    [1] => B:C:a
    [2] => A:2
    [3] => B:b
    [4] => A:3
    [5] => A
    [6] => B
    [7] => A:4
)

zod 2010-09-30 19:14:45

Sorry, but I'm lost - what's the point of this? I see that it's taken directly from the manual page for `preg_filter()` at http://us3.php.net/preg_filter, but it doesn't really have anything to do with my question. And without any accompanying commentary....

mr. w 2010-09-30 23:48:24

Answer 2

+1 A:

Okay, well, I don't think there's any really satisfying way to handle this. There are two problem characters: \ and $. Other PCRE special characters do not appear to be special in the replacement string.

In the case of \, things actually behave as one would expect: you need to escape it with \ twice, once when writing it in a PHP string literal and once more when passing it into preg_replace(). In my test code, I was simply confusing myself with the two layers of escaping. As for $, it needs no escaping on the PHP side (in a single-quoted string) but must be escaped with \ going into preg_replace(). That's it.
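The two rules above can be wrapped in a small helper. This is only a sketch: the function name escape_preg_replacement() is made up for illustration and is not part of PHP, and strtr() is used here instead of str_replace() so that both substitutions happen in a single pass.

```php
<?php

// Hypothetical helper (not a PHP built-in): makes $text safe to use as a
// literal replacement string for preg_replace().
function escape_preg_replacement(string $text): string
{
    // strtr() with an array makes one pass over the input, so the
    // backslash added in front of "$" is never escaped a second time.
    return strtr($text, array('\\' => '\\\\', '$' => '\\$'));
}

$literal = 'costs $1 or \\1';   // real string: costs $1 or \1
$escaped = escape_preg_replacement($literal);

// "bar" is replaced with the literal text, not with capture group 1.
$result = preg_replace('/(bar)/', $escaped, 'foo bar baz');
var_dump($result);
```

With the helper in place, the caller only has to deal with the PHP string literal layer of escaping.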

Here's some code to demonstrate all this:

<?php

ini_set('display_errors', 1);
ini_set('error_reporting', E_ALL | E_STRICT);

//real string: "test1 $1 test2 \\1 test3 \${1}"

//real string manually \-escaped once for representing as a PHP string
$test = 'test1 $1 test2 \\\\1 test3 \\${1}';
var_dump('--test (starting PHP string - should match real string)', $test);

$test = str_replace(array('\\', '$'), array('\\\\', '\\$'), $test);
var_dump('--test (PHP string $-escaped and \-escaped again for preg_replace)', $test);

$result = preg_replace("/bar/", $test, 'foo bar baz');

var_dump('--result - bar should be replaced with original real string', $result);

?>

Output:

string(55) "--test (starting PHP string - should match real string)"
string(30) "test1 $1 test2 \\1 test3 \${1}"
string(66) "--test (PHP string $-escaped and \-escaped again for preg_replace)"
string(35) "test1 \$1 test2 \\\\1 test3 \\\${1}"
string(59) "--result - bar should be replaced with original real string"
string(38) "foo test1 $1 test2 \\1 test3 \${1} baz"
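A different route, not the one taken in the code above, is to sidestep replacement-string parsing entirely with preg_replace_callback(): the string returned by the callback is inserted as is, so no escaping for preg is needed. A minimal sketch using the same test string:

```php
<?php

// Same real string as above: test1 $1 test2 \\1 test3 \${1}
$literal = 'test1 $1 test2 \\\\1 test3 \\${1}';

$result = preg_replace_callback(
    '/bar/',
    // The return value is used verbatim; "$1" and "\1" are not expanded.
    function (array $matches) use ($literal) {
        return $literal;
    },
    'foo bar baz'
);

// Check that "bar" was replaced with the unmodified literal string.
var_dump($result === 'foo ' . $literal . ' baz');
```

This costs a function call per match, but it removes the second layer of escaping altogether.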

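My feeling is that preg_quote() should be the solution here, and it would be if preg_replace() stripped the backslash from every escaped character in the replacement, not just from \ itself and $. However, it doesn't: preg_quote() also escapes characters such as +, and the resulting \+ is copied into the output with its backslash intact, forcing one to do the manual escaping. In fact, I would argue that this is a bug, and I will pursue filing it as such on php.net.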
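To make the preg_quote() problem concrete, here is a short sketch with a made-up sample string. The $ comes out right, but the backslash that preg_quote() puts in front of + survives into the result:

```php
<?php

$literal = 'a+b costs $1';

// preg_quote() is meant for patterns, so it escapes "+" as well as "$".
$quoted = preg_quote($literal);
var_dump($quoted);   // a\+b costs \$1

// In the replacement, "\$" collapses to "$", but "\+" is left untouched.
$result = preg_replace('/bar/', $quoted, 'foo bar baz');
var_dump($result);   // foo a\+b costs $1 baz
```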

mr. w 2010-09-30 23:46:04

I've filed a bug - [(#52962)](http://bugs.php.net/bug.php?id=52962).

mr. w 2010-10-01 01:01:53
