In another question, there are the following lines:

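$value = 'x-Cem-Date:Wed, 16 Dec 2009 15:42:28 GMT';
// Note: the /e modifier was deprecated in PHP 5.5 and removed in PHP 7; preg_replace_callback() does the same job.
$value = preg_replace_callback('/^.+?(?=:)/', function ($m) { return strtolower($m[0]); }, $value);
// yields 'x-cem-date:Wed, 16 Dec 2009 15:42:28 GMT'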

That (?=:) bit must indicate a search for the colon. But I don't understand that particular syntax, with the ?=. What exactly is going on there?

+4  A: 

That's a positive lookahead. It checks whether the given subexpression matches immediately after the current position, but it doesn't consume any characters in the match:

Positive lookahead works just the same. q(?=u) matches a q that is followed by a u, without making the u part of the match. The positive lookahead construct is a pair of round brackets, with the opening bracket followed by a question mark and an equals sign. —RegularExpressions.info
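To make the "matches without consuming" behaviour concrete, here is a minimal sketch. It uses Python's re module rather than PHP (an assumption made for illustration; the (?=...) syntax is the same in both), and the variable names are hypothetical. It shows the q(?=u) case from the quote and then the header example from the question:

```python
import re

# Positive lookahead: match a 'q' only when a 'u' follows,
# without making the 'u' part of the match.
found = re.search(r"q(?=u)", "quit")
matched_text = found.group(0)   # only the 'q'
matched_span = found.span()     # one character wide, the 'u' is not included

# The question's case: lowercase everything before the first colon.
# The lookahead (?=:) stops the lazy .+? right before the colon,
# so the colon itself is never replaced.
value = "x-Cem-Date:Wed, 16 Dec 2009 15:42:28 GMT"
result = re.sub(r"^.+?(?=:)", lambda match: match.group(0).lower(), value)

print(matched_text, matched_span)
print(result)
```

Because the colon is only looked at and never consumed, the replacement function receives just the header name and the rest of the string is left untouched.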

As you may notice, lookaround is especially helpful when replacing text, since you don't need to include the surrounding context in the replacement text. For example, to replace every q that is not followed by a u with qu, you can do:

replace 'q([^u])' by 'qu\1'

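This works, but the following character has to be captured because it is part of the match, and the replacement then has to re-insert it. It also misses a q at the very end of the input, where there is no following character for [^u] to match. You can use lookaround instead: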

replace 'q(?!u)' by 'qu'

where only the q gets matched and replaced, so nothing from the match needs to be carried over into the replacement string.
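The two replacements can be compared side by side. This is a rough sketch in Python's re module (chosen for illustration, not the PHP of the question); the sample strings are made up. It also shows one place where the two patterns differ: a q at the very end of the input.

```python
import re

text = "Iraq qat quit"

# Capture-and-reinsert version: the character after q is consumed,
# so it must be put back with \1.
with_capture = re.sub(r"q([^u])", r"qu\1", text)

# Negative lookahead version: only the q itself is matched.
with_lookahead = re.sub(r"q(?!u)", "qu", text)

# Edge case: a q at the end of the string has no following character,
# so [^u] cannot match, while (?!u) succeeds.
end_capture = re.sub(r"q([^u])", r"qu\1", "Iraq")
end_lookahead = re.sub(r"q(?!u)", "qu", "Iraq")

print(with_capture, "|", with_lookahead)
print(end_capture, "|", end_lookahead)
```

On the first input both approaches give the same result; only the end-of-string case separates them, which is one more reason to prefer the lookahead form here.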

Joey
Ah yes. It "matches without including". Thanks for putting a name to it.
Derek Illchuk