I'm not sure if one of these is 'better' than the other, or why it would be, but I've got an original string that looks like this:

$string = '/random_length_user/file.php';

Now, there are two ways to match it: the first uses my new friend, the look-behind, and the second does without it:

preg_match("%(?<=^/)[^/]*%", $string, $capture);
preg_match("%^/([^/]*)%", $string, $capture);

They return, in order:

Array
(
    [0] => random_length_user
)
Array
(
    [0] => /random_length_user
    [1] => random_length_user
)

Essentially I get the result I want in $capture[0] using look-behind, and in $capture[1] without. Now the question is ... is there a reason to prefer one of these methods over the other?

+3  A: 

It probably doesn't make a difference with preg_match, but it will matter when using preg_replace, as it affects what will be replaced.
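To make the preg_replace point concrete, here is a minimal sketch using the string from the question. The replacement text 'bob' and the variable names are made up for illustration.

```php
<?php
$string = '/random_length_user/file.php';

// Lookbehind: the leading slash is only asserted, not matched,
// so it survives the replacement.
$withLookbehind = preg_replace('%(?<=^/)[^/]*%', 'bob', $string);

// Capturing group: the whole match (slash included) is replaced,
// so the leading slash is lost unless you put it back yourself.
$withGroup = preg_replace('%^/([^/]*)%', 'bob', $string);

echo $withLookbehind, "\n"; // /bob/file.php
echo $withGroup, "\n";      // bob/file.php
```

With the capturing-group version you would have to write the slash back into the replacement (for example '/bob') to get the same result.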

It might also be an issue when you do a global match, because text matched by a capturing group is consumed as part of the match, while text matched inside a lookaround is not.

Trivial example:

  • /(?<=a)a/g with 'aaaa' gives Array('a', 'a', 'a')
  • /(a)a/g with 'aaaa' gives Array('aa', 'aa')
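PHP has no /g flag, so the closest equivalent of the trivial example is preg_match_all. A rough sketch (the variable names are just for illustration):

```php
<?php
// Lookbehind: the preceding 'a' is not consumed, so matches can sit
// right next to each other (positions 1, 2 and 3).
preg_match_all('/(?<=a)a/', 'aaaa', $lookbehind);

// Capturing group: each match consumes two characters.
preg_match_all('/(a)a/', 'aaaa', $group);

print_r($lookbehind[0]); // three matches: a, a, a
print_r($group[0]);      // two matches: aa, aa
```

Index 0 of the result holds the full matches; with the capturing-group pattern, $group[1] holds the captured 'a' from each match.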
K Prime
+1  A: 

The problem is that the lookbehind approach is not as flexible; it falls down when you start dealing with variable-length matches. For example, suppose you wanted to extract the file name in your example, and you didn't know the name of the directory. The capturing-group technique still works fine:

preg_match("%^/\w+/([^/]*)%", '/random_length_user/file.php', $capture);

Array
(
    [0] => /random_length_user/file.php
    [1] => file.php
)

...but the lookbehind approach doesn't, because lookbehind expressions can only match a fixed number of characters. However, there's an even better alternative: \K, the MATCH POINT RESET operator. Wherever you put it, the regex engine pretends the match really started there. So you get the same result as you would with a lookbehind, without the fixed-length limitation:

preg_match('%^/\w+/\K[^/]+$%', '/random_length_user/file.php', $capture);

Array
(
    [0] => file.php
)
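The same trick carries over to preg_replace, where everything before \K is left untouched. A small sketch, assuming you wanted to swap the file name for a hypothetical 'index.php':

```php
<?php
$path = '/random_length_user/file.php';

// Everything before \K has to match, but is not part of the
// reported match, so only the file name gets replaced.
$renamed = preg_replace('%^/\w+/\K[^/]+$%', 'index.php', $path);

echo $renamed, "\n"; // /random_length_user/index.php
```

The directory part can be any length here, which is exactly the case a lookbehind can't handle.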

As far as I know, this feature is only available in Perl 5.10+ and in tools (like PHP's preg_ functions) that are powered by the PCRE library. For the PCRE reference, see the manpage and search (F3) for \K.

Alan Moore