I came to know PHP after Perl, so when I first found the preg_* functions I basically just used those. Later I read that str_replace() is faster when dealing with literal text. So my question is: can't preg_replace() be as efficient as str_replace() when the search pattern does not use special characters? Maybe by just analyzing the pattern to choose between the regex and plain-text algorithms?

A: 

preg_replace() is good for complicated text replacements (for example, turning a plain-text URL into an actual link). str_replace() is for simple things like swapping words (for example, a word filter).

If the variable parts of the text can be matched by a pattern, use preg_replace(); otherwise try str_replace() first.

That said, trying to reproduce with str_replace() what preg_replace() does is actually slower if you're doing complicated stuff.

joe

Joe Simpson
+5  A: 

In theory yes, you're right. It is possible the PHP team could jigger preg_replace to analyze the pattern being passed in and then use the code for str_replace if it didn't see any meta-characters. Assuming the analysis wasn't too heavy, this might yield better performance results.
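To make the idea concrete, here is a rough sketch of what such a dispatch could look like in userland. The function name `smart_replace` is hypothetical (nothing like it exists in PHP), and it only handles the simplest case: `/` delimiters, no modifiers, a non-empty pattern body, and a replacement without backreference syntax. Anything else falls through to preg_replace().

```php
<?php
// Hypothetical helper, not part of PHP: use str_replace() when the pattern is
// provably a plain literal, otherwise defer to preg_replace().
function smart_replace(string $pattern, string $replacement, string $subject): ?string
{
    // Accept only '/body/' with a non-empty body and no trailing modifiers.
    if (preg_match('~^/(.+)/\z~s', $pattern, $m) === 1) {
        $body = $m[1];
        // strpbrk() returns false when none of the listed characters occur.
        $literalBody = strpbrk($body, '\\^$.[]|()?*+{}/') === false;
        // '$1' or '\1' in the replacement would be a backreference for preg_replace().
        $literalReplacement = strpbrk($replacement, '\\$') === false;
        if ($literalBody && $literalReplacement) {
            return str_replace($body, $replacement, $subject);
        }
    }
    // Regex path: anything the check above could not prove to be literal.
    return preg_replace($pattern, $replacement, $subject);
}

echo smart_replace('/apple/', 'pear', 'A red apple'), "\n";            // literal path
echo smart_replace('/(\w+) apple/', '$1 pear', 'A red apple'), "\n";   // regex path
```

Even this toy version has to special-case delimiters, modifiers, empty patterns and the replacement string, which hints at why the analysis is less trivial than it first looks.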

However, the way the PHP source code (that is, the code used to implement PHP) is organized doesn't lend itself well to this sharing. PHP is (in some ways) less a full language and more a collection of modules.

So, initially the PHP group chose to stay away from this kind of cross-module pollination. At this point, changing the preg_replace function to do that kind of analysis would risk breaking a lot of code, and the performance improvements would be minuscule.

Finally, the analysis itself is a harder problem to solve than you'd think. Tell me, does this pattern

 '/123/'

mean I should search for the literal text

123

or the literal text

/123/

It's easy to come up with compelling arguments for either interpretation, which introduces an additional level of confusion into using the function.
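A small sketch of the two readings, using the existing functions on a made-up subject string (the variable names are only for illustration):

```php
<?php
$subject = 'id=/123/';

// Regex reading: the slashes are delimiters, so only "123" is matched.
$asRegex = preg_replace('/123/', 'X', $subject);

// Literal reading: the slashes are part of the search text.
$asLiteral = str_replace('/123/', 'X', $subject);

echo $asRegex, "\n";   // id=/X/
echo $asLiteral, "\n"; // id=X
```

The same first argument gives two different results, so a merged function would have to pick one reading and surprise everyone who expected the other.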

An interesting idea in theory, but in practice and the context of the PHP universe, it creates far more problems than it solves.

Alan Storm
It's worth noting that PHP is not the only language that separates normal string functions from regexp string functions. JavaScript does it, SQL implementations do it, Java does it, etc. Perl seems to be the exception here.
zombat
In your example the literal text is clearly `/123/` as the single quotes are the pattern delimiters :)
kemp
Single quotes are string delimiters.
el.pescado
@kemp exactly ;)
Alan Storm
+3  A: 

Maybe just analyzing the pattern to choose between regex and plain text algorithms?

I'd rather not be forced to escape everything that has special meaning in regular expressions every time I just want to replace some substrings.
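To illustrate the escaping burden, here is a minimal sketch with a made-up subject string. With a regex-only function, every literal search would need to go through preg_quote() first:

```php
<?php
$subject = 'v1.5 and v105';
$literal = '1.5';

// Plain substring replacement: the dot is just a dot.
$viaStr = str_replace($literal, '2.0', $subject);

// Same text used as a regex without escaping: the dot matches any character.
$unescaped = preg_replace('/' . $literal . '/', '2.0', $subject);

// Escaped with preg_quote() (second argument is the delimiter to escape as well).
$escaped = preg_replace('/' . preg_quote($literal, '/') . '/', '2.0', $subject);

echo $viaStr, "\n";    // v2.0 and v105
echo $unescaped, "\n"; // v2.0 and v2.0
echo $escaped, "\n";   // v2.0 and v105
```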

Michael Borgwardt
A: 

I guess the difference in speed comes down to the overhead the regex parser/engine adds compared to how the str_* functions operate. But I'm just guessing here. When in doubt, benchmark and see whether it can be faster or the same speed :)
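A quick-and-dirty benchmark could look something like the sketch below. The subject text, search word and iteration count are arbitrary choices, and hrtime() needs PHP 7.3 or later. Absolute numbers will vary by machine and PHP version, so treat the output as a rough indication only.

```php
<?php
// Arbitrary test data: roughly 45 KB of repeated text.
$subject = str_repeat('The quick brown fox jumps over the lazy dog. ', 1000);
$iterations = 1000;

// Time the plain substring replacement.
$start = hrtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $viaStr = str_replace('lazy', 'energetic', $subject);
}
$strNs = hrtime(true) - $start;

// Time the equivalent regex replacement with a metacharacter-free pattern.
$start = hrtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $viaPreg = preg_replace('/lazy/', 'energetic', $subject);
}
$pregNs = hrtime(true) - $start;

// Both approaches must produce the same text for the comparison to be fair.
var_dump($viaStr === $viaPreg);
printf("str_replace:  %.2f ms\n", $strNs / 1e6);
printf("preg_replace: %.2f ms\n", $pregNs / 1e6);
```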

There is a lengthy and detailed article about Regular Expression Matching Speed and Wikipedia has some info about Implementations and Running times and a Comparison of Regular Expression engines.

Gordon
A: 

Despite their similarities, the two functions are quite different and thus not interchangeable. For example, the replacement in preg_replace() can contain backreferences to text captured by the regular expression:

preg_replace ('/(\w+) apple/', '$1 pear', 'A red apple'); // => 'A red pear'
el.pescado
A: 

This is how it works in JavaScript:

alert("a.b".replace(".", "X")) // aXb
alert("a.b".replace(/./, "X")) // X.b

That is, one function can accept both substrings and special regexp literals. Regexp literals are extremely handy, and the whole string library can be made smaller and more flexible (think of one single split instead of "explode" and "preg_split", pos instead of "strpos" and "preg_match", etc.).

That being said, I highly doubt regexp literals will be added to PHP any time soon.

stereofrog
A: 

Maybe just analyzing the pattern to choose between regex and plain text algorithms?

This alone would reduce performance. Also, preg_*() functions use a library that isn't necessary for simpler string operations.

Jordan Ryan Moore
Not really. Analyzing the pattern is an operation done once on a (generally) very small entity (the search pattern), and it doesn't scale in complexity with the size of the text to be searched. In fact, as far as I know, all string matching algorithms analyze the pattern as their first step.
kemp
A: 

It is not possible to replace str_replace() with preg_replace(), because the function could not tell whether I want pattern matching or a plain string replacement. It would be possible if the function accepted an additional parameter, but that would introduce an incompatibility with old code.

Changing preg_replace() so that it recognizes a plain string replacement would not make it faster either. It would have to inspect the string passed as the pattern and work out that I am asking to replace one string with another; that check takes time that could instead be spent on the pattern matching itself.

kiamlaluno