How do I, using preg_replace, replace more than one underscore with just one underscore?
The + quantifier matches one or more occurrences of the preceding character (or group).
$string = preg_replace('/_+/', '_', $string);
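As a quick check of the line above, here is a minimal sketch. The function name collapse_underscores is made up for this example and is not part of PHP.

```php
<?php
// Hypothetical helper name; it only wraps the preg_replace call shown above.
function collapse_underscores(string $s): string
{
    // '_+' matches each run of one or more underscores;
    // every run is replaced by a single '_'.
    return preg_replace('/_+/', '_', $s);
}

echo collapse_underscores('foo___bar__baz_qux'), "\n"; // foo_bar_baz_qux
```

Single underscores are matched too, but replacing '_' with '_' leaves them unchanged.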
I don't know why you want to use preg_replace, but what's wrong with:
$string = str_replace('__', '_', $string);
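One thing to watch with this approach: a single str_replace() pass only replaces non-overlapping pairs, so runs of three or more underscores are shortened but not fully collapsed. A small sketch (single_pass is a name invented for this example):

```php
<?php
// Hypothetical helper: one str_replace() pass, with no repetition.
function single_pass(string $s): string
{
    return str_replace('__', '_', $s);
}

echo single_pass('a__b'), "\n";   // a_b
echo single_pass('a___b'), "\n";  // a__b  (the third underscore is left over)
echo single_pass('a____b'), "\n"; // a__b  (two pairs become two underscores)
```

To collapse longer runs completely, the call has to be repeated until no '__' remains in the string.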
Running tests, I found this:
while (strpos($str, '__') !== false) {
    $str = str_replace('__', '_', $str);
}
to be consistently faster than this:
$str = preg_replace('/[_]+/', '_', $str);
I generated the test strings of varying lengths with this:
$chars = array_merge(array_fill(0, 50, '_'), range('a', 'z'));
$str = '';
for ($i = 0; $i < $len; $i++) { // $len varied from 10 to 1000000
    $str .= $chars[array_rand($chars)];
}
file_put_contents('test_str.txt', $str);
and tested with these scripts (run separately, but on identical strings for each value of $len):
$str = file_get_contents('test_str.txt');
$start = microtime(true);
$str = preg_replace('/[_]+/', '_', $str);
echo microtime(true) - $start;
and:
$str = file_get_contents('test_str.txt');
$start = microtime(true);
while (strpos($str, '__') !== false) {
    $str = str_replace('__', '_', $str);
}
echo microtime(true) - $start;
For shorter strings the *str_replace()* method was as much as 25% faster than the *preg_replace()* method. The longer the string, the smaller the difference, but *str_replace()* was always faster.
I know some would prefer one method over the other for reasons other than speed, and I'd be glad to read comments regarding the results, testing method, etc.
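For anyone who wants to repeat the comparison, here is a rough sketch that puts both methods in one script and first checks that they produce the same result. This is not the setup used for the numbers above (those scripts were run separately on generated files). The function names and the sample string are assumptions made for this example, and timings will vary by machine and PHP version.

```php
<?php
// Hypothetical helpers, named for this sketch only.
function collapse_with_preg(string $s): string
{
    return preg_replace('/[_]+/', '_', $s);
}

function collapse_with_str_replace(string $s): string
{
    // Repeat until no double underscore is left.
    while (strpos($s, '__') !== false) {
        $s = str_replace('__', '_', $s);
    }
    return $s;
}

// Returns the elapsed wall-clock seconds for one call of $fn on $input.
function time_call(callable $fn, string $input): float
{
    $start = microtime(true);
    $fn($input);
    return microtime(true) - $start;
}

// Small stand-in input; the tests above used generated strings
// of up to 1,000,000 characters.
$sample = str_repeat('ab____cd__ef_', 1000);

// Both approaches should agree before their timings are compared.
var_dump(collapse_with_preg($sample) === collapse_with_str_replace($sample)); // bool(true)

printf("preg_replace:     %.6f s\n", time_call('collapse_with_preg', $sample));
printf("str_replace loop: %.6f s\n", time_call('collapse_with_str_replace', $sample));
```

A single timed call is noisy, so averaging over many runs would give steadier numbers.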
Actually, using /__+/ or /_{2,}/ would be better than /_+/, since a single underscore does not need to be replaced. This should improve the speed of the preg variant.
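To illustrate the difference, here is a small sketch that uses the optional fifth argument of preg_replace() to count replacements. The variable names are picked for this example. It shows only that fewer replacements are performed, not how much time that saves.

```php
<?php
$input = 'a_b__c___d';

// '/_+/' also matches the lone underscore and replaces it with itself.
$plusResult = preg_replace('/_+/', '_', $input, -1, $plusCount);

// '/_{2,}/' only matches runs of two or more underscores.
$rangeResult = preg_replace('/_{2,}/', '_', $input, -1, $rangeCount);

// '/__+/' is an equivalent way to require at least two underscores.
$doubleResult = preg_replace('/__+/', '_', $input, -1, $doubleCount);

echo $plusResult, ' (', $plusCount, " replacements)\n";     // a_b_c_d (3 replacements)
echo $rangeResult, ' (', $rangeCount, " replacements)\n";   // a_b_c_d (2 replacements)
echo $doubleResult, ' (', $doubleCount, " replacements)\n"; // a_b_c_d (2 replacements)
```

All three patterns give the same output; the last two just skip the no-op replacement of single underscores.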