I'm writing a WordPress plugin, and one of the features is removing duplicate whitespace.
My code looks like this:
return preg_replace('/\s\s+/u', ' ', $text, -1, $count);
I don't understand why I need the u modifier. I've seen other plugins that use preg_replace and don't need to modify it for Unicode. I believe I have a default installation of WordPress.
Without the modifier, the code replaces all the spaces with Unicode replacement glyphs instead of spaces.
With the u modifier, I don't get the glyphs, but it still doesn't replace all the whitespace.
Each gap below contains between 1 and 10 spaces. The regex removes only one space from each group.
Before:
This sentence has extra space. This doesn’t. Extra space, Lots of extra space.
After:
This sentence has extra space. This doesn’t. Extra space, Lots of extra space.
$count = 9
How can I make the regex replace the whole match with a single space?
Update: If I try this with plain PHP, it works fine:
$new_text = preg_replace('/\s\s+/', ' ', $text, -1, $count);
It only breaks when I use it within the WordPress plugin. I'm using this function in a filter:
function jje_test( $text ) {
    $new_text = preg_replace('/\s\s+/', ' ', $text, -1, $count);
    echo "Count: $count";
    return $new_text;
}
add_filter('the_content', 'jje_test');
I have tried:
- Removing all other filters on the_content: remove_all_filters('the_content');
- Changing the priority of the filter added to the_content, earlier or later
- All kinds of permutations of \s+, \s\s+, [ ]+, etc.
- Even replacing all single spaces with an empty string; the spaces are still not replaced
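Since none of the patterns above touch the spaces, it may help to check what bytes the "spaces" actually are. The following is only a minimal diagnostic sketch. The helper name jje_whitespace_runs_as_hex and the sample string are hypothetical, and the guess that the content might contain UTF-8 no-break spaces (U+00A0, bytes c2 a0) is an assumption, not something confirmed here.

```php
<?php
// Hypothetical debugging helper, not part of the original plugin.
// Returns each distinct run of whitespace-like bytes in $text as a hex string,
// so invisible characters such as U+00A0 (bytes c2 a0) become visible.
function jje_whitespace_runs_as_hex( $text ) {
    // [ \t\r\n] covers plain ASCII whitespace; \xC2\xA0 is a UTF-8 no-break space.
    // No u modifier here: the pattern deliberately works on raw bytes.
    preg_match_all( '/(?:[ \t\r\n]|\xC2\xA0)+/', $text, $matches );
    return array_values( array_unique( array_map( 'bin2hex', $matches[0] ) ) );
}

// Assumed sample input: two no-break spaces between words, then one plain space.
$sample = "extra\xC2\xA0\xC2\xA0space here";
print_r( jje_whitespace_runs_as_hex( $sample ) );
```

Calling the helper on $text inside the the_content filter and logging the result would show whether the runs are plain 20 bytes or something else, such as c2a0 sequences.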