views:

1142

answers:

3

It was hinted in a comment to an answer to this question that PHP can not reverse Unicode strings.

As for Unicode, it works in PHP because most apps process it as binary. Yes, PHP is 8-bit clean. Try the equivalent of this in PHP: perl -Mutf8 -e 'print scalar reverse("ほげほげ")' You will get garbage, not "げほげほ". – jrockway

And unfortunately it is correct that PHPs unicode support atm is at best "lacking". This will hopefully change drastically with PHP6.

PHPs MultiByte functions does provide the basic functionality you need to deal with unicode, but it is inconsistent and does lack a lot of functions. One of these is a function to reverse a string.

I of course wanted to reverse this text for no other reason then to figure out if it was possible. And I made a function to accomplish this enormous complex task of reversing this Unicode text, so you can relax a bit longer until PHP6.

Test code:

$enc = 'UTF-8';
$text = "ほげほげ";
$defaultEnc = mb_internal_encoding();

echo "Showing results with encoding $defaultEnc.\n\n";

$revNormal = strrev($text);
$revInt = mb_strrev($text);
$revEnc = mb_strrev($text, $enc);

echo "Original text is: $text .\n";
echo "Normal strrev output: " . $revNormal . ".\n";
echo "mb_strrev without encoding output: $revInt.\n";
echo "mb_strrev with encoding $enc output: $revEnc.\n";

if (mb_internal_encoding($enc)) {
    echo "\nSetting internal encoding to $enc from $defaultEnc.\n\n";

    $revNormal = strrev($text);
    $revInt = mb_strrev($text);
    $revEnc = mb_strrev($text, $enc);

    echo "Original text is: $text .\n";
    echo "Normal strrev output: " . $revNormal . ".\n";
    echo "mb_strrev without encoding output: $revInt.\n";
    echo "mb_strrev with encoding $enc output: $revEnc.\n";

} else {
    echo "\nCould not set internal encoding to $enc!\n";
}
+4  A: 

The answer

function mb_strrev($text, $encoding = null)
{
    $funcParams = array($text);
    if ($encoding !== null)
        $funcParams[] = $encoding;
    $length = call_user_func_array('mb_strlen', $funcParams);

    $output = '';
    $funcParams = array($text, $length, 1);
    if ($encoding !== null)
        $funcParams[] = $encoding;
    while ($funcParams[1]--) {
         $output .= call_user_func_array('mb_substr', $funcParams);
    }
    return $output;
}
OIS
If I didn't know any better I'd say that you posted the question and answer just for the points as both were posted at the same time.
Evan Fosmark
I posted this because 1. This is supposed to be a FAQ site with all questions answered. 2. It would save someone else the trouble. 3. Im tired of all the "You cant do this with PHP". Some of them are true, but not most.
OIS
I am however wondering if this can be made community wiki? Im not sure what the "community wiki" is for.
OIS
this is what community wiki is for.
Tim Howland
Aye found it. http://stackoverflow.com/questions/128434/what-are-community-wiki-posts-in-stackoverflow
OIS
Yeah, it would be really scary if you gained points for helping php programmers gain time when they stumble across this problem.
PEZ
This function is very inefficient. You can do this linear time.
Artefacto
+1  A: 

Here's another way. This seems to work without having to specify an output encoding (tested with a couple of different mb_internal_encodings):

function mb_strrev($text)
{
    return join('', array_reverse(
        preg_split('~~u', $text, -1, PREG_SPLIT_NO_EMPTY)
    ));
}
searlea
+1  A: 

here's another approach using regex:

function utf8_strrev($str){
 preg_match_all('/./us', $str, $ar);
 return implode(array_reverse($ar[0]));
}
Fixtree