I have this class in a UTF-8 encoded file called EnUTF8.Class.php:

class EnUTF8 {

    public function ñññ() {

        return 'ñññ()';

    }
}

and in another UTF-8 encoded file:

require_once('EnUTF8.Class.php');
require_once('OneBuggy.Class.php');

$utf8 = new EnUTF8();
//$buggy = new OneBuggy();

echo (method_exists($utf8, 'ñññ')) ? 'ñññ() exists!' : 'ñññ() does not exist...';

echo "\n\n----------------------------------\n\n";

print_r(get_class_methods($utf8));

echo "\n----------------------------------\n\n";

echo $utf8->ñññ();

that produces the expected result:

ñññ() exists!

----------------------------------

Array
(
    [0] => ñññ
)

----------------------------------

ñññ()

but if...

require_once('EnUTF8.Class.php');
require_once('OneBuggy.Class.php');

$utf8 = new EnUTF8();
$buggy = new OneBuggy();

echo (method_exists($utf8, 'ñññ')) ? 'ñññ() exists!' : 'ñññ() does not exist...';

echo "\n\n----------------------------------\n\n";

print_r(get_class_methods($utf8));

echo "\n----------------------------------\n\n";

echo $utf8->ñññ();

then the weirdness appears!!!:

ñññ() does not exist...

----------------------------------

Array
(
    [0] => ñññ
)

----------------------------------

Fatal error: Call to undefined method EnUTF8::ñññ() in /var/www/test.php on line 16

Well, the thing is that OneBuggy.Class.php is UTF-8 encoded too and shares absolutely nothing with EnUTF8.Class.php, so...

Where is the bug?

UPDATED:

Well, after a long debugging session I found this in the OneBuggy.Class.php constructor:

setlocale (LC_ALL, "es_ES@euro", "es_ES", "esp");

so I did...

//setlocale (LC_ALL, "es_ES@euro", "es_ES", "esp");

and now it works, but why?

+1  A: 

If you are working with PHP 5.x, you should not use UTF-8 names for your variables, classes, functions and so on: for some characters it will work in some cases, but in general it will not.

Note that this applies to identifiers, but you will hit the same problem with the content of variables: for example, to manipulate UTF-8 strings you have to use the mb_* family of functions.
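To illustrate the point about string contents, here is a small sketch that assumes the mbstring extension is loaded. The byte-oriented strlen() counts the six bytes of 'ñññ', while mb_strlen() with an explicit 'UTF-8' encoding counts three characters. The string is written with byte escapes so the result does not depend on how the file itself is saved.

```php
<?php
// 'ñññ' spelled out as UTF-8 bytes (each ñ is 0xC3 0xB1).
$s = "\xC3\xB1\xC3\xB1\xC3\xB1";

$byteLength = strlen($s);              // counts bytes
$charLength = mb_strlen($s, 'UTF-8');  // counts characters (needs mbstring)
$upper      = mb_strtoupper($s, 'UTF-8');

echo $byteLength, " bytes, ", $charLength, " characters\n"; // 6 bytes, 3 characters
echo bin2hex($upper), "\n";                                 // c391c391c391, i.e. 'ÑÑÑ'
```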


This is because PHP 5.x does not really use Unicode: that is the big feature planned for PHP 6 (which is not even in alpha stage yet).

Pascal MARTIN
+1  A: 

Re your update, I think it goes in this direction:

With setlocale(), among other things, you set

LC_CTYPE for character classification and conversion, for example strtoupper()

method_exists() is case insensitive, so some case conversion must take place inside it. I bet the string breaks at that point. Why it would break when you explicitly set the Spanish locale, but not when you don't, I don't understand, though.

Is there a specific Spanish rule for uppercasing ñ other than making it Ñ? Is it possible to lowercase ñ?
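One way the case conversion could break the name, sketched under two assumptions that nothing in this thread confirms: that "es_ES" resolves to a single-byte ISO-8859-1 locale on that machine, and that PHP lowercases method names byte by byte according to the current locale. latin1_lower_bytes() below is a hypothetical helper that only imitates such a conversion; it is not a PHP function.

```php
<?php
// Hypothetical helper: lowercases a byte string one byte at a time,
// the way a single-byte ISO-8859-1 locale would (assumption, see above).
function latin1_lower_bytes($s) {
    $out = '';
    for ($i = 0, $n = strlen($s); $i < $n; $i++) {
        $b = ord($s[$i]);
        $isAsciiUpper  = ($b >= 0x41 && $b <= 0x5A);                // A-Z
        $isLatin1Upper = ($b >= 0xC0 && $b <= 0xDE && $b !== 0xD7); // À-Þ, except ×
        $out .= chr(($isAsciiUpper || $isLatin1Upper) ? $b + 0x20 : $b);
    }
    return $out;
}

$name = "\xC3\xB1\xC3\xB1\xC3\xB1"; // 'ñññ' as UTF-8 bytes

echo bin2hex($name), "\n";                     // c3b1c3b1c3b1
echo bin2hex(latin1_lower_bytes($name)), "\n"; // e3b1e3b1e3b1
```

In this sketch the lead byte 0xC3 of each ñ is read as the Latin-1 letter Ã and turned into ã (0xE3), so the "lowercased" name no longer matches the original bytes. If something like this happens inside PHP, a method registered before setlocale() ran would be stored under the unchanged bytes and looked up afterwards under the altered ones, which would fit the symptoms in the question.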

It could also be that the Spanish locale the function is trying to switch to isn't installed on your system at all, and that the fallback locale is different from the one PHP uses by default.
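A quick way to check which locale actually ends up active is to look at the return value of setlocale(): it returns the new locale string on success and false when none of the requested locales could be set. The sketch below reuses the locale names from the question; describe_locale_switch() is a hypothetical helper name, and the output depends entirely on which locales are installed on the machine.

```php
<?php
// Hypothetical helper: tries the locales from the question and reports the result.
function describe_locale_switch() {
    // Passing "0" only queries the current setting without changing it.
    $before = setlocale(LC_CTYPE, "0");

    // Each name is tried in order; the first one the system knows is used.
    $result = setlocale(LC_ALL, "es_ES@euro", "es_ES", "esp");

    if ($result === false) {
        return "No requested locale is installed; LC_CTYPE is still " . $before;
    }
    return "Switched to " . $result . " (LC_CTYPE was " . $before . ")";
}

echo describe_locale_switch(), "\n";
```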

Pekka