I use an encoded string as a key in an array, and also use the same string as a value in that array, as the code below shows:

$string = 'something in some encode';
$list = array();
$list[$string]['name'] = $string;

When I print_r the array (just print_r, without any headers or encoding-specific settings), I found that the key and its 'name' value are not printed as the same string; they seem to have different encodings.

I'm doing this with Chinese characters. My php.ini has no encoding-specific line (I don't know whether that has anything to do with this).

Is there anything special about string encoding in array keys? Or am I just doing it the wrong way? Thanks for your help.

+1  A: 

A key is of type integer or string.

To quote http://de.php.net/manual/de/language.types.string.php

A string is series of characters. Before PHP 6, a character is the same as a byte. That is, there are exactly 256 different characters possible. This also implies that PHP has no native support of Unicode. See utf8_encode() and utf8_decode() for some basic Unicode functionality.

So in your case it makes sense to encode the string used as the key (or only the key, depending on what you want to do): http://de.php.net/manual/de/function.utf8-encode.php

initall
A: 

I don't know whether strings can be encoded correctly for use as array keys, but it is possible to use variable names such as:

  • $élement = 'foo';
  • $garçon = 'bar';

(note the ç and the é)

It is not recommended. You should not rely on this.

You would probably be better off mapping to plain English names or using numeric indexes.

For UTF-8 encoding, take a look at the PHP manual.

Boris Guéry
A: 

Hi CNBorn,

I tried it in Japanese (which is what I can test):

$test["要"]["name"] = "要";
print_r($test);

And the result was fine, as expected. I'm using UTF-8 for everything. I'm not sure whether the problem is with your encoding settings (in php.ini) or with the encoding you are using. If that is the problem, why don't you try encoding the key with base64 (or another ASCII-safe encoder)? That would look something like:

$test["6KaB"]["name"] = "要";

I'm not sure what your goal is, so let me know if this was useful.

lepe
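A minimal sketch of the base64 idea above, assuming PHP 7 or later. The helper names encodeKey and decodeKey are hypothetical and not from the thread. The sample string is written as raw UTF-8 bytes so the result does not depend on how the source file is saved. Because base64 is reversible, the original bytes can be recovered from the key:

```php
<?php
// Hypothetical helper: turn any byte string into an ASCII-only array key.
function encodeKey(string $raw): string
{
    return base64_encode($raw);
}

// Hypothetical helper: recover the original bytes from such a key.
function decodeKey(string $key): string
{
    $raw = base64_decode($key, true); // strict mode returns false on invalid input
    if ($raw === false) {
        throw new InvalidArgumentException('Key is not valid base64');
    }
    return $raw;
}

$string = "\xE8\xA6\x81"; // the UTF-8 bytes of "要"
$list = array();
$list[encodeKey($string)]['name'] = $string;

foreach ($list as $key => $entry) {
    // PHP casts keys that look like decimal integers to int, so cast back
    // to string before decoding.
    $original = decodeKey((string) $key);
    echo $key, ' => ', bin2hex($original), PHP_EOL;
}
```

A one-way hash such as md5 would also give an ASCII key, but the original string could then only be read from the stored value, not from the key.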
One-way encoded keys wouldn't be very useful for him, and they take more time to compute than simply adding the string via `$test[] = `. It is better to encode the strings properly in the first place, as key length isn't limited.
initall
This is helpful, thanks! I still don't know what causes the problem, but for people doing this my way: use base64/md5/sha values as array keys instead of the encoded strings themselves. That way you won't run into the 'multiple encodings' problem within an array. I still don't know whether this is related to php.ini. If I have time, I'll try it out.
CNBorn
Good to know it was useful. Another idea is to convert your strings to UTF-8 before using them as keys (in case they are not UTF-8, which is my guess). Have a nice day.
lepe
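The conversion suggested in the last comment could look roughly like this. It is only a sketch: it assumes the mbstring extension is loaded, and it assumes the incoming data is in CP936 (GBK), which the thread never confirms. The legacy input is simulated here by converting a known UTF-8 string first:

```php
<?php
$utf8 = "\xE8\xA6\x81"; // the UTF-8 bytes of "要"

// Simulate input arriving in a legacy encoding (CP936 is an assumption).
$legacy = mb_convert_encoding($utf8, 'CP936', 'UTF-8');

// Normalize once, then use the same normalized string as both key and value.
$normalized = mb_convert_encoding($legacy, 'UTF-8', 'CP936');

$list = array();
$list[$normalized]['name'] = $normalized;

echo 'legacy bytes:     ', bin2hex($legacy), PHP_EOL;
echo 'normalized bytes: ', bin2hex($normalized), PHP_EOL;
```

The source encoding has to be known in advance. Guessing it from the bytes is unreliable, because a short legacy-encoded string can also happen to be valid UTF-8.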
A: 

Are you viewing it through your browser? Then you need to specify the encoding:

header('Content-Type: text/plain; charset=UTF-8'); // or BIG5, or whatever

Are you viewing it in your terminal? Make sure your terminal settings are set to that same encoding.

janmoesen
Thanks, I have tried this, but it's not working. I don't need the characters to be readable; I need them to be exactly the same UTF-8 encoding so that I can encode them to JSON. As I said, I can tell the characters are not in the same encoding even without a proper encoding setting to read them, because strings that should be identical are displayed differently.
CNBorn
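One way to check whether the key and the value really differ, independent of how a browser or terminal renders them, is to compare their raw bytes. The sketch below assumes the string is UTF-8 (written as explicit bytes here) and also runs the array through json_encode, since that is the stated goal. PHP stores string keys byte for byte, so a key assigned from the same variable as the value should come out identical:

```php
<?php
$string = "\xE8\xA6\x81"; // the UTF-8 bytes of "要"
$list = array();
$list[$string]['name'] = $string;

foreach ($list as $key => $entry) {
    $key = (string) $key; // integer-like keys are stored as int
    echo 'key bytes:  ', bin2hex($key), PHP_EOL;
    echo 'name bytes: ', bin2hex($entry['name']), PHP_EOL;
    echo 'identical:  ', ($key === $entry['name'] ? 'yes' : 'no'), PHP_EOL;
}

// json_encode requires valid UTF-8 and returns false otherwise.
$json = json_encode($list);
if ($json === false) {
    echo 'JSON error: ', json_last_error_msg(), PHP_EOL;
} else {
    echo $json, PHP_EOL;
}
```

If the two hex dumps match, the difference seen in print_r output comes from how the output is displayed rather than from the array itself.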