Hello,

I use JSON to encode an array, and I get a string like this:

{"name":"\u00fe\u00fd\u00f0\u00f6\u00e7"}

Now I need to convert this to ISO-8859-9. I tried the following but it fails:

header('Content-type: application/json; charset=ISO-8859-9');
$json = json_encode($response);
$json = utf8_decode($json);
$json = mb_convert_encoding($json, "ISO-8859-9", "auto");
echo $json;

It doesn't seem to work. What am I missing?

Thank you for your time.

A: 

As the PHP documentation notes, the JSON encoding/decoding functions only work with UTF-8 data, so converting the output to another encoding can corrupt it and give you unexpected or wrong results.

Khriz
After I have the output from json_encode (which is in UTF-8), can't I convert it to ISO-8859-9?
Alec Smart
Yes, but the problem comes after the conversion: if you later decode the JSON in PHP, you can get wrong data. If you don't need to decode it there, for example because you read it in JavaScript, you will get all the data correctly if your page is ISO-8859-9.
Khriz
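To illustrate the UTF-8 requirement mentioned in this answer, here is a minimal sketch. It assumes PHP 5.5 or later, where json_encode returns false on invalid UTF-8 input; the sample byte is an assumption chosen for illustration (0xFE is "ş" in ISO-8859-9, but is not valid UTF-8 on its own).

```php
<?php
// Assumption: a string that is still encoded as ISO-8859-9.
// "\xFE" is the single byte for "ş" in that charset and is not valid UTF-8.
$latin5 = "\xFE";

// json_encode() refuses input that is not valid UTF-8.
$result = json_encode(array('name' => $latin5));

var_dump($result);                                // bool(false)
var_dump(json_last_error() === JSON_ERROR_UTF8);  // bool(true)
```

Checking json_last_error() after encoding is a cheap way to notice that non-UTF-8 strings slipped into the data.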
+2  A: 

You can do:

$json = json_encode($response);
header('Content-type: application/json; charset=ISO-8859-9');
echo mb_convert_encoding($json, "ISO-8859-9", "UTF-8");

This assumes that the strings in $response are in UTF-8. But I would strongly suggest that you just use UTF-8 all the way through.

Edit: Sorry, I just realised that won't work, since json_encode escapes non-ASCII code points as JavaScript \uXXXX escape codes. You'll have to convert these to UTF-8 sequences first. I don't think there is any built-in functionality for that, but you can use a slightly modified variation of this library to get there. Try the following:

function unicode_hex_to_utf8($hexcode) {
  // Each \uXXXX escape is one 16-bit code point, so decode all four hex digits together.
  $arr = array(hexdec($hexcode[1]));
  $dest = '';
  foreach ($arr as $src) {
    if ($src < 0) {
      return false;
    } elseif ( $src <= 0x007f) {
      $dest .= chr($src);
    } elseif ($src <= 0x07ff) {
      $dest .= chr(0xc0 | ($src >> 6));
      $dest .= chr(0x80 | ($src & 0x003f));
    } elseif ($src == 0xFEFF) {
      // nop -- zap the BOM
    } elseif ($src >= 0xD800 && $src <= 0xDFFF) {
      // found a surrogate
      return false;
    } elseif ($src <= 0xffff) {
      $dest .= chr(0xe0 | ($src >> 12));
      $dest .= chr(0x80 | (($src >> 6) & 0x003f));
      $dest .= chr(0x80 | ($src & 0x003f));
    } elseif ($src <= 0x10ffff) {
      $dest .= chr(0xf0 | ($src >> 18));
      $dest .= chr(0x80 | (($src >> 12) & 0x3f));
      $dest .= chr(0x80 | (($src >> 6) & 0x3f));
      $dest .= chr(0x80 | ($src & 0x3f));
    } else {
      // out of range
      return false;
    }
  }
  return $dest;
}

print mb_convert_encoding(
  preg_replace_callback(
    "~\\\\u([1234567890abcdef]{4})~", 'unicode_hex_to_utf8',
    json_encode($response)),
  "ISO-8859-9", "UTF-8");
troelskn
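As an alternative to un-escaping by hand, newer PHP versions (5.4 and later) offer the JSON_UNESCAPED_UNICODE flag, which makes json_encode emit raw UTF-8 instead of \uXXXX escapes. The sketch below assumes that flag is available and that $response holds UTF-8 strings; the sample value is made up for illustration.

```php
<?php
// Assumption: $response holds UTF-8 strings. The sample value is "şığöç",
// written as hex escapes so the file's own encoding does not matter.
$response = array('name' => "\xC5\x9F\xC4\xB1\xC4\x9F\xC3\xB6\xC3\xA7");

// Keep non-ASCII characters as raw UTF-8 instead of \uXXXX escapes (PHP 5.4+).
$json = json_encode($response, JSON_UNESCAPED_UNICODE);

// Re-encode the whole JSON text from UTF-8 to ISO-8859-9.
$latin5 = mb_convert_encoding($json, 'ISO-8859-9', 'UTF-8');

// Show the bytes in hex; the five letters become fe fd f0 f6 e7.
echo bin2hex($latin5), "\n";
```

Characters that have no ISO-8859-9 equivalent cannot survive this conversion; with mbstring's default settings they are replaced by "?".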
I am now getting output like this: �?�?�?�ö�ç" It's close, but not quite there. Please help.
Alec Smart
How are you consuming the json?
troelskn
Basically, I'm sending it to jQuery in an Ajax call. The page uses ISO-8859-9 encoding (unfortunately I can't control the page charset). My plugin basically adds on top of the site. How do I fix this mess?
Alec Smart
If JavaScript is consuming the data, you shouldn't change the charset. The encoding that the page is served in doesn't affect JavaScript's internal charset. Just send the JSON data as Unicode escapes (the default behaviour of json_encode). To do so, convert your PHP strings to UTF-8 and then pass them to json_encode. That's all, really.
troelskn
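The advice in this last comment can be sketched as follows. The helper name latin5_to_utf8 and the sample data are hypothetical, and the sketch assumes the PHP strings really are ISO-8859-9 bytes. The escapes in the question (\u00fe, \u00fd, \u00f0) may indicate that such bytes were treated as ISO-8859-1 somewhere, since those code points occupy the slots that ISO-8859-9 uses for "ş", "ı" and "ğ".

```php
<?php
// Hypothetical helper: convert every string value in a (possibly nested)
// array from ISO-8859-9 to UTF-8 so that json_encode() accepts it.
// Array keys are left untouched.
function latin5_to_utf8(array $data) {
    array_walk_recursive($data, function (&$value) {
        if (is_string($value)) {
            $value = mb_convert_encoding($value, 'UTF-8', 'ISO-8859-9');
        }
    });
    return $data;
}

// Assumption: the data arrives as ISO-8859-9 bytes (here "şığöç").
$response = array('name' => "\xFE\xFD\xF0\xF6\xE7");

// Default json_encode() output is pure ASCII thanks to the \uXXXX escapes,
// so the charset of the surrounding page no longer matters.
$json = json_encode(latin5_to_utf8($response));

echo $json, "\n"; // {"name":"\u015f\u0131\u011f\u00f6\u00e7"}
```

Because the output contains only ASCII characters, jQuery parses it to the right strings whether the page itself is ISO-8859-9 or UTF-8.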