Hi folks, I've got a PHP script that is reading in some JSON data provided by a client. The JSON data provided had a single 'smart quote' in it.

Example:

{
    "title"         : "Lorem Ipsum’s Dolar" 
}

In my script I'm using a small function to get the json data:

public function getJson($url) {
    $filePath = $url;
    $fh = fopen($filePath, 'r') or die();
    $temp = fread($fh, filesize($filePath));
    $temp = utf8_encode($temp);
    echo $temp . "<br />";
    $json = json_decode($temp);
    fclose($fh);
    return $json;
}

If I utf8 encode the data, when I echo it out I see nothing where the quote should be. If I don't utf8 encode the data, when I echo it out I see the funny question mark symbol �

Any thoughts on how to actually see the proper character??

Thanks!

A: 

The issue is more on the side that generates the JSON file. There you should escape the ' as \'.

If you can't modify this part, you should do it like this with addslashes:

$temp = fread($fh, filesize($filePath));
$temp = utf8_encode($temp);
echo $temp . "<br />";
$temp = addslashes($temp);
$json = json_decode($temp);
JochenJung
The JSON file is hand generated by a client, and I'm trying to avoid making them go out of their way to do anything special. The addslashes() did not work. Wouldn't that add slashes before ALL the quotes (including the ones that JSON requires)?
jlbruno
Oh, yeah, you're right. Use $temp = str_replace("’", "\\’", $temp); to escape just the ' and leave the " as they are.
JochenJung
A: 

Can you possibly do a string replace assuming the data is all utf8?

$text = str_replace($find, $replace, $text);

Looking for the characters below?

 '“'  // left side double smart quote
 '”'  // right side double smart quote
 '‘'  // left side single smart quote
 '’'  // right side single smart quote
 '…'  // ellipsis
 '—'  // em dash
 '–'  // en dash
subv3rsion
Also same question/duplicate here http://stackoverflow.com/questions/175785/how-do-i-convert-word-smart-quotes-and-em-dashes-in-a-string
subv3rsion
Replacing some random problematic characters is not a solution for encoding problems. Find out the encoding the sender uses and then use a library function to do the translation.
Jörn Horstmann
I've tried str_replace without any luck. It doesn't seem to find the problem characters.
jlbruno
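One possible explanation, assuming the file is saved in a Windows codepage such as Windows-1252: the smart quote in the file is then the single byte 0x92, while a ’ typed into a UTF-8 PHP source file is the three bytes E2 80 99, so str_replace never finds a match. The sample string below is made up for illustration and is not the client's actual data.

```php
<?php
// Made-up sample with byte 0x92, the right single smart quote in Windows-1252.
$raw = "Lorem Ipsum\x92s Dolar";

// The same character written as a UTF-8 literal is three bytes long.
$utf8Quote = "\u{2019}";
$utf8Hex = bin2hex($utf8Quote);
echo $utf8Hex, "\n"; // e28099

// Searching for the UTF-8 bytes finds nothing in the Windows-1252 data.
$posUtf8 = strpos($raw, $utf8Quote);
var_dump($posUtf8); // bool(false)

// Searching for the raw byte does match.
$posByte = strpos($raw, "\x92");
var_dump($posByte); // int(11)
```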
A: 

Is it possible that the server is sending the JSON data in an encoding like Windows-1252? That codepage has some smart quote characters where ISO-8859-1 has control characters. Could you try using iconv("windows-1252", "utf-8", $temp) instead of utf8_encode? Even better would be if the server already sent UTF-8 encoded JSON, since that is the recommended encoding per RFC 4627.

Jörn Horstmann
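A minimal sketch of the iconv suggestion above, assuming the file really is Windows-1252 and that the iconv extension is available. The helper name decodeCp1252Json and the inline sample string are hypothetical and only there for illustration.

```php
<?php
// Hypothetical helper: convert Windows-1252 bytes to UTF-8, then decode.
function decodeCp1252Json(string $raw)
{
    // iconv() returns false if the conversion fails.
    $utf8 = iconv('Windows-1252', 'UTF-8', $raw);
    if ($utf8 === false) {
        return null;
    }
    // json_decode() expects UTF-8 input and returns null on invalid JSON.
    return json_decode($utf8);
}

// Made-up sample: byte 0x92 is the right single smart quote in Windows-1252.
$raw = "{\"title\": \"Lorem Ipsum\x92s Dolar\"}";
$json = decodeCp1252Json($raw);
echo $json->title, "\n";
```

If the input is not actually Windows-1252, the conversion may still succeed but produce the wrong characters, so it is worth confirming the encoding with whoever produces the files.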
I think that might have worked?? Doing some extra testing. The files are hand coded by the client (don't ask) and the files are encoded ANSI not UTF-8. I've tried converting to UTF-8 to see if that helped, but for some reason that actually seemed to blow up my entire script.
jlbruno
"ANSI" is usually a broad term for one of the Windows codepages. I'm actually not sure if iconv is available in a default PHP installation. You could also try `mb_convert_encoding`, which seems to do the same but with the parameters in a different order.
Jörn Horstmann
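For comparison, a rough sketch of the `mb_convert_encoding` variant, assuming the mbstring extension is loaded and the input is Windows-1252. The sample string is made up for illustration.

```php
<?php
// Made-up sample: byte 0x92 is the right single smart quote in Windows-1252.
$raw = "{\"title\": \"Lorem Ipsum\x92s Dolar\"}";

// iconv takes (from, to, string); mb_convert_encoding takes (string, to, from).
$utf8 = mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');

// The converted string is valid UTF-8, so json_decode() can parse it.
$json = json_decode($utf8);
echo $json->title, "\n";
```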