I've tried converting the text to and from UTF-8, but that didn't seem to help.

I'm getting:

"It’s Getting the Best of Me"

It should be:

"It’s Getting the Best of Me"

I'm getting this data from a URL: http://www.tvrage.com/quickinfo.php?show=Surviver&ep=20x02&exact=0

+1  A: 

I looked at the link, and it looks like UTF-8 to me: in Firefox, if you pick View, Character Encoding, UTF-8, it displays correctly.

So, you just need to figure out how to get your PHP code to process that as UTF-8. Good luck!

Chris Jester-Young
Try htmlspecialchars_decode
Levi Hackwith
Nope, it didn't change at all.
Mint
+1  A: 

It sounds like you're using standard string functions on a UTF-8 character (’) that doesn't exist in ISO 8859-1. Check that you are using Unicode-compatible PHP settings and functions. See also the multibyte string functions.
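As a rough illustration of the difference between the standard and multibyte string functions, here is a minimal sketch. It assumes the mbstring extension is loaded and PHP 7 or later (for the `\u{...}` escape); the variable name `$title` is made up for the example.

```php
<?php
// "’" (U+2019) occupies three bytes in UTF-8: E2 80 99.
$title = "It\u{2019}s";

// strlen() counts bytes; mb_strlen() counts characters.
echo strlen($title), "\n";             // 6
echo mb_strlen($title, 'UTF-8'), "\n"; // 4

// substr() cuts by byte offset and can split the character in half,
// leaving an invalid byte sequence; mb_substr() cuts by character.
$broken = substr($title, 0, 3);             // "It" plus a stray 0xE2 byte
$intact = mb_substr($title, 0, 3, 'UTF-8'); // "It’"
```

Byte-oriented functions are still safe as long as they never cut, pad, or re-case inside a multibyte sequence; the multibyte variants remove that risk.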

pr1001
+2  A: 

To convert to HTML entities:

<?php
  echo mb_convert_encoding(
    file_get_contents('http://www.tvrage.com/quickinfo.php?show=Surviver&ep=20x02&exact=0'),
    "HTML-ENTITIES",
    "UTF-8"
  );
?>

See docs for mb_convert_encoding for more encoding options.

konforce
That works, though I can't figure out how to get it to work with fopen.
Mint
Once you get the contents of the file you want, then pass it in as the first parameter to `mb_convert_encoding()`. e.g., `$text = fgets($fp); $html = mb_convert_encoding($text, "HTML-ENTITIES", "UTF-8");`
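The fgets approach could be sketched roughly as below. Two things here are assumptions rather than part of the thread: the helper name `linesToEntities` is hypothetical, and `mb_encode_numericentity()` is swapped in for the `"HTML-ENTITIES"` conversion because that pseudo-encoding is deprecated as of PHP 8.2. An in-memory stream stands in for the handle you would get from `fopen($url, 'r')`.

```php
<?php
// Hypothetical helper: read a stream line by line and turn every
// non-ASCII UTF-8 character into a numeric HTML entity.
function linesToEntities($fp): array
{
    // Conversion map: [first code point, last code point, offset, mask].
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    $out = [];
    while (($line = fgets($fp)) !== false) {
        $out[] = mb_encode_numericentity($line, $map, 'UTF-8');
    }
    return $out;
}

// In-memory stream used in place of a remote URL for this sketch.
$fp = fopen('php://memory', 'w+');
fwrite($fp, "It\u{2019}s Getting the Best of Me\n");
rewind($fp);

$lines = linesToEntities($fp);
fclose($fp);

echo $lines[0]; // It&#8217;s Getting the Best of Me
```

Numeric entities render the same way as named ones in a browser, so the output should be usable wherever the `"HTML-ENTITIES"` result was.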
konforce
Thanks, that works.
Mint
+5  A: 

Your content is fine; the problem is with the headers the server is sending:

Connection:Keep-Alive
Content-Length:502
Content-Type:text/html
Date:Thu, 18 Feb 2010 20:45:32 GMT
Keep-Alive:timeout=1, max=25
Server:Apache/2.2.8 (Ubuntu) PHP/5.2.4-2ubuntu5.7 with Suhosin-Patch
X-Powered-By:PHP/5.2.4-2ubuntu5.7

The header should be Content-Type: text/plain; charset=utf-8, because this page is not HTML and uses the UTF-8 encoding. (Chromium on Mac guesses ISO-8859-1 and displays the characters you're describing.)

If you are not in control of the site, specify the encoding as UTF-8 to whatever function you use to retrieve the content (I'm not familiar enough with PHP to know exactly how).
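To see why a wrong charset guess produces exactly those characters, the garbling can be reproduced in PHP. This is only a sketch: it assumes the mbstring extension, and it uses Windows-1252 as the misread encoding on the assumption that this is what browsers actually apply when they pick ISO-8859-1. The variable names are invented for the example.

```php
<?php
// The title as the server sends it: valid UTF-8 bytes.
$correct = "It\u{2019}s Getting the Best of Me";

// Misread those bytes as Windows-1252 and re-encode the result as UTF-8.
// The three bytes of "’" (E2 80 99) become three separate characters.
$garbled = mb_convert_encoding($correct, 'UTF-8', 'Windows-1252');
echo $garbled, "\n"; // It’s Getting the Best of Me

// Reversing the conversion recovers the original bytes.
$repaired = mb_convert_encoding($garbled, 'Windows-1252', 'UTF-8');
```

Since the bytes themselves are intact, declaring UTF-8 wherever the text is displayed should be enough; the reverse conversion is only needed once text has already been stored in the garbled form.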

cobbal
`text/text`? Don't you mean `text/plain`?
Chris Jester-Young
Thanks, fixed now
cobbal
+1  A: 

Make sure your HTML head specifies UTF-8:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

That usually does the trick for me (obviously only if the content IS UTF-8).

You don't need to convert to html entities if you set the content-type.

Ben