Hi,

I am trying to pull data from another site and I am getting unicode characters in my result, like this:

Amazon RDS – The Beginner’s Guide

How can I decode it in PHP? Can someone help?

Thanks in advance

+2  A: 

See utf8_decode:

echo utf8_decode($str);
karim79
... provided he is using ISO-8859-1 as his output format (which he most probably is).
Pekka
@Pekka Gaiser - this is true, also worth noting that `mb_detect_encoding` can be used to check what the encoding is set to.
karim79
Not really true, I'm afraid. `mb_detect_encoding` makes a guess, based on heuristics. It has all sorts of pitfalls. Specifically, it's useless for distinguishing between various `iso-8859-X` encodings.
troelskn
Seems like he is using windows-1252, not iso-8859-1.
troelskn
I used the `utf8_decode($str)` function, but it is replacing those characters with question marks (?).
someone
@someone: Yes, that's because `utf8_decode` translates from utf8->iso-8859-1 and you are using windows-1252, not iso-8859-1. See my answer.
troelskn
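To illustrate the comment above, here is a minimal sketch of the difference between the two target encodings. It assumes the mbstring extension is loaded and that the page genuinely has to stay in windows-1252; the helper name `utf8_to_cp1252` is made up for this example, and the title is written with byte escapes so the snippet does not depend on how the file itself is saved.

```php
<?php
// Hypothetical helper: convert UTF-8 text to Windows-1252.
function utf8_to_cp1252(string $utf8): string
{
    return mb_convert_encoding($utf8, 'Windows-1252', 'UTF-8');
}

// "\xE2\x80\x93" is an en dash and "\xE2\x80\x99" a curly apostrophe, both as UTF-8 bytes.
$title = "Amazon RDS \xE2\x80\x93 The Beginner\xE2\x80\x99s Guide";

// ISO-8859-1 has neither character, so mbstring falls back to its
// substitute character, which is "?" unless it has been reconfigured.
$latin1 = mb_convert_encoding($title, 'ISO-8859-1', 'UTF-8');

// Windows-1252 has both characters (bytes 0x96 and 0x92), so nothing is lost.
$cp1252 = utf8_to_cp1252($title);

var_dump(strpos($latin1, '?') !== false);     // were characters replaced by "?"
var_dump(strpos($cp1252, "\x96") !== false);  // is the en dash preserved as byte 0x96
```

This only helps if the output really must be windows-1252; if the page can be served as UTF-8, no conversion is needed at all.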
+1  A: 

Those are not "unicode characters"; they are artefacts of a messed-up character encoding. In this case, the most likely explanation is that utf-8 data is being interpreted as windows-1252. This may happen if you take a utf-8 encoded string and display it in a shell on Windows, or if you display it on a web page while sending a Content-Type header with charset=windows-1252. These are just educated guesses, of course; there are numerous ways this could happen.
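The garbling can be reproduced deliberately, which is a handy way to confirm the diagnosis. The sketch below assumes mbstring is available; `misread_utf8_as_cp1252` is a hypothetical name, and the function simply does on purpose what a misconfigured display does by accident.

```php
<?php
// Hypothetical helper: take UTF-8 bytes, pretend they are Windows-1252,
// and re-encode the result as UTF-8. This mimics the faulty interpretation.
function misread_utf8_as_cp1252(string $utf8Bytes): string
{
    return mb_convert_encoding($utf8Bytes, 'UTF-8', 'Windows-1252');
}

// An en dash as UTF-8: the three bytes E2 80 93.
$enDash = "\xE2\x80\x93";

// Read as Windows-1252, those bytes are three separate characters:
// 0xE2 = a with circumflex, 0x80 = euro sign, 0x93 = left double quote.
$garbled = misread_utf8_as_cp1252($enDash);

// Undoing the misreading gives back the original three bytes.
$restored = mb_convert_encoding($garbled, 'Windows-1252', 'UTF-8');

var_dump($restored === $enDash);
```

If the three-character sequence produced here matches what shows up in the scraped title, the source data is almost certainly fine and only the interpretation is wrong.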

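The solution to your problem is to treat the data as utf-8: leave the bytes as they are and declare utf-8 wherever they are displayed (for a web page, charset=utf-8 in the Content-Type header).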
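For a web page, treating the data as utf-8 could look roughly like the following. This is a sketch under assumptions: the fetched string is already valid UTF-8, mbstring is loaded, and `prepare_utf8_output` is a hypothetical helper name rather than anything from the original code.

```php
<?php
// Hypothetical helper: check that the text is valid UTF-8, announce UTF-8
// to the browser, and return the text escaped for HTML without re-encoding it.
function prepare_utf8_output(string $text): string
{
    if (!mb_check_encoding($text, 'UTF-8')) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    // Only possible before any output has been sent.
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=utf-8');
    }
    // The third argument tells htmlspecialchars how to read the bytes.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Title with an en dash and a curly apostrophe, given as UTF-8 byte escapes.
$title = "Amazon RDS \xE2\x80\x93 The Beginner\xE2\x80\x99s Guide";
$safe  = prepare_utf8_output($title);
```

Echoing `$safe` into a page served with that header should show the dash and apostrophe correctly, with no call to `utf8_decode` anywhere.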

troelskn