Hi,

I'm using cURL to import some code. However, in French, all the characters come out funny. For example: Bonjour ...

I don't have access to change anything in the imported code. Is there anything I can do on my side to fix this?

Thanks

A: 

Your situation is unclear. Where does PHP come in? Is the content you're downloading PHP code? What are you using to view the text afterwards?

It's almost certainly just a case of handling the downloaded data in the appropriate encoding. However, you'll need to know what encoding that is (look at the HTTP headers for a possible hint, although it may not have been set correctly) and how to use the right encoding. We can't help you on the latter point until we know what you're doing with the data after fetching it.
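To make the "look at the HTTP headers" step concrete, here is a minimal sketch, assuming the data is fetched with PHP's cURL extension. The helper names charsetFromContentType() and fetchWithCharset() are made up for illustration; curl_getinfo() with CURLINFO_CONTENT_TYPE is the real call that returns the Content-Type value the server sent.

```php
<?php
// Hypothetical helper: pull the charset parameter out of a Content-Type value.
function charsetFromContentType(?string $contentType): ?string
{
    if ($contentType !== null
        && preg_match('/charset\s*=\s*"?([A-Za-z0-9._:-]+)/i', $contentType, $m)) {
        return strtoupper($m[1]);
    }
    return null; // header missing, or it carries no charset parameter
}

// Hypothetical helper: fetch a URL and return [body, declared charset or null].
// The body is false if the transfer failed.
function fetchWithCharset(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);
    // Typically something like "text/html; charset=ISO-8859-1"
    $type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    return [$body, charsetFromContentType($type ?: null)];
}
```

A null charset only means the server did not declare one. As noted above, a declared value can still be wrong.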

Jon Skeet
+2  A: 

As Jon Skeet pointed out, it's difficult to understand your situation. However, if you only have access to the final text, you can try using iconv to change the text encoding.

For example:

$text = iconv("Windows-1252","UTF-8",$text);

I had a similar issue some time ago (with Italian and its special characters) and solved it this way.

Try different combinations (UTF-8, ISO-8859-1, Windows-1252).
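Rather than trying combinations by hand, something like the following might help. It is only a sketch: toUtf8() is a made-up helper name, and the fallback order is an assumption, not a rule. It leaves the text alone if it is already valid UTF-8 and otherwise tries each candidate source encoding with iconv.

```php
<?php
// Hypothetical helper: return $text as UTF-8, guessing the source encoding.
// The default fallback list is an assumption; adjust it to your data.
function toUtf8(string $text, array $fallbacks = ['Windows-1252', 'ISO-8859-1']): string
{
    // With the /u modifier, preg_match() fails on invalid UTF-8,
    // so a result of 1 means the bytes are already valid UTF-8.
    if (preg_match('//u', $text) === 1) {
        return $text;
    }
    foreach ($fallbacks as $from) {
        // iconv() returns false when the input is not valid in $from
        $converted = @iconv($from, 'UTF-8', $text);
        if ($converted !== false) {
            return $converted;
        }
    }
    return $text; // nothing worked; hand back the original bytes
}
```

Keep in mind that a single-byte encoding such as ISO-8859-1 accepts any byte sequence, so a successful conversion does not prove the guess was right. You still have to look at the result.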

Alekc
A: 

PHP seems to use UTF-8 by default, so I found the following works:

$text = iconv("UTF-8","Windows-1252",$text);

A: 

I'm currently suffering from a similar problem: I'm trying to write a simple HTML <title> importer via cURL. Here's an outline of what I've done so far:

  1. Retrieve the HTML via cURL
  2. Check if there's any hint of the encoding in the response headers via curl_getinfo() and match it with a regex
  3. Parse the HTML to look at the content-type meta and the <title> tag (yes, I know the consequences)
  4. Compare the two content types (header and meta) and choose the meta one if they differ, because we know no one cares about their httpd configuration and there are a lot of dirty workarounds that rely on the meta tag
  5. iconv() the string
  6. Wish every day that when someone does not follow the standards, $DEITY punishes them until the end of days, because it would save me the meta parsing
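
Steps 3 to 5 above could be sketched roughly like this. It assumes the HTML and the header charset have already been fetched (steps 1 and 2), and titleAsUtf8() is a made-up name for illustration. Like the approach described, it uses regexes instead of a real HTML parser and prefers the meta charset over the header.

```php
<?php
// Hypothetical helper: extract the <title> of an HTML page as UTF-8.
// $headerCharset is whatever the HTTP Content-Type header declared, or null.
function titleAsUtf8(string $html, ?string $headerCharset): ?string
{
    // Step 3: look for a charset declared in a <meta> tag.
    $metaCharset = null;
    if (preg_match('/<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._:-]+)/i', $html, $m)) {
        $metaCharset = $m[1];
    }
    // Step 4: prefer the meta value, then the header, then assume UTF-8.
    $charset = $metaCharset ?? $headerCharset ?? 'UTF-8';

    if (!preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        return null; // no <title> found
    }
    $title = trim($m[1]);

    // Step 5: convert only when the source is not already UTF-8.
    if (strcasecmp($charset, 'UTF-8') !== 0) {
        $converted = @iconv($charset, 'UTF-8', $title);
        if ($converted === false) {
            return null; // unknown charset name or invalid bytes
        }
        $title = $converted;
    }
    return $title;
}
```

Note that preferring the meta tag is this answer's choice. The HTML standard gives the HTTP header priority, so swap the order in step 4 if you trust the servers you are fetching from.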
rmontagud
A: 

Thanks guys, this worked great for me.

Andrew
A: 

$text = iconv("UTF-8","Windows-1252",$text);

That was definitely the answer for me too.

I was getting characters like these:

– …

trialanderror