views:

11235

answers:

5

Hi all.

Some of my script are using different encoding, and when I try to combine them, this has becom an issue.

But I can't change the encoding they use, instead I want to change the encodig of the result from script A, and use it as parameter in script B.

So: is there any simple way to change a string from UTF-8 to ISO-88591 in PHP? I have looked at utf_encode and _decode, but they doesn't do what i want. Why doesn't there exsist any "utf2iso()"-function, or similar?

I don't think I have characters that can't be written in ISO-format, so that shouldn't be an huge issue.

A: 

You need to use the iconv package, specifically its iconv function.

Martin v. Löwis
+11  A: 

Have a look at iconv() or mb_convert_encoding(). Just by the way: why don't utf8_encode() and utf8_decode() work for you?

utf8_decode — Converts a string with ISO-8859-1 characters encoded with UTF-8 to single-byte ISO-8859-1

utf8_encode — Encodes an ISO-8859-1 string to UTF-8

So essentially

$utf8 = 'ÄÖÜ'; // file must be UTF-8 encoded
$iso88591_1 = utf8_decode($utf8);
$iso88591_2 = iconv('UTF-8', 'ISO-8859-1', $utf8);
$iso88591_2 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

$iso88591 = 'ÄÖÜ'; // file must be ISO-8859-1 encoded
$utf8_1 = utf8_encode($iso88591);
$utf8_2 = iconv('ISO-8859-1', 'UTF-8', $iso88591);
$utf8_2 = mb_convert_encoding($iso88591, 'UTF-8', 'ISO-8859-1');

all should do the same - with utf8_en/decode() requiring no special extension, mb_convert_encoding() requiring ext/mbstring and iconv() requiring ext/iconv.

Stefan Gehrig
Thanks for a good answer, and you and the others here are right: utf8_decode() seems to get the work done. There must have been some problems with files or my browser. At least I'm no longer able to reproduce the errors... (Maybe I did something wrong with my browser-charset-settings?)
qualbeen
+2  A: 

First of all, don't use different encodings. It leads to a mess, and UTF-8 is definitely the one you should be using everywhere.

Chances are your input is not ISO-8859-1, but something else (ISO-8859-15, Windows-1252). To convert from those, use iconv.

Nevertheless, utf8_encode and utf8_decode should work for ISO-8859-1. It would be nice if you could post a link to a file or a uuencoded or base64 example string for which the conversion fails or yields unexpected results.

phihag
A: 

I used:

function utf8_to_html ($data) {
return preg_replace(
 array (
  '/ä/',
  '/ö/',
  '/ü/',
  '/é/',
  '/à/',
  '/è/'
 ),
 array (
  'ä',
  'ö',
  'ü',
  'é',
  'à',
  'è'
 ),
 $data );

}

A: 

I use this function:

function formatcell($data, $num, $fill=" ") {
$data = trim($data);
$data=str_replace(chr(13),' ',$data);
$data=str_replace(chr(10),' ',$data);
// translate UTF8 to English characters
$data = iconv('UTF-8', 'ASCII//TRANSLIT', $data);
$data = preg_replace("/[\'\"\^\~\`]/i", '', $data);


// fill it up with spaces
for ($i = strlen($data); $i < $num; $i++) {
    $data .= $fill;
}
// limit string to num characters
$data = substr($data, 0, $num);

return $data;

}

echo formatcell("YES UTF8 String Zürich", 25, 'x'); //YES UTF8 String Zürichxxx
echo formatcell("NON UTF8 String Zurich", 25, 'x'); //NON UTF8 String Zurichxxx

Check out my function in my blog http://www.unexpectedit.com/php/php-handling-non-english-characters-utf8

Nacho

Ignacio Pascual