I'm trying to convert a string from iso-8859-1 to utf-8. But when I find these two charachter € and • the function returns a charachter that is a square with two number inside.
How can I solve this issue?
I'm trying to convert a string from iso-8859-1 to utf-8. But when I find these two charachter € and • the function returns a charachter that is a square with two number inside.
How can I solve this issue?
iso-8859-1 doesn't contain the € sign so your string cannot be interpreted with iso-8859-1 if it contains it. Use iso-8859-15 instead.
Those 2 characters are illegal in iso-8859-1 (did you mean iso-8859-15?)
$ php -r 'echo iconv("utf-8","iso-8859-1//TRANSLIT","ter € and • the");'
ter EUR and o the
I think the encoding you are looking for is Windows code page 1252 (Western European). It is not the same as ISO-8859-1 (or 8859-15 for that matter); the characters in the range 0xA0-0xFF match 8859-1, but cp1252 adds an assortment of extra characters in the range 0x80-0x9F where ISO-8859-1 assigns little-used control codes.
The confusion comes about because when you serve a page as text/html;charset=iso-8859-1
, for historical reasons, browsers actually use cp1252 (and will hence submit forms in cp1252 too).
iconv('cp1252', 'utf-8', "\x80 and \x95")
-> "\xe2\x82\xac and \xe2\x80\xa2"