I have nearly completed the task of overhauling my web app to be properly "UTF-8 aware". I have found, though, that if I set the connection character set to utf8 using mysqli_set_charset, the result is that output appears incorrectly (indeed it appears as though the page's character encoding had been misidentified), whereas if I do not set the connection character set, it appears correctly.

For example, a string stored in one table in my database (the column's character set is utf8) is echoed properly as Página principal if I do not set the connection character set. If I do set the connection character set, it appears garbled, as PÃ¡gina principal.

Details: The PHP scripts I am using to test this behaviour have the following meta tag in the <head> section:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

I have determined that the default character set for new connections on my host is latin1. The database I am connecting to has default character set utf8. Here is the code used to create the database link:

$cxn = @mysqli_connect('localhost', $db_user, $db_password, $db_database) or die('Failed to connect to the database.');
mysqli_set_charset($cxn, 'utf8');
mysqli_query($cxn, 'SET SESSION sql_mode = \'TRADITIONAL\'');

Additionally: In case it should serve as some extra forensic evidence, I have found that if I view the page in Firefox and manually change the character encoding to ISO-8859-1, the aforementioned string appears as Página principal.

+1  A: 

This is probably because the connection character encoding was different when the data was inserted, so the data may be stored in the wrong encoding (is the table's character set utf8?). Check whether freshly inserted data, that is, data inserted over a utf8 connection, is returned correctly.
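To illustrate why mis-encoded stored data would produce exactly these symptoms, here is a minimal sketch that simulates the round trip in plain PHP, without a database. It assumes the mbstring extension is available and approximates MySQL's latin1 with ISO-8859-1 (which is equivalent for the bytes in this string); the variable names are purely illustrative.

```php
<?php
// The string as correct UTF-8 bytes (written with escapes so the
// example does not depend on this file's own encoding).
$original = "P\xC3\xA1gina principal";

// Assumed scenario: the UTF-8 bytes were sent over a latin1 connection
// into a utf8 column. The server treats each byte as one latin1
// character and converts it to UTF-8, so the value is encoded twice.
$stored = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Reading over a latin1 connection: the server converts utf8 -> latin1,
// which happens to undo the damage and return the original bytes.
$readOverLatin1 = mb_convert_encoding($stored, 'ISO-8859-1', 'UTF-8');

// Reading over a utf8 connection: the stored bytes come back unchanged,
// so a UTF-8 page shows the double-encoded text.
$readOverUtf8 = $stored;

var_dump($readOverLatin1 === $original); // the page looks right
var_dump($readOverUtf8 === $original);   // the page looks garbled
```

If this is what happened, the output without mysqli_set_charset only looks correct because two wrong conversions cancel out; the stored data itself is what needs repairing.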

Robus
It appears that it is as you say. I had thought that I'd fixed my mis-encoded stored data, but it would appear I was mistaken and it is indeed still mis-encoded. :(
Hammerite
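
For checking whether a value read over a utf8 connection is still mis-encoded, a rough heuristic can be sketched as follows. The function name looksDoubleEncoded is hypothetical, the mbstring extension is assumed, and the check can be fooled: text that legitimately contains sequences such as "Ã¡" is reported as double-encoded, and characters outside ISO-8859-1 are not handled.

```php
<?php
// Heuristic: a string is probably double-encoded if it is valid UTF-8
// and, after undoing one UTF-8 layer, is still valid UTF-8 that
// contains non-ASCII bytes.
function looksDoubleEncoded(string $value): bool
{
    // Must be valid UTF-8 to begin with.
    if (!mb_check_encoding($value, 'UTF-8')) {
        return false;
    }
    // Pure ASCII cannot be double-encoded.
    if (preg_match('/[\x80-\xFF]/', $value) !== 1) {
        return false;
    }
    // Undo one layer: map each character back to a single byte.
    $decoded = mb_convert_encoding($value, 'ISO-8859-1', 'UTF-8');

    // Correctly encoded accented text turns into invalid UTF-8 here;
    // double-encoded text turns back into valid, non-ASCII UTF-8.
    return mb_check_encoding($decoded, 'UTF-8')
        && preg_match('/[\x80-\xFF]/', $decoded) === 1;
}

var_dump(looksDoubleEncoded("P\xC3\x83\xC2\xA1gina principal")); // double-encoded
var_dump(looksDoubleEncoded("P\xC3\xA1gina principal"));         // correct UTF-8
```

Running a check like this over rows fetched with the connection character set set to utf8 gives a quick indication of which values still need fixing.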