views:

32

answers:

2

I am trying to store a list of countries in a mySQL database.

I am having problems storing (non English) names like these:

  • São Tomé and Príncipe
  • República de El Salvador

They are stored with strange characters in the db, (and therefore output strangely in my HTML pages).

I have tried using different combinations of collations for the database and the MySQL connection collation:

The "obvious" setting was to use utf8_unicode_ci for both the databse and the connection information. To my utter surprise, that did not solve the problem.

Does anyone know how to resolve this issue?

[Edit]

It turns out the problem is not to do with collation, but rather encoding, as pointed out by the col. I notice that at the command line, I can type two separate commands:

SET NAMES utf8

followed by

[QUERY]

where [QUERY] is my SQL statment to retrieve the names, and that works (names are no longer mangled). However, when I do the same thing programatically (i.e. through code), I still get the mangled names. I tried combining the two statements like this:

SET NAMES utf8; [QUERY]

at the command line, again, this returned the correct strings. Once again, when I tried the same statements through code, I got wrong values.

This is a snippet of what my code looks like:

$mysqli = self::get_db_connection();
$mysqli->query('SET NAMES utf8');

$sql = 'SELECT id, name FROM country';

$results = self::fetch($sql);

the fetch method is:

    private static function fetch($query)
    {
        $rows = array();

        if (!empty($query))
        {
            $mysqli = self::get_db_connection();

            if ($mysqli->connect_errno)
            {
                self::logError($mysqli->connect_error);
            }
            else
            {
                if ($result = $mysqli->query($query))
                {
                    if(is_object($result)){
                        while ($row = $result->fetch_array(MYSQLI_ASSOC))
                            $rows[] = $row;
                        $result->close();
                    }
                }
            }
        }
    return $rows;
    }

Can anyone spot what I may be doing thats wrong?

Just to clarify, the HTTP headers in the page are set correctly 'Content-type': 'text/html; charset=utf-8'

so thats not the issue here.

+1  A: 

if you use uf8 everywhere*, it will work - seems like you forgot anything

*everywhere means: for your database-collation and -connection, for your (php?) script files and for the pages that are sent to the browser (by setting a meta-tag or, better, set an uftf-8-header)

oezi
I would throw in "accept-charset='UTF-8'" in the form tag as well.
aioobe
no need, @aioobe . Page encoding set by header is enough.
Col. Shrapnel
ok, so if I send the page in utf8, the browser is guaranteed to reply in possible post-data in utf8?
aioobe
+1  A: 

As a matter of fact, collation affects nothing of a kind. it's a thing used for ordering and comparison, not recoding.
It is encoding responsible for the characters itself.

So, your problem comes not from the table collation but from the connection encoding

SET NAMES utf8

query should solve the problem, at leas for the newly inserted data

Col. Shrapnel
@col: ok, now were getting warm ...
morpheous