I am learning PHP programming, so I have set up a test database and am trying various things with it. The situation is as follows:

Database collation is utf8_general_ci.

There is a table "books" created by this query:

create table books
(  isbn char(13) not null primary key,
   author char(50),
   title char(100),
   price float(4,2)
);

Then it is filled with some sample data. Note that the text entries are in Russian. This query is saved as a UTF-8 (without BOM) .sql file and executed.

insert into books values
  ("5-8459-0046-8", "Майкл Морган", "Java 2. Руководство разработчика", 34.99),
  ("5-8459-1082-X", "Кристофер Негус", "Linux. Библия пользователя", 24.99),
  ("5-8459-1134-6", "Марина Смолина", "CorelDRAW X3. Самоучитель", 24.99),
  ("5-8459-0426-9", "Родерик Смит", "Сетевые средства Linux", 49.99);

When I review the contents of the created table via phpMyAdmin, I get correct results.

When I retrieve data from this table and try to display it via PHP, I get question marks instead of Russian characters. Here is a piece of my PHP code:

<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> 
    <title>Books</title>
</head>
<body>
<?php
  header("Content-type: text/html; charset=utf-8");
  mysqli_set_charset('utf8');
  @ $db = new mysqli('localhost', 'login', 'password', 'database');

  $query = "select * from books where ".$searchtype." like '%".$searchterm."%'";
  $result = $db->query($query);
  $num_results = $result->num_rows;

  for ($i = 0; $i < $num_results; $i++) {
     $row = $result->fetch_assoc();
     echo "<p><strong>".($i+1).". Title: ";
     echo htmlspecialchars (stripslashes($row['title']));
     echo "</strong><br />Author: ";
     echo stripslashes($row['author']);
     echo "<br />ISBN: ";
     echo stripslashes($row['isbn']);
     echo "<br />Price: ";
     echo stripslashes($row['price']);
     echo "</p>";
  }
...

And here is the output:

1. Название: Java 2. ??????????? ????????????
Автор: ????? ??????
ISBN: 5-8459-0046-8
Цена: 34.99

Can someone point out what I am doing wrong?

+1  A: 
  1. Try to set output charset:

    SET NAMES 'utf8';
    SET CHARACTER SET utf8;

  2. Create .htaccess file:

    AddDefaultCharset utf-8
    AddCharset utf-8 *
    CharsetSourceEnc utf-8
    CharsetDefault utf-8

  3. Save files in UTF-8 without BOM.

  4. Set charset in html head.
Alexander.Plutov
Most of your recommendations are irrelevant to this particular question. You are just repeating some mantras without understanding their meaning.
Col. Shrapnel
A: 

Also try putting this meta tag in the head of the HTML document:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

This is different from the HTTP header set by header("Content-type: text/html; charset=utf-8");

maid450
This won't help at all.
Col. Shrapnel
+1  A: 

You probably need to call mysqli_set_charset('utf8'); after you set up your connection with new mysqli(...), as it works on a link rather than as a global setting.

So:

@ $db = new mysqli('localhost', 'login', 'password', 'database');
mysqli_set_charset($db, 'utf8');
$query = "select * from books where ".$searchtype." like '%".$searchterm."%'";

By the way, that query seems to be open to SQL injection unless $searchterm is sanitized. Just something to keep in mind; consider using prepared statements.

And using @ to suppress errors is generally not recommended, especially during development. It is better to deal with error conditions.
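
To make the points above concrete, here is a minimal sketch, not a definitive implementation. The helper names connectUtf8, searchColumn and findBooks are hypothetical; only the books table and its columns come from the question. It assumes the mysqli extension with the mysqlnd driver (needed for get_result). The charset is set on the connection object after it exists, errors are reported as exceptions instead of being hidden with @, the search term is bound as a parameter, and the column name is checked against a fixed list because identifiers cannot be bound.

```php
<?php
// Sketch only: helper names are hypothetical, table/columns are from the question.

/** Return the column name if it is one we allow searching on. */
function searchColumn(string $searchtype): string
{
    $allowed = ['isbn', 'author', 'title'];
    if (!in_array($searchtype, $allowed, true)) {
        throw new InvalidArgumentException('Unknown search type');
    }
    return $searchtype;
}

/** Open a connection and set its charset AFTER the link exists. */
function connectUtf8(string $host, string $user, string $pass, string $name): mysqli
{
    // Make mysqli throw exceptions on failure instead of suppressing errors with @.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);
    $db = new mysqli($host, $user, $pass, $name);
    $db->set_charset('utf8'); // MySQL's name for the encoding: no dash
    return $db;
}

/** Run the search with the term bound as a parameter. */
function findBooks(mysqli $db, string $searchtype, string $searchterm): array
{
    $column = searchColumn($searchtype); // whitelisted, so safe to interpolate
    $stmt = $db->prepare(
        "select isbn, author, title, price from books where $column like ?"
    );
    $pattern = '%' . $searchterm . '%';
    $stmt->bind_param('s', $pattern);
    $stmt->execute();
    return $stmt->get_result()->fetch_all(MYSQLI_ASSOC); // requires mysqlnd
}
```

With the credentials from the question, findBooks(connectUtf8('localhost', 'login', 'password', 'database'), 'title', 'Linux') would return the matching rows as associative arrays, ready to be echoed through htmlspecialchars().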

Alexander Sagen
Thanks for the advice. I do check my input data and error conditions; I just cut them out here, trying to keep my question text to a minimum.
Denis
@Denis Seeing the numerous errors in your code, I seriously doubt you do it properly. Most likely you just *believe* that you check your input data and error conditions.
Col. Shrapnel
Although it's impossible to use prepared statements for $searchtype, +1 for mentioning @.
Col. Shrapnel
@Col. Ah, of course, thanks. I read the query too quickly :)
Alexander Sagen
+1  A: 

After your mysql_connect call, set your connection to UTF-8:

mysql_query("SET NAMES utf8");

Follow Alexander's advice for .htaccess, the header, and file encoding.

Feeloow
Thanks, these queries did the job. I'll create an .htaccess too. The header and file encoding were already correct; I just forgot to mention them.
Denis
Edited your answer, because repeating the same command three times makes no sense.
Col. Shrapnel
This isn't the best approach. According to the PHP manual for mysqli_set_charset, "This is the preferred way to change the charset. Using mysqli::query() to execute SET NAMES .. is not recommended." He just called mysqli_set_charset too soon.
Alexander Sagen
@sagen And in the wrong format. I doubt such a call will alter anything. And you have it in your answer too ;)
Col. Shrapnel
@Col Oh my yes.. I miss the liberal mysql_* functions now...
Alexander Sagen
+1  A: 

Can someone point out what I am doing wrong?

Yes, I can.
You didn't tell the MySQL server which data encoding you want.
MySQL can supply data in any encoding if your page encoding differs from the stored data encoding, recoding it on the fly.
Thus, it needs to be told the client's preferred encoding (your PHP code being that database client).
By default it's latin1. Because the Russian characters do not exist in the latin1 character table, question marks are returned instead.

There are two ways to tell MySQL which encoding we want:

  • the slightly preferred one is the mysqli_set_charset() function (a method, in your case).
  • the less preferred one is a SET NAMES query.

But as long as you are using the mysqli extension properly, it doesn't really matter (though you aren't).

Note that in MySQL this encoding is called utf8, without dashes or spaces.
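
To see where the question marks come from, the lossy conversion can be imitated in plain PHP. This is only an illustration under stated assumptions: it needs the mbstring extension, it uses ISO-8859-1 as a stand-in for MySQL's latin1, and it does not talk to the server at all. The title string is taken from the sample data in the question.

```php
<?php
// Illustration only: mimics a UTF-8 -> latin1 recode in PHP, not in MySQL.
// Assumes the mbstring extension and that this file is saved as UTF-8.

// Replace characters that have no equivalent in the target encoding with "?".
mb_substitute_character(0x3F);

$title = 'Java 2. Руководство разработчика';

// Cyrillic letters do not exist in ISO-8859-1, so each one is substituted.
$asLatin1 = mb_convert_encoding($title, 'ISO-8859-1', 'UTF-8');

echo $asLatin1, "\n";
```

This should print the same kind of line the question shows, with the ASCII part intact and every Cyrillic letter replaced by a question mark. Once the connection charset is utf8, the server has no reason to recode, and the original bytes reach the page unchanged.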

Col. Shrapnel