The preceding post already explains why your example did not work as expected.
However, there are some good coding practices for working with databases that improve the security of your application (in particular, by preventing SQL injection).
The following example intends to show some of these practices, and assumes PHP 5.2 and MySQL 5.1. (Note that all files and database entries are stored using UTF-8 encoding.)
The database used in this example is called test, and the table was created as follows:
CREATE TABLE `test`.`entries` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`data` VARCHAR( 100 ) NOT NULL
) ENGINE = InnoDB CHARACTER SET utf8 COLLATE utf8_bin
(Note that the character set is utf8 and the collation is utf8_bin.)
The following PHP code is used both for adding new entries and for creating the JSON output:
<?php
$conn = new PDO('mysql:host=localhost;dbname=test','root','xxx');
$conn->exec("SET NAMES 'utf8'"); // Enable UTF-8 charset for db-communication ..
if(isset($_GET['add_entry'])) {
header('Content-Type: text/plain; charset=UTF-8');
// Add new DB-Entry:
$data = $conn->quote($_GET['add_entry']);
if($conn->exec('INSERT INTO `entries` (`data`) VALUES ('.$data.')')) {
$id = $conn->lastInsertId();
echo 'Created entry '.$id.': '.$_GET['add_entry'];
} else {
$info = $conn->errorInfo();
echo 'Unable to create entry: '. $info[2];
}
} else {
header('Content-Type: application/json; charset=UTF-8');
// Output DB-Entries as JSON:
$entries = array();
if($res = $conn->query('SELECT * FROM `entries`')) {
$res->setFetchMode(PDO::FETCH_ASSOC);
foreach($res as $row) {
$entries[] = $row;
}
}
echo json_encode($entries);
}
?>
Note the call to $conn->quote(..) before the data is passed to the database. As mentioned in the preceding post, prepared statements are an even better choice, since they take care of the escaping for you. Thus, it is better to write:
$prepStmt = $conn->prepare('INSERT INTO `entries` (`data`) VALUES (:data)');
if($prepStmt->execute(array('data'=>$_GET['add_entry']))) {...}
instead of
$data = $conn->quote($_GET['add_entry']);
if($conn->exec('INSERT INTO `entries` (`data`) VALUES ('.$data.')')) {...}
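To try the prepared-statement variant without a MySQL server, here is a minimal sketch. It assumes the pdo_sqlite driver is available and uses an in-memory SQLite database as a stand-in for the MySQL connection above; the table definition is simplified accordingly, and the hostile input string is made up for illustration.

```php
<?php
// Sketch only: an in-memory SQLite database stands in for the MySQL
// connection, so the snippet runs without a server (assumes pdo_sqlite).
$conn = new PDO('sqlite::memory:');
$conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$conn->exec('CREATE TABLE entries (id INTEGER PRIMARY KEY AUTOINCREMENT, data VARCHAR(100) NOT NULL)');

// Hypothetical hostile input that would break a naively concatenated query.
$input = "x'); DROP TABLE entries; --";

// The value is bound to the :data placeholder instead of being
// concatenated into the SQL string.
$prepStmt = $conn->prepare('INSERT INTO entries (data) VALUES (:data)');
$prepStmt->execute(array('data' => $input));

// Read the row back: the input was stored literally and the table still exists.
$stored = $conn->query('SELECT data FROM entries')->fetchColumn();
$count = (int) $conn->query('SELECT COUNT(*) FROM entries')->fetchColumn();
echo $stored, "\n";
```

The stored value is identical to the input, quote characters included, and the table still holds exactly one row.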
Conclusion: Using UTF-8 for all character data that is stored or sent to the user is a reasonable choice, and it makes developing internationalized web applications much easier. To make sure user input is passed to the database safely, escape it with a function such as $conn->quote(..). Better still, use prepared statements: they simplify development and improve your application's security, because they prevent SQL injection.
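As a small illustration of the UTF-8 point, the following sketch encodes a row containing non-ASCII characters and decodes it again. The row is hypothetical, and the text is written as explicit UTF-8 bytes so the result does not depend on how the file itself is saved. Keep in mind that json_encode expects its string data to be UTF-8.

```php
<?php
// "Grüße" written as explicit UTF-8 bytes (ü = C3 BC, ß = C3 9F).
$text = "Gr\xC3\xBC\xC3\x9Fe";

// Hypothetical row, shaped like the rows fetched from the entries table.
$entries = array(array('id' => '1', 'data' => $text));

// By default, json_encode escapes non-ASCII characters as \uXXXX sequences.
$json = json_encode($entries);
echo $json, "\n"; // prints [{"id":"1","data":"Gr\u00fc\u00dfe"}]

// Decoding turns the escapes back into the original UTF-8 string.
$decoded = json_decode($json, true);
```

The JSON text itself is pure ASCII here, but any client that decodes it gets the original characters back.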