views:

414

answers:

4

my data includes arabic characters which looks like garbage in mysql but displays correctly when run on browser. my questions:

  • how do i html-encode the output?
  • if i add this to all my files: <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> i get this error: Error: Incorrect string value: '\xE4\xEE\xC3\xD8\xEF\xE6...' for column 'cQuotes' at row 1

i'm working on php/mysql platform.

insertion form in html:

<!doctype html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Your Favorite Quotes</title>
<link rel="stylesheet" type="text/css" href="style.css" /> 
<link rel="stylesheet" href="css/validationEngine.jquery.css" type="text/css" media="screen" charset="utf-8" />
<script type="text/javascript" src="scripts/jquery-1.4.2.js"></script>
<script src="scripts/jquery.validationEngine-en.js" type="text/javascript"></script> 
<script src="scripts/jquery.validationEngine.js" type="text/javascript"></script>
<script type="text/javascript">
            $(document).ready(function() { 
                $("#submitForm").validationEngine() 
            })  
</script>
</head>
<body>

<div class="container">
<div class="center_div">
<h2>Submit Your Quote</h2>
<fieldset>
<form id="submitForm" action="qinsert.php" method="post">
<div class="field">
<label>Author: </label>
<input id="author" name="author" type="text" class="validate[required,custom[onlyLetter],length[0,100]]">
</div><br />
<div class="field">
<label>Quote: </label>
<textarea id="quote" name="quote" class="validate[required, length[0,1000]]"></textarea>
<br />
</div>
<input id="button1" type="submit" value="Submit" class="submit" /><br />
<input id="button2" type="reset" value="Reset" /> 
</form>
</fieldset>
</div>
</div>


</body>
</html>

//////////////////////

query in php:

//<?php
//header('Content-Type: text/html; charset=UTF-8');
//?>
<!doctype html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link rel="stylesheet" type="text/css" href="style2.css" /> 
<title>Your Quote Databank</title>
</head>
<body>

<?php
include 'config.php';

echo "Connected <br />";


//check for quotes and apostrophes

$author = '';
$quote = '';

$author = $_POST['author'];
$quote = $_POST['quote'];

$author = mysql_real_escape_string(trim($author)); 
$quote = mysql_real_escape_string(trim($quote)); 


//**************************


//validating data

$query = "SELECT * FROM Quotes where cQuotes = '$quote' limit 1;";

$result = mysql_query($query, $conn);

//now check that the number of rows is 0

if (mysql_num_rows($result) > 0 ) {
header("Location: /error.html");
exit;
}



//inserting data
//mysql_query("SET NAMES 'utf8'");
//mysql_query("SET CHARACTER SET utf8");
$sql="INSERT INTO Quotes (vauthor, cquotes)
VALUES ('$author', '$quote')";

if (!mysql_query($sql,$conn))
  {
  die('Error: ' . mysql_error());
  }
echo "<div class='container'><p><label class='lbl_record'> Record Added Successfully!</label>";
echo "<a href='qform.html'> Submit a New Quote!</a></p>";

//**************************


//selecting data
$result = mysql_query("SELECT * FROM Quotes ORDER BY idQuotes DESC");

echo "<div class='center_div'>";
echo "<table>
<thead>
<tr>
<th>Author</th>
<th>Quotes</th>
</tr>
</thead>";

while($row = mysql_fetch_array($result))
  {
  echo "<tbody><tr>";
  echo "<td width='150px'>" . $row['vAuthor'] . "</td>";
  echo "<td>" . $row['cQuotes'] . "</td>";
  echo "</tr>";
  }
echo "</tbody></table>";
echo "</div></div>";
//**************************

include 'close_config.php';

?>

</body>
</html>
A: 

You do not need to use HTML entities. You need to pick a character encoding that's suitable for Arabic characters (such as UTF-8) and make use all the intermediate tools use it properly: MySQL, mysql_query(), PHP, HTML...

Álvaro G. Vicario
Yup, that's the issue as well (see http://stackoverflow.com/questions/2595915/error-in-arabic-script-in-mysql/2595940#2595940)
Pekka
+2  A: 

there is no such thing "characters". there is encoding only. You must determine encoding you're using. And then set this encoding for the all three parts of your application:

  • HTML page
  • PHP script
  • database

UTF-8 is strongly recommended.
To use it, you must set your database encoding, by creating tables with default charset=utf8, mysql client encoding, by executing SET NAMES utf8 query in your client side application, and by setting header("Content-Type: text/html; charset=utf-8"); for the page that contain html form and for the page with results

According to your edition:
Add this line in your config.php file:

mysql_query("SET NAMES utf8");

And also check mysql tables encoding. You can do it with show create table Quotes query

Col. Shrapnel
thanks for this. running this query, it outputted that the default charset for the table that i'm using is latin1. how do i change it to utf8?
fuz3d
secondly, if remove the header or meta tag from my pages, the data inserts and displays correctly; except for the fact that in mysql, it shows random characters.
fuz3d
@fusion if your tables contain no data, you better re-create tables with utf-8 charset. Otherwise, it could be hard to recode, but I have an answer here on SO with instructions.
Col. Shrapnel
for the second, you won't be able to search and order data properly. So, proper encoding must be set. One can't work with a system with wrong settings.
Col. Shrapnel
@Col. Shrapnel, i re-created the table with the default charset utf-8. i added the meta tag with the encoding on the html page as well as the php script. yet it throws this error when i try to insert the data: Error: Incorrect string value: '\xE4\xEE\xC3\xD8\xEF\xE6...' for column 'cquotes' at row 1
fuz3d
@fusion 1. SET NAMES utf8 after mysql_connect. 2. not a meta tag, which means nothing, but HTTP header.
Col. Shrapnel
A: 

Try this:

while($row = mysql_fetch_array($result))
{
  echo "<tbody><tr>";
  echo "<td width='150px'>" . htmlentities($row['vAuthor']) . "</td>";
  echo "<td>" . htmlentities($row['cQuotes']) . "</td>";
  echo "</tr>";
}

for more details see htmlentities function ref

jerjer
that's wrong answer
Col. Shrapnel
**Don't** use `htmlentities` without the `$charset` argument. It will incorrectly encode all your bytes as if they were ISO-8859-1. In fact don't use `htmlentities` at all, there is no point. `htmlspecialchars`.
bobince
even with this parameter it's totally absolete in the word of unicode.
Col. Shrapnel
It could potentially be useful for throwing HTML down a 7-bit channel (like mail, where there are still environments that get charsets wrong), but yes, for normal web pages you don't ever want `htmlentities`.
bobince
@bobince just for your info: "mail environments" has it's own standard to encode non-7bit characters. And HTML is not he only possible email format. So, there is no single reason to use entities in email. Go figure
Col. Shrapnel
Yes, but unfortunately it doesn't always work. Try sending UTF-8 to East Asian webmail services and mobile phones, even with the correct MIME headers. It's sad, but there are still places where encodings don't work.
bobince
@bobince not headers I am talking about, but encodings. quoted-printable is used everywhere. when you send an e-mail, you send plain text, not HTML in first place. And there is no use for html entities in the plain text letter.
Col. Shrapnel
+2  A: 
Incorrect string value: '\xE4\xEE\xC3\xD8\xEF\xE6...'

That looks like an Arabic string encoded in ISO-8859-6. You will get this error if you have received an ISO-8859-6 byte string and are attempting to insert it into a UTF-8 database table.

Your script should not have received ISO-8859-6 from the browser, if your form page is correctly marked up as UTF-8 as the meta would imply. Check in the browser that when the form is displayed, the View->Encoding menu has ‘UTF-8’ ticked. The <meta> might be overridden by the web server passing back a real Content-Type: text/html;charset=... header.

This could also possibly happen if your PHP is trying to use the wrong charset to talk to the server. I see you've commented out a SET NAMES... I'd use mysql_set_charset('utf8'); in preference.

echo "<td>" . $row['cQuotes'] . "</td>";

You need to HTML-encode the output here but not because of charset issues. Any < and & characters need encoding in the text otherwise they can inject unwanted HTML markup, including JavaScript, in which case you have cross-site-scripting problems.

echo '<td>'.htmlspecialchars($row['cQuotes']).'</td>';

Aside: ...however, PHP is a templating language. Use it, don't fight it by trying to do string templating yourself.

<table>
    <?php while ($row= mysql_fetch_array($result)) { ?>
        <tr>
            <td width="150"><?php h($row['vAuthor']) ?></td>
            <td><?php h($row['cQuotes']) ?></td>
        </tr>
    <?php } ?>
</table>

(width="150px" doesn't work, px is for CSS only.) The above assumes a helper function like this to stop you having to type htmlspecialchars so much:

function h($s) {
    echo htmlspecialchars($s, ENT_QUOTES);
}
bobince
thank you so much! that worked. however, there is one problem. when i insert it says that the data was inserted successfully, but it displays blank rows on the browser with no data in them. on the other hand, mysql displays the data that was entered.
fuz3d
On newly-inserted rows only, or always?
bobince
always. it displays as blank rows. http://img140.imageshack.us/img140/6490/errorba.jpg
fuz3d
Can you post the PHP code you're using? It looks like some kind of error prevents the values being output (is error reporting turned on?).
bobince
it's working now. thanks. instead of 'select * from quotes', i specified the specific fields that i wanted.
fuz3d