For example, when I retrieve the word program's from the database, what is shown is program�s instead of program's. Both ' and - change to �. How can I fix this?

+5  A: 

Is your data stored as UTF-8? Try executing these queries before you fetch any data:

SET NAMES utf8
SET CHARACTER SET utf8

Also make sure you're setting your page encoding:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8" />
karim79
+7  A: 

The replacement character � (U+FFFD) means that your data is not correctly encoded. You are probably declaring your output as UTF-8 but your database data is not UTF-8 encoded. So you need to convert the data to UTF-8. You can use mb_convert_encoding to do that.
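To illustrate the conversion step, here is a minimal Python sketch rather than PHP. It assumes the stored bytes are Windows-1252 (cp1252), which is only a guess about your data; the sample value `raw` is made up. In PHP the corresponding call would be along the lines of mb_convert_encoding($value, 'UTF-8', 'Windows-1252'), under the same assumption about the source encoding.

```python
# Hypothetical bytes as they might come back from a Windows-1252 column:
# 0x92 is the curly apostrophe in cp1252.
raw = b"program\x92s"

# Declaring the output as UTF-8 without converting: 0x92 on its own is not
# valid UTF-8, so a decoder substitutes U+FFFD (the replacement character).
broken = raw.decode("utf-8", errors="replace")

# Converting first: decode with the encoding the data is really in,
# then encode as UTF-8 for output.
text = raw.decode("cp1252")
utf8_bytes = text.encode("utf-8")

print(ascii(broken))   # shows the U+FFFD escape in place of the apostrophe
print(ascii(text))     # shows the U+2019 escape (right single quotation mark)
print(utf8_bytes)      # the three-byte UTF-8 sequence for U+2019
```

If the real source encoding is something other than cp1252, the second argument to decode has to change accordingly; guessing wrong just produces a different kind of garbage.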

Gumbo
A: 

It sounds like a character set issue: your database and web page need to be using the same charset (or you need to convert between them). Check out this article from Joel on Software. That apostrophe would appear to be a curly apostrophe, and my suspicion is that the dash is an em dash or similar.
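A quick way to check which characters you actually have is to print their code points. The sketch below is in Python and uses a hypothetical helper named `describe`; the choice of the right single quotation mark and the em dash as the likely culprits is an assumption based on the symptoms, not something confirmed by the question.

```python
import unicodedata

def describe(ch: str) -> tuple:
    """Return the code point, Unicode name, and UTF-8 bytes of one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8"))

# Plain ASCII characters versus their "typographic" lookalikes.
for ch in ["'", "\u2019", "-", "\u2014"]:
    print(describe(ch))

# In Windows-1252 the typographic ones are single bytes above 0x7F,
# which is why they break when the bytes are read as UTF-8.
curly_cp1252 = "\u2019".encode("cp1252")
dash_cp1252 = "\u2014".encode("cp1252")
print(curly_cp1252, dash_cp1252)
```

The plain apostrophe and hyphen are ASCII and survive any of these charsets unchanged, so only the typographic variants get mangled.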

T.J. Crowder
A: 

It could be caused by the use of different characters, e.g. Mark’s house and Mark's house. I usually get that character from MS Word and other word processors, which is annoying.

Edit: Whoops, I see Stack Overflow auto-corrects the character... My point is that word processors create different characters when you hit the apostrophe key.

Moak
+4  A: 

This is an issue of character encoding: somewhere along the line the character encoding is being interpreted incorrectly. Here are the areas you should check, working essentially backwards from when you're outputting that page:

When you output the page, you should be including an appropriate content-type header, either from the server (preferable) or in the HTML:

<meta http-equiv="content-type" content="text/html; charset=utf-8" />

When you connect to the database, you should be issuing a SET NAMES query that tells the database what encoding to use when sending data to your script (note that this has nothing to do with what encoding your tables use).

SET NAMES utf8;

Your tables themselves should be defined as the "right" character encoding:

CREATE TABLE foo (...) CHARSET=utf8;

If you've already created the tables using latin1 and wish to convert them, you'll have to not only issue an ALTER TABLE to change the CHARSET but also change the CHARSET for every textual column (TEXT, CHAR, VARCHAR).

If all of that seems correct, the problem may have occurred "on the way in." That is, assuming you have another form that submits these data to the database, you'll need to update that form too (Content-Type and SET NAMES).

VoteyDisciple
A: 

You are treating cp-1252 as iso-8859-1, which is a very common mistake. Use utf-8, which is standardised, or validate the input to ensure that the user actually gives you valid iso-8859-1 when they say so. In cases where they don't (e.g. you get cp-1252), you can transliterate the input to iso-8859-1.
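As a rough sketch of the transliteration option, in Python: the mapping table `CP1252_TO_ASCII` and the helper `to_latin1` are hypothetical names, and the table is illustrative rather than exhaustive. It assumes the incoming bytes really are cp-1252.

```python
# Illustrative mapping of common cp-1252 punctuation to plain ASCII,
# which is safe in iso-8859-1. Not an exhaustive table.
CP1252_TO_ASCII = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark (curly apostrophe)
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
}

def to_latin1(raw: bytes) -> bytes:
    """Decode bytes as cp-1252 and transliterate them to iso-8859-1."""
    # Bytes that cp-1252 leaves undefined become U+FFFD instead of raising.
    text = raw.decode("cp1252", errors="replace")
    for src, dst in CP1252_TO_ASCII.items():
        text = text.replace(src, dst)
    # Anything still outside iso-8859-1 is replaced with "?".
    return text.encode("latin-1", errors="replace")

# The mistake itself: read as iso-8859-1, byte 0x92 is an invisible
# C1 control character, not an apostrophe.
misread = b"\x92".decode("latin-1")

converted = to_latin1(b"program\x92s \x97 caf\xe9")
print(converted)
```

Characters that exist in both encodings, such as the accented e in the sample, pass through unchanged; only the cp-1252 extras in the 0x80 to 0x9F range need mapping.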

troelskn