Our website was developed with a meta tag set to...

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />

This works fine for em dashes, curly quotes, and similar characters. However, I have an issue with data entered into a CMS component that stores its data in MySQL. The MySQL collation is set to utf8_swedish_ci (I read this is OK and must have been the default when it was set up in phpMyAdmin).

The problem is that when I output data from the DB to the page, the characters are UTF-8 encoded, so I run them through PHP's utf8_decode() function. I thought this would fix the incompatibility, but the result isn't what I expect.

When I look at the data in the DB in a text field (again through phpMyAdmin), it looks like this...

This – That

When I view it on the screen it looks like...

This ? That

I know I can try to find/replace a bunch of these in the DB or the text, but I'm hoping there's an easier way to do this programmatically.

Thanks, Don


Update:

Still have an issue that htmlentities() unfortunately doesn't fix.

I have text in a file like this: we’ve (with a curly apostrophe). My MySQL collation is "latin1_swedish_ci" (the default). Whether I set the header or meta tag to ISO-8859-1 or UTF-8, one or the other breaks. With UTF-8, the inline (’) shows as a black diamond, but the DB content is fine. With ISO-8859-1, the inline content is OK, but the content from the DB has all kinds of stray characters. I tried changing the MySQL collation to UTF-8 but didn't see a difference.

I've about resigned myself to changing the items manually. Thanks for any other suggestions.

A: 

My guess would be that, despite your meta tag, the web server sends a header that sets the charset to UTF-8. However, the easiest way to fix these kinds of problems is usually to escape non-ASCII characters as HTML entities.

Magnus Nordlander
A: 

If your data in the database is UTF8, you'll need to run this query after you connect to MySQL:

SET NAMES UTF8
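
From PHP, the same effect can usually be had through the client library instead of sending the query by hand. The following is only a sketch, assuming the mysqli extension is in use; the function name openUtf8Connection and its parameters are made up for illustration and are not from the original question.

```php
<?php
// Sketch: make the connection charset explicit right after connecting.
// openUtf8Connection() and its parameters are hypothetical placeholders.

function openUtf8Connection(string $host, string $user, string $pass, string $db): mysqli
{
    $link = new mysqli($host, $user, $pass, $db);

    // Has the same effect as "SET NAMES utf8", and also tells the client
    // library which charset to assume when it escapes strings.
    if (!$link->set_charset('utf8')) {
        throw new RuntimeException('Could not set charset: ' . $link->error);
    }

    return $link;
}

// Usage (placeholder credentials):
// $link = openUtf8Connection('localhost', 'user', 'secret', 'cms');
```

If the page itself is served as ISO-8859-1, passing 'latin1' instead would ask MySQL to do the conversion, at the cost of losing characters that Latin-1 cannot represent.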
DisgruntledGoat
A: 

One way to fix this is to use htmlentities ( http://us.php.net/manual/en/function.htmlentities.php ) to sanitize the output.
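
A rough sketch of what that might look like, assuming the value coming out of the database really is UTF-8. The variable names are illustrative only. The charset is passed explicitly because htmlentities() has to know the input encoding to recognize multi-byte characters such as the en dash.

```php
<?php
// Sketch: escape a UTF-8 string so it can be shown on an ISO-8859-1 page.
// Assumption: the value read from the database is UTF-8 encoded.

// "This – That", with the en dash written out as its three UTF-8 bytes.
$fromDb = "This \xE2\x80\x93 That";

// Name the input charset explicitly. Older PHP versions default to
// ISO-8859-1 here and would treat each byte of the dash separately.
$escaped = htmlentities($fromDb, ENT_QUOTES, 'UTF-8');

echo $escaped, "\n"; // This &ndash; That
```

Because the result is plain ASCII, it displays the same way whichever charset the page declares. By contrast, utf8_decode() can only produce characters that exist in ISO-8859-1, and the en dash is not one of them, which is why it comes out as "?".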

Alex N.
+1  A: 

Assuming that you were able to set the encoding properly in your database, my recommended approach here is to:

  • Make sure that the Content-Type header is set properly by the server. This can be done in PHP with the header() function.

    header('Content-Type: text/html; charset=iso-8859-1');

Note that the HTTP header takes precedence over the meta tag, and it is the easiest information for user agents to obtain because they do not have to parse the document to find it.

  • Set the meta tag in the HTML file.

    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
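
One possible way to keep the two declarations from drifting apart is to derive both from a single value. This is just a sketch; the helper names contentTypeHeader and contentTypeMeta are invented for the example.

```php
<?php
// Sketch: build the HTTP header and the meta tag from one charset value.
// contentTypeHeader() and contentTypeMeta() are hypothetical helpers.

function contentTypeHeader(string $charset): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

function contentTypeMeta(string $charset): string
{
    return '<meta http-equiv="Content-Type" content="text/html; charset='
        . htmlspecialchars($charset, ENT_QUOTES) . '" />';
}

$charset = 'iso-8859-1';

// header() only works before any output has been sent.
if (!headers_sent()) {
    header(contentTypeHeader($charset));
}

echo contentTypeMeta($charset), "\n";
```

Whichever charset is chosen, the bytes actually sent (inline text and database output alike) have to be in that encoding, otherwise one of the two sources will still display incorrectly.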

For further reading, refer to:

http://www.joelonsoftware.com/articles/Unicode.html

http://www.webstandards.org/learn/articles/askw3c/dec2002/

Mark Basmayor