This is a charset issue. As such, it could have gone wrong at several different levels, but most likely the strings in your database are utf-8 encoded and you are presenting them as iso-8859-1, or the other way around.
The proper way to fix this problem is to get your character sets straight. The simplest strategy, since you're using PHP, is to use iso-8859-1 throughout your application. To do this, you must ensure that:
- All PHP source-files are saved as iso-8859-1 (Not to be confused with cp-1252).
- Your web-server is configured to serve files with charset=iso-8859-1
- Alternatively, you can override the web-server's settings from within the PHP-document, using header().
- In addition, you may insert a meta-tag in your HTML that specifies the same thing, but this isn't strictly needed.
- You may also specify the accept-charset attribute on your <form> elements.
- Database tables are defined with encoding as latin1
- The connection between PHP and the database is set to latin1
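As a rough sketch of the header and connection items in the list above: the example assumes MySQL accessed through PDO (the name latin1 suggests MySQL, but the driver is a guess). The helper names contentTypeHeader and mysqlDsn, as well as the host, database name and credentials, are hypothetical placeholders.

```php
<?php
// Hypothetical helper: builds the Content-Type header line for a given charset.
function contentTypeHeader(string $charset = 'iso-8859-1'): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// Hypothetical helper: builds a PDO MySQL DSN that also sets the connection charset.
function mysqlDsn(string $host, string $dbname, string $charset = 'latin1'): string
{
    return "mysql:host={$host};dbname={$dbname};charset={$charset}";
}

// The header must go out before any output; headers_sent() guards against a late call.
if (!headers_sent()) {
    header(contentTypeHeader());
}

// Connecting is left commented out because the credentials are placeholders:
// $pdo = new PDO(mysqlDsn('localhost', 'mydb'), 'user', 'password');
```

If you use mysqli instead of PDO, calling set_charset('latin1') on the connection object should have the same effect as the charset part of the DSN.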
If you already have data in your database, you should be aware that it is probably messed up already. If you are not yet in production, just wipe it all and start over. Otherwise you'll have to do some data cleanup.
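What the cleanup looks like depends on how the data got damaged, so the following is only one possible starting point. It assumes the stored bytes are either valid utf-8 or already iso-8859-1, and that the mbstring extension is available. The function name toLatin1 is hypothetical.

```php
<?php
// Hypothetical helper: normalise a byte string to iso-8859-1.
// Heuristic only: a string that is valid utf-8 is assumed to BE utf-8.
function toLatin1(string $bytes): string
{
    // Pure ASCII is byte-identical in both encodings, so leave it alone.
    if (!preg_match('/[\x80-\xFF]/', $bytes)) {
        return $bytes;
    }
    // Valid utf-8 with non-ASCII bytes: convert it down to iso-8859-1.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return mb_convert_encoding($bytes, 'ISO-8859-1', 'UTF-8');
    }
    // Anything else is assumed to be iso-8859-1 already.
    return $bytes;
}
```

Two caveats: characters that don't exist in iso-8859-1 can't survive the conversion, and a genuine iso-8859-1 string whose bytes happen to form valid utf-8 would be converted by mistake. Try it on a copy of the data first.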
A note on meta-tags, since everybody misunderstands what they are:
When a web-server serves a file (an HTML-document), it sends some information that isn't presented directly in the browser. These are known as HTTP-headers. One such header is the Content-Type header, which specifies the mimetype of the file (e.g. text/html) as well as the encoding (aka charset).
While most web-servers will send a Content-Type header with charset info, it's optional. If it isn't present, the browser will instead interpret any meta-tags with http-equiv="Content-Type". It's important to realise that the meta-tag is only interpreted if the web-server doesn't send charset info in the header. In practice this means that it's only used if the page is saved to disk and then opened from there.
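The precedence described above can be illustrated with a small sketch. It is a simplified model, not what a browser actually runs: real browsers also consider things like byte-order marks and the shorter meta charset form. The function name effectiveCharset is hypothetical.

```php
<?php
// Hypothetical helper: returns the charset a browser would pick under the
// simplified rule "HTTP header first, meta http-equiv only as a fallback".
function effectiveCharset(?string $contentType, string $html): ?string
{
    // 1. A charset in the HTTP Content-Type header wins.
    if ($contentType !== null
        && preg_match('/charset\s*=\s*"?([\w.:-]+)/i', $contentType, $m)) {
        return strtolower($m[1]);
    }
    // 2. Otherwise fall back to a meta-tag with http-equiv="Content-Type".
    if (preg_match('/<meta[^>]+http-equiv\s*=\s*["\']?Content-Type["\']?[^>]*>/i', $html, $tag)
        && preg_match('/charset\s*=\s*([\w.:-]+)/i', $tag[0], $m)) {
        return strtolower($m[1]);
    }
    // 3. Nothing declared: the browser would have to guess.
    return null;
}

$html = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';
echo effectiveCharset('text/html; charset=iso-8859-1', $html), "\n"; // header wins
echo effectiveCharset('text/html', $html), "\n";                     // meta-tag is used
```

The first call prints iso-8859-1 even though the meta-tag says utf-8, which is exactly the kind of mismatch that produces garbled characters.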
This page has a very good explanation of these things.