My program is fetching messages from a database, which contains English, German and several Eastern European languages. My Python script sets the encoding via:

<meta  http-equiv="Content-Type" content="text/html; charset=utf-8"/>

and uses the values fetched from the database, which are correct according to my logs.

Unfortunately all browsers I tested (IE8, Firefox 3.0.10, Opera 9.64) switch based on my local language settings to:

  • Western ISO-8859-1 in Firefox
  • Western European (Windows) in IE
  • Automatic in Opera

Everything works fine as soon as I switch the character encoding manually in the browser.

The same happens if I manually generate the HTML file using UTF-8 (tested with TextMate and jEdit), although both editors display the content correctly.

English and German display fine, but Russian, for example, does not. How can I force the "correct" character encoding?

ANSWER

The following entry within the VirtualHost (Apache configuration) section did the trick for me:

AddDefaultCharset utf-8

Many thanks for pointing me in the right direction, that helped a lot!

+3  A: 

When the document is transferred over HTTP, the HTTP headers carry the crucial information:

[…] conforming user agents must observe the following priorities when determining a document's character encoding (from highest priority to lowest):

  1. An HTTP "charset" parameter in a "Content-Type" field.
  2. A META declaration with "http-equiv" set to "Content-Type" and a value set for "charset".
  3. The charset attribute set on an element that designates an external resource.
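To make that priority order concrete, here is a rough Python sketch of how a user agent might pick the charset. The function names (charset_param, effective_charset) and the sample header values are made up for illustration; real browsers do more than this (BOM detection, sniffing, locale fallbacks), so treat it as a simplified model only.

```python
from email.message import Message


def charset_param(content_type_value):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    if not content_type_value:
        return None
    msg = Message()
    msg["Content-Type"] = content_type_value
    charset = msg.get_param("charset")  # None when no charset parameter is present
    return charset.lower() if isinstance(charset, str) else None


def effective_charset(http_header=None, meta_content=None, link_charset=None):
    """Pick a charset using the quoted priority order (highest first).

    http_header  -- value of the HTTP Content-Type header
    meta_content -- value of the META http-equiv content attribute
    link_charset -- charset attribute on the element that referenced the resource
    """
    candidates = (
        charset_param(http_header),
        charset_param(meta_content),
        link_charset.lower() if link_charset else None,
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return None  # nothing declared; the browser falls back to its own defaults
```

With hypothetical inputs, effective_charset("text/html; charset=ISO-8859-1", "text/html; charset=utf-8") yields "iso-8859-1": a charset in the HTTP header overrides the META declaration in the document.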

So make sure you declare the character encoding in the Content-Type header field and not just inside the document.
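If you would rather set the header from the Python side than rely on the server configuration, something along these lines should work. This is only a sketch and assumes the script runs as a WSGI application, which the question does not state; build_headers, application and the sample page body are hypothetical names and content.

```python
def build_headers(body, charset="utf-8"):
    """Response headers that declare the charset in the HTTP Content-Type field."""
    return [
        ("Content-Type", f"text/html; charset={charset}"),
        ("Content-Length", str(len(body))),
    ]


def application(environ, start_response):
    """Minimal WSGI app: encode the page as UTF-8 and say so in the header."""
    html = (
        "<html><head>"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>'
        "</head><body><p>Hello, Grüße, Привет</p></body></html>"
    )
    body = html.encode("utf-8")  # the bytes on the wire must match the declared charset
    start_response("200 OK", build_headers(body))
    return [body]
```

The META tag is kept so the page still declares its encoding when opened from disk, where no HTTP header exists.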

Gumbo