I'm trying to prepare a demo HTML page with mixed English and Arabic content. Basically, it contains a small table with English phrases on the left and the Arabic translations on the right.
Because I don't understand Arabic, I took the first three characters of the Arabic alphabet from the Unicode reference.
First attempt, using numeric character references (&amp;#x0627; &amp;#x0628; &amp;#x062A;): it works (displays: ا ب ت).
Then I tried to enter the Arabic characters directly in the document. To enable this, I saved the document as UTF-8 and added the meta tag for the content type.
Displaying this document in Internet Explorer 7 shows garbage instead of ا ب ت: Ø§ Ø¨ Øª
Manually switching IE to UTF-8 (menu "View -> Encoding -> Unicode (UTF-8)") makes IE show the characters correctly. But as soon as the document is reloaded, the garbage appears again.
<html>
<head>
<meta content="content-type" content="text/html; charset=utf-8">
</head>
<body>
<table width="95%" border="1">
<colgroup><col width="50%" /><col width="50%" /></colgroup>
<tbody>
<tr>
<th>English</th><th>Arabic</th>
</tr>
<tr>
<td>Test phrase</td>
<td dir="rtl">ا ب ت</td>
</tr>
</tbody>
</table>
</body>
</html>
Testing with Firefox shows the correct Arabic letters. (But the interpretation of the direction "rtl" is different: IE shows the text right-aligned, Firefox left-aligned.)
Any hints on how to convince IE to use the encoding given in the document?
Is this an effect of locally stored HTML files? When editing this StackOverflow entry, I observe that
- the Arabic characters are rendered as expected,
- the encoding in the menu automatically switches to "Unicode (UTF-8)",
- and the HTML source does not contain the meta tag for the content type.
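
One thing that may be worth checking in the markup above: the meta element carries two content attributes and no http-equiv attribute, so a browser has nothing that marks it as a Content-Type declaration. Below is a minimal sketch of the same page with the declaration written in the form HTML 4 describes, assuming the file itself is still saved as UTF-8. The table is shortened for illustration only.

```html
<html>
<head>
<!-- http-equiv names the HTTP header being emulated; content carries its value. -->
<!-- Keep this first in the head so the parser sees the charset before any non-ASCII text. -->
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body>
<table width="95%" border="1">
<tr>
<td>Test phrase</td>
<td dir="rtl">ا ب ت</td>
</tr>
</table>
</body>
</html>
```

This would also fit the StackOverflow observation: a page served over HTTP normally gets its charset from the Content-Type response header, which a file opened from the local disk does not have. For a local file the meta declaration is therefore the only hint inside the document, and Firefox showing the right letters may simply reflect a different fallback guess rather than the tag being read.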