Using the concepts from the sample code provided by Microsoft for loading HTML content into an IWebBrowser from an IStream using the web browser's IPersistStreamInit interface:

pseudocode:

void LoadWebBrowserFromStream(IWebBrowser webBrowser, IStream stream)
{
   IPersistStreamInit persist = webBrowser.Document as IPersistStreamInit;
   persist.Load(stream);
}

How can one specify the encoding of the html inside the IStream? The IStream will contain a series of bytes, but the problem is what do those bytes represent? They could, for example, contain bytes where:

  • each byte represents a character from the current Windows code-page (e.g. 1252)
  • each byte could represent a character from the ISO-8859-1 character set
  • the bytes could represent UTF-8 encoded characters
  • every 2 bytes could represent a character, using UTF-16 encoding
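
To make the ambiguity concrete, here is a minimal sketch (not taken from the Microsoft sample) that encodes one character three ways. The helper names `EncodeLatin1`, `EncodeUtf8` and `EncodeUtf16LE` are hypothetical, and the code assumes the character lies in the Basic Multilingual Plane:

```cpp
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// ISO-8859-1: code points U+0000..U+00FF map to exactly one byte.
// Assumption: the caller only passes cp <= 0xFF.
Bytes EncodeLatin1(char32_t cp) {
    return { static_cast<std::uint8_t>(cp) };
}

// UTF-8: one to three bytes for code points up to U+FFFF.
// Assumption: cp is in the BMP and is not a surrogate.
Bytes EncodeUtf8(char32_t cp) {
    if (cp < 0x80) {
        return { static_cast<std::uint8_t>(cp) };
    }
    if (cp < 0x800) {
        return { static_cast<std::uint8_t>(0xC0 | (cp >> 6)),
                 static_cast<std::uint8_t>(0x80 | (cp & 0x3F)) };
    }
    return { static_cast<std::uint8_t>(0xE0 | (cp >> 12)),
             static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F)),
             static_cast<std::uint8_t>(0x80 | (cp & 0x3F)) };
}

// UTF-16 little-endian: one 16-bit code unit, low byte first.
// Assumption: cp is in the BMP, so no surrogate pair is needed.
Bytes EncodeUtf16LE(char32_t cp) {
    return { static_cast<std::uint8_t>(cp & 0xFF),
             static_cast<std::uint8_t>((cp >> 8) & 0xFF) };
}
```

For U+00E9 (é) these produce `E9`, `C3 A9` and `E9 00` respectively. Nothing in the bytes themselves says which encoding produced them, so the consumer has to be told or has to guess.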

In my particular case, I am providing the IWebBrowser with an IStream that contains a series of double-byte characters (UTF-16), but the browser (incorrectly) believes that UTF-8 encoding is in effect. This results in garbled characters.

Workaround solution

While the question asks how to specify the encoding, in my particular case, with only UTF-16 encoding, there's a simple workaround. Adding the 0xFEFF Byte Order Mark (BOM) at the start of the stream indicates that the text is UTF-16 Unicode. IE then uses the proper encoding and shows the text properly.
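
As a rough sketch of that workaround, the bytes destined for the IStream could be built like this. The function name `ToUtf16LEWithBom` is hypothetical, and little-endian byte order is an assumption (it is what Windows uses natively):

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Serialize UTF-16 code units as little-endian bytes, preceded by the
// BOM U+FEFF, which appears as the bytes FF FE in little-endian order.
std::vector<std::uint8_t> ToUtf16LEWithBom(const std::u16string& html) {
    std::vector<std::uint8_t> out;
    out.reserve(2 + html.size() * 2);

    // Byte Order Mark first, so the reader can recognize UTF-16LE.
    out.push_back(0xFF);
    out.push_back(0xFE);

    // Each code unit: low byte, then high byte.
    for (char16_t unit : html) {
        out.push_back(static_cast<std::uint8_t>(unit & 0xFF));
        out.push_back(static_cast<std::uint8_t>(unit >> 8));
    }
    return out;
}
```

The resulting buffer would then be wrapped in an IStream (on Windows, something like `SHCreateMemStream` should do) before being handed to `IPersistStreamInit::Load`.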

Of course, that wouldn't work if the text were encoded with, for example:

  • UCS-2
  • UCS-4
  • ISO-10646-UCS-2
  • UNICODE-1-1-UTF-8
  • UNICODE-2-0-UTF-16
  • UNICODE-2-0-UTF-8
  • US-ASCII
  • ISO-8859-1
  • ISO-8859-2
  • ISO-8859-3
  • ISO-8859-4
  • ISO-8859-5
  • ISO-8859-6
  • ISO-8859-7
  • ISO-8859-8
  • ISO-8859-9
  • WINDOWS-1250
  • WINDOWS-1251
  • WINDOWS-1252
  • WINDOWS-1253
  • WINDOWS-1254
  • WINDOWS-1255
  • WINDOWS-1256
  • WINDOWS-1257
  • WINDOWS-1258
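
To illustrate why a BOM cannot help with the single-byte code pages in that list, here is a minimal BOM sniffer. It is only a sketch, not IE's actual detection logic; the name `SniffBom` is hypothetical and UTF-32 signatures are deliberately ignored:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Return the encoding implied by a leading BOM, or "" if there is none.
// Single-byte encodings such as ISO-8859-x and WINDOWS-125x define no BOM,
// so for them this always returns "" and the reader must be told the
// encoding some other way.
std::string SniffBom(const std::vector<std::uint8_t>& b) {
    if (b.size() >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF) {
        return "UTF-8";
    }
    if (b.size() >= 2 && b[0] == 0xFF && b[1] == 0xFE) {
        return "UTF-16LE";
    }
    if (b.size() >= 2 && b[0] == 0xFE && b[1] == 0xFF) {
        return "UTF-16BE";
    }
    return "";
}
```

A stream holding WINDOWS-1252 bytes for `<p>` starts with `3C 70 3E`, which matches no signature, so the sniffer has nothing to go on.
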
A: 

IE's document supports IPersistMoniker loading too. IE uses URL monikers for downloading. You can replace the URL moniker created by CreateURLMonikerEx with your own moniker. A few details about the URL moniker's implementation can be found here. See if you can get IHttpNegotiate from the binding context when your BindToStorage implementation is called.

Sheng Jiang 蒋晟
I don't seem to be able to proceed without knowing what a moniker is, and why I would want one. All the linked resources assume I know what it is, or what it is for.
Ian Boyd
See Monikers (COM Fundamentals) at http://msdn.microsoft.com/en-us/library/ms691261(VS.85).aspx
Sheng Jiang 蒋晟