Hello Everyone,

I have a problem with the "charset" in the transformation result: it differs between versions of MSXML.

The code below transforms XML to HTML with MSXML3.0:

    Dim xmlDoc As New MSXML2.DOMDocument
    xmlDoc.async = False
    Dim strXML As String
    strXML = "<Results><ElapsedTime>3000</ElapsedTime></Results>"
    xmlDoc.loadXML(strXML)

    Dim xslDoc As New MSXML2.FreeThreadedDOMDocument
    xslDoc.async = False
    Dim strXSL As String
    strXSL = "C:\Test.xsl"
    xslDoc.load(strXSL)

    Dim xslt As New MSXML2.XSLTemplate
    xslt.stylesheet = xslDoc

    Dim xslProc As MSXML2.IXSLProcessor
    xslProc = xslt.createProcessor
    xslProc.input = xmlDoc
    xslProc.transform()

    Debug.Print(xslProc.output)

================================

The content of "Test.xsl" is,

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="Results">
    <html>
      <head>
        <title>Report</title>
      </head>
    </html>
  </xsl:template>
</xsl:stylesheet>

===============================

The output is,

<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-16">
<title>Report</title>
</head>
</html>

I'm not sure why the charset is always set to "UTF-16" with MSXML3.0.

=========================

Then I changed the code to use MSXML4, like this,

Dim xmlDoc As New MSXML2.DOMDocument40
...
Dim xslDoc As New MSXML2.FreeThreadedDOMDocument40
...
Dim xslt As New MSXML2.XSLTemplate40
...

=====================

This time, the output is,

<html>
<head>
<META http-equiv="Content-Type" content="text/html">
<title>Report</title>
</head>
</html>

No charset is output in MSXML4.0.

=====================

Can you please tell me which one is right? Why do the differences happen?

A: 

Well, in general MSXML sucks at getting the right character encoding. For some frekking reason they've chosen UTF-16 as the default charset.

But you could try to add this line right after the xsl:stylesheet line:

<xsl:output method="html" version="1.0" encoding="utf-8" indent="yes" omit-xml-declaration="yes" doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN" doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" />
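
For reference, a sketch of where that line would sit in the Test.xsl from the question (the element is only split across lines for readability). One caveat, stated as an assumption about the setup rather than something confirmed in this thread: the question reads the result through `xslProc.output` as a string, and strings there are UTF-16, so the declared encoding may only take effect when the result is written to a stream or file instead.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- xsl:output must be a top-level element, i.e. a direct child of xsl:stylesheet -->
  <xsl:output method="html" version="1.0" encoding="utf-8" indent="yes"
              omit-xml-declaration="yes"
              doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
              doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" />
  <xsl:template match="Results">
    <html>
      <head>
        <title>Report</title>
      </head>
    </html>
  </xsl:template>
</xsl:stylesheet>
```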
Dammark
Thank you, but it doesn't work.
RogerCui
A: 

I have the same problem. Adding the output line changed nothing (I already had it in my XSL).

Does anybody have an idea how to tell MSXML (I use XslTransform via .NET) which encoding to use?

Thanks in advance

dasheddot