I have a problem with classic ASP / VBScript when trying to read a UTF-8 encoded XML file with MSXML. The file is encoded correctly; every other tool I have tried shows it as valid UTF-8.
Constructed XML example:
<?xml version="1.0" encoding="UTF-8"?>
<itshop>
<Product Name="Backup gewünscht" />
</itshop>
If I try to do this in ASP...
Const FOR_READING = 1 ' iomode constant for OpenTextFile; not predefined in VBScript
Set fso = Server.CreateObject("Scripting.FileSystemObject")
Set ts = fso.OpenTextFile("input.xml", FOR_READING)
XML = ts.ReadAll
ts.Close
Set ts = Nothing
Set fso = Nothing
Set myXML = Server.CreateObject("Msxml2.DOMDocument.4.0")
myXML.loadXML(XML)
Set DocElement = myXML.documentElement
Set ProductNodes = DocElement.selectNodes("//Product")
Response.Write ProductNodes(0).getAttribute("Name")
' ...
... and Name contains special characters (German umlauts, to be specific), the bytes of the umlaut's two-byte UTF-8 sequence get re-encoded, so I end up with two garbage characters. What should be "ü" becomes "Ã¼", which is FOUR bytes in my output (C3 83 C2 BC), not two (correct UTF-8, C3 BC) or one (ISO-8859-#).
What am I doing wrong? Why does MSXML think the input is ISO-8859-#, so that it tries to convert it to UTF-8?
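For comparison, here is a minimal sketch of a variant that skips FileSystemObject and lets MSXML read the file itself via load. It rests on assumptions that are not confirmed for the setup above: that the re-encoding happens because OpenTextFile reads the file as ANSI text before loadXML ever sees it, that input.xml sits next to the page (hence the Server.MapPath call), and that the page is supposed to emit UTF-8 (hence the Response.CodePage and Response.Charset lines).

```vbscript
' Sketch under the assumptions stated above, not a confirmed fix.
Set myXML = Server.CreateObject("Msxml2.DOMDocument.4.0")
myXML.async = False   ' load synchronously so the document is ready on return

' load reads the raw bytes of the file, so the parser can honor the
' encoding="UTF-8" declaration in the XML prolog.
' Assumption: input.xml lives in the same directory as this page.
If myXML.load(Server.MapPath("input.xml")) Then
    ' Assumption: the response should be sent as UTF-8.
    Response.CodePage = 65001
    Response.Charset = "utf-8"

    Set ProductNodes = myXML.documentElement.selectNodes("//Product")
    Response.Write ProductNodes(0).getAttribute("Name")
Else
    ' parseError.reason describes why the document could not be loaded.
    Response.Write "Load failed: " & myXML.parseError.reason
End If
```

If the output is correct with this variant, that would point at the FileSystemObject read rather than at MSXML. loadXML receives an already decoded string, so it never sees the original bytes.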