Using Delphi 2009 and IXMLDOMDocument2

I get an "An invalid character was found in text content" error when loading XML into an IXMLDOMDocument2. The offending character is 0x1B (ESC), and it sits inside a CDATA section. Microsoft's XML viewer (IE) loads the file just fine. The XML looks like this:

<data><child><![CDATA[-- ]]></child></data>

NOTE: I tried to paste the XML as is, but the special character gets removed. In my XML file the 0x1B character follows "-- " inside the CDATA section.

I've tried adding an XML declaration to the start of the XML, with various encodings, and nothing works for me. Is there anything that can be done to load this file?

Thanks, Michael

+4  A: 

Character U+001B is not allowed in XML, along with most of the rest of the ASCII control characters. A document that contains it is not well-formed, and if Microsoft's XML viewer doesn't complain, it's not parsing the file according to the rules of XML. Tsk!

In XML 1.1 only, all but U+0000 may be included in a document as a character reference like &#x1B;. (Obviously, that's no use in a CDATA section, but then CDATA sections aren't really much use anyway.)
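If the control characters carry no meaning for you, one option is to drop them before parsing. Below is a minimal sketch, assuming the file has already been read into a WideString. It follows the XML 1.0 Char production (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF). The names IsXml10CodeUnit and StripInvalidXmlChars are made up for this illustration, and stripping does change the data, so it is only appropriate when the removed characters are junk.

```delphi
program StripXmlChars;
{$APPTYPE CONSOLE}

// Hypothetical helper: True when a UTF-16 code unit is acceptable in XML 1.0.
// Surrogate code units ($D800..$DFFF) are let through so that characters
// above U+FFFF survive; their correct pairing is not checked here.
function IsXml10CodeUnit(Ch: WideChar): Boolean;
begin
  case Ord(Ch) of
    $09, $0A, $0D, $20..$FFFD: Result := True;
  else
    Result := False;
  end;
end;

// Hypothetical helper: returns S without the code units rejected above.
function StripInvalidXmlChars(const S: WideString): WideString;
var
  I, Count: Integer;
begin
  SetLength(Result, Length(S));
  Count := 0;
  for I := 1 to Length(S) do
    if IsXml10CodeUnit(S[I]) then
    begin
      Inc(Count);
      Result[Count] := S[I];
    end;
  SetLength(Result, Count);
end;

var
  Cleaned: WideString;
begin
  // #$1B is the ESC character from the question.
  Cleaned := StripInvalidXmlChars(
    '<data><child><![CDATA[-- '#$1B']]></child></data>');
  Writeln(Cleaned); // prints the XML with the ESC character removed
end.
```

The cleaned string could then be handed to the document's loadXML method instead of loading the file directly.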

If you need to include arbitrary control characters in XML, you will usually need to use an application-specific encoding scheme such as base64.
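As a rough illustration of the base64 route, here is a hand-rolled encoder, assuming the payload is available as raw bytes in an AnsiString. Base64Encode is a name invented for this sketch, not a library routine. The consumer of the XML has to know that the element content is base64 and decode it again.

```delphi
program Base64Sketch;
{$APPTYPE CONSOLE}

const
  Base64Alphabet: AnsiString =
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Hypothetical helper: standard base64 with '=' padding, 3 bytes -> 4 chars.
function Base64Encode(const Data: AnsiString): AnsiString;
var
  I, Remaining: Integer;
  B0, B1, B2: Byte;
begin
  Result := '';
  I := 1;
  while I <= Length(Data) do
  begin
    Remaining := Length(Data) - I + 1;
    B0 := Ord(Data[I]);
    B1 := 0;
    B2 := 0;
    if Remaining > 1 then B1 := Ord(Data[I + 1]);
    if Remaining > 2 then B2 := Ord(Data[I + 2]);

    Result := Result + Base64Alphabet[(B0 shr 2) + 1];
    Result := Result + Base64Alphabet[(((B0 and $03) shl 4) or (B1 shr 4)) + 1];
    if Remaining > 1 then
      Result := Result + Base64Alphabet[(((B1 and $0F) shl 2) or (B2 shr 6)) + 1]
    else
      Result := Result + '=';
    if Remaining > 2 then
      Result := Result + Base64Alphabet[(B2 and $3F) + 1]
    else
      Result := Result + '=';

    Inc(I, 3);
  end;
end;

begin
  // The text from the question: "-- " followed by ESC (0x1B).
  Writeln('<child>' + Base64Encode('-- '#$1B) + '</child>');
end.
```

The encoded text contains only letters, digits, '+', '/' and '=', so it is safe as ordinary element content and needs no CDATA section.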

bobince
A: 

A web service that one of my applications has to call returns XML contaminated with a lot of &#x0; characters. To work around this, I first load the XML into a WideString variable, then remove the illegal text with StringReplace() before handing the XML to an IXMLDocument interface object.
It's dirty, I know. But if you still need to process an XML file that contains illegal characters, this is the simplest option.
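A minimal sketch of that replacement, assuming the offending text is the literal character reference &#x0; and the XML is already in a WideString. RemoveNullCharRefs is a name invented for this example. Other spellings of the same reference, such as &#0; or &#x00;, would need their own replacement calls.

```delphi
program RemoveNullRefs;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: removes every literal "&#x0;" from the XML text.
function RemoveNullCharRefs(const Xml: WideString): WideString;
begin
  Result := StringReplace(Xml, '&#x0;', '', [rfReplaceAll]);
end;

begin
  Writeln(RemoveNullCharRefs('<a>x&#x0;y</a>')); // prints <a>xy</a>
end.
```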

Workshop Alex