Our project was converted from XmlDocument to XDocument a few days ago, and we have found strange behavior when XDocument.Parse processes an XML character reference in an attribute value. Sample code follows.
The XML string:
string xml = @"<char symbol=""&#x0;""/>";
The XmlDocument.LoadXml code and result:
XmlDocument xmlDocument = new XmlDocument();
xmlDocument.LoadXml(xml);
Console.WriteLine(xmlDocument.OuterXml);
Result:
<char symbol="&#x0;" />
The XDocument.Parse code and exception:
XDocument xDocument = XDocument.Parse(xml);
Console.WriteLine(xDocument.ToString());
Exception:
A first chance exception of type 'System.Xml.XmlException' occurred in System.Xml.dll
'.', hexadecimal value 0x00, is an invalid character. Line 1, position 18.
   at System.Xml.XmlTextReaderImpl.Throw(Exception e)
   at System.Xml.XmlTextReaderImpl.Throw(String res, String[] args)
   at System.Xml.XmlTextReaderImpl.Throw(Int32 pos, String res, String[] args)
   at System.Xml.XmlTextReaderImpl.ParseNumericCharRefInline(Int32 startPos, Boolean expand, StringBuilder internalSubsetBuilder, Int32& charCount, EntityType& entityType)
   at System.Xml.XmlTextReaderImpl.ParseNumericCharRef(Boolean expand, StringBuilder internalSubsetBuilder, EntityType& entityType)
   at System.Xml.XmlTextReaderImpl.HandleEntityReference(Boolean isInAttributeValue, EntityExpandType expandType, Int32& charRefEndPos)
   at System.Xml.XmlTextReaderImpl.ParseAttributeValueSlow(Int32 curPos, Char quoteChar, NodeData attr)
   at System.Xml.XmlTextReaderImpl.ParseAttributes()
   at System.Xml.XmlTextReaderImpl.ParseElement()
   at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
   at System.Xml.XmlTextReaderImpl.Read()
   at System.Xml.Linq.XDocument.Load(XmlReader reader, LoadOptions options)
   at System.Xml.Linq.XDocument.Parse(String text, LoadOptions options)
   at System.Xml.Linq.XDocument.Parse(String text)
It seems that the character reference to 0x00 is treated as an invalid character: when we change the value to a valid character such as "`", both methods work.
Is there any way to make XDocument.Parse ignore the invalid character in the attribute value, as XmlDocument.LoadXml does?
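For reference, here is a minimal sketch of one direction that may be relevant. It has not been verified against our project. It assumes that the character check comes from the reader's default settings rather than from XDocument itself, and that XmlReaderSettings.CheckCharacters can be turned off for a reader that is then handed to XDocument.Load. The Program class and variable names are only illustrative.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Same kind of input as above: a character reference to U+0000.
        string xml = @"<char symbol=""&#x0;""/>";

        // Assumption: disabling CheckCharacters relaxes the reader's
        // validation of character references.
        var settings = new XmlReaderSettings { CheckCharacters = false };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Load from the configured reader instead of calling XDocument.Parse.
            XDocument doc = XDocument.Load(reader);

            // Read the attribute value directly rather than serializing the tree.
            string symbol = (string)doc.Root.Attribute("symbol");
            Console.WriteLine(symbol.Length);
        }
    }
}
```

This sketch only reads the attribute value. Writing the document back out with ToString() or Save may still reject the character under the writer's default settings, so that part is left open.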