Is it possible to preserve whitespace inside tags?
I am accessing XML nodes (containing XHTML content) in an XPathDocument using an XPathNodeIterator.
Some of the tags in the nodes are not "strict" XHTML (and this is allowed in the final output of the tool). Some nodes contain image tags without a space before the closing "/>":
<img src="filename.png" alt="description"/>
When I store the resulting nodes, they are written out with that space added:
<img src="filename.png" alt="description" />
Is it possible to get the node contents while preserving the in-tag spacing (in this case, without the added space)? I was thinking of something similar to PreserveWhitespace.
A simplified sample of the code used:
xmlDoc = New XPathDocument(fileIn, XmlSpace.Preserve)
xmlNav = xmlDoc.CreateNavigator()
Dim xmlNode As XPathNodeIterator
Dim ns As XmlNamespaceManager = New XmlNamespaceManager(xmlNav.NameTable)
xmlNode = xmlNav.Select("/export/contents[target[@translate='True']]")
While xmlNode.MoveNext()
target = xmlNode.Current.SelectSingleNode("target").InnerXml
' ... '
End While
Some background: As Marc pointed out, there is no difference in the meaning of the resulting XML with regard to the non-significant whitespace inside the tags (or the tag order, for that matter).
The main problem I encounter is that the data comes from a CMS that handles both new and legacy content. The content creation process only recently moved to XML/XHTML, so there is still older, non-strict XHTML content in the system.
The QA tools used are still mainly text-based and built for HTML, and they are run by another department (the QA process will need to be adjusted/updated). This is why I would like to keep the tags as close to the original format as possible for now.
As a temporary workaround, I added a few regular expressions (comparing new and previous versions of the nodes) to search for and fix the "differences" introduced by parsing the XML with .NET.
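For illustration, here is a minimal sketch of the kind of regular-expression fix described above. It is not the exact code in use: the module name and the helper RemoveSpaceBeforeSelfClose are hypothetical, and the sketch assumes the only difference to undo is whitespace inserted before "/>". It applies the replacement blindly, so it would also change tags that had the space in the original (which is why the real workaround compares against the previous version first), and it would also touch a literal " />" inside text or attribute values.

```vb
Imports System
Imports System.Text.RegularExpressions

Module TagSpacingSketch
    ' Hypothetical helper: strips any whitespace that appears directly
    ' before a self-closing "/>", as added when the node is serialized.
    Function RemoveSpaceBeforeSelfClose(ByVal xml As String) As String
        Return Regex.Replace(xml, "\s+/>", "/>")
    End Function

    Sub Main()
        ' Example input as returned by InnerXml after parsing.
        Dim serialized As String = "<img src=""filename.png"" alt=""description"" />"
        ' Prints: <img src="filename.png" alt="description"/>
        Console.WriteLine(RemoveSpaceBeforeSelfClose(serialized))
    End Sub
End Module
```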