



I have an XmlReader that is trying to read text into a list of elements. I am having trouble getting it to read the text "a [ z ]". If I try the text "a [ z ]  " (the same, but with two trailing spaces), it works fine. Below is an example:

TextReader tr = new StringReader("a [ z ]");
XmlReaderSettings settings = new XmlReaderSettings
{
    ConformanceLevel = ConformanceLevel.Fragment,
    ProhibitDtd = false,
    ValidationType = ValidationType.None,
    XmlResolver = null,
    CheckCharacters = false,
    IgnoreProcessingInstructions = true
};
XmlReader reader = XmlReader.Create(tr, settings);

StringBuilder sb = new StringBuilder();

while (!reader.EOF)
{
    if (reader.NodeType == XmlNodeType.Text || reader.NodeType == XmlNodeType.Whitespace)
    {
        // Reading Value is where the exception is thrown (see the stack trace).
        sb.Append(reader.Value);
    }
    reader.Read();
}

// sb.ToString() should be "a [ z ]"

When run, it fails with the message "System.Xml.XmlException : Unexpected end of file has occurred. Line 1, position 7." and this stack trace:

at System.Xml.XmlTextReaderImpl.Throw(Exception e) 
at System.Xml.XmlTextReaderImpl.ParseText(Int32& startPos, Int32& endPos, Int32& outOrChars)
at System.Xml.XmlTextReaderImpl.FinishPartialValue()
at System.Xml.XmlTextReaderImpl.get_Value()
at LocalisationFormats.Tests.Shared.InlineElements.InlineElementHelperTest.Test()

When you attempt to debug it, the reader has a ReadState of "Error" and reader.Value is "a [ z". If you then break into the reader, you get an OutOfMemoryException.

Does anyone have any suggestions?

EDIT: removed extra if block from code snippet on suggestion from Gregoire.

+2  A: 

I believe the problem is that you are loading a non-XML-formatted string into an XmlReader object.

"XmlReader provides forward-only, read-only access to a stream of XML data. The XmlReader class conforms to the W3C Extensible Markup Language (XML) 1.0 and the Namespaces in XML recommendations." & "XmlReader throws an XmlException on XML parse errors." - MSDN XmlReader Class Article

Try loading and reading actual XML data instead, by changing:

TextReader tr = new StringReader("a [ z ]");

to:

TextReader tr = new StringReader("<node>a [ z ]</node>");

or alternately, if you need each piece in its own node:

TextReader tr = new StringReader("<node>a</node><node> </node><node>[</node><node> </node><node>z</node><node> </node><node>]</node>");

I'm providing complete source for the latter example, because I THINK that's what you're aiming at here.

TextReader tr = new StringReader("<node>a</node><node> </node><node>[</node><node> </node><node>z</node><node> </node><node>]</node>");
XmlReaderSettings settings = new XmlReaderSettings
{
    ConformanceLevel = ConformanceLevel.Fragment,
    ProhibitDtd = false,
    ValidationType = ValidationType.None,
    XmlResolver = null,
    CheckCharacters = false,
    IgnoreProcessingInstructions = true
};
XmlReader reader = XmlReader.Create(tr, settings);

StringBuilder sb = new StringBuilder();

while (!reader.EOF)
{
    // Reads the text content of the current <node> and moves past its end tag.
    string s = reader.ReadElementString();

    if (s != " ")
    {
        // Keep everything except the single-space nodes.
        sb.Append(s);
    }
}

This will allow you to iterate through the nodes, getting the full string values with no exceptions.
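
For the first variant (the whole string wrapped in a single element), a minimal sketch might look like the following. The class name FragmentTextDemo, the method ReadText and the element name "wrapper" are all made up for illustration, and the sketch assumes the input contains plain text only; any inline elements inside it would simply be skipped by this loop.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class FragmentTextDemo
{
    // Hypothetical helper: wraps the text in a root element so the reader
    // sees a complete document, then collects every text/whitespace node.
    public static string ReadText(string text)
    {
        // "wrapper" is an arbitrary element name chosen for this sketch.
        string xml = "<wrapper>" + text + "</wrapper>";

        XmlReaderSettings settings = new XmlReaderSettings
        {
            ConformanceLevel = ConformanceLevel.Document
        };

        StringBuilder sb = new StringBuilder();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Read() returns false once the end of the input is reached.
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Text ||
                    reader.NodeType == XmlNodeType.Whitespace ||
                    reader.NodeType == XmlNodeType.SignificantWhitespace)
                {
                    sb.Append(reader.Value);
                }
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ReadText("a [ z ]"));
    }
}
```

Because the text now ends with a closing tag rather than with the end of the input, the trailing "]" is no longer the last thing the parser sees.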


I thought that setting the XmlReader.ConformanceLevel to Fragment would mean it could parse any well-formed XML fragment, and I thought my text was well-formed XML (just without a root node).
Well-formed XML has to be at LEAST a node, but it does not need to follow the single-root-element rule.

I've checked, and this has been fixed in .NET 4 but is still broken in .NET 3.5 as of this post.
