In reading this recent question about an unhandled XmlException, I tried to reproduce it in both a .NET 2.0 and 3.5 console application.

However, in my code it behaves exactly as expected: the XmlDocument.Load method throws an XmlException because the source XML file contains a NULL character.

So why does the Load statement in the following code (from that example) not throw an XmlException? Even more to the point, why is the XmlException not handled by the valid try block surrounding the SelectNodes() method call?

While I am guessing there may be some sort of lazy loading / caching going on internally, isn't this sort of behavior very unintuitive and confusing?

(The earlier question clearly shows a screenshot of the debugger complaining that SelectNodes() has thrown an XmlException but that it is unhandled???)

    XmlDocument xDoc = new XmlDocument();
    xDoc.Load(File.FullName);

    //work through each print batch in this queue file
    try
    {
        // This line throws an XmlException but is not handled by the catch!
        XmlNodeList nodeList = xDoc.SelectNodes("Reports/PrintBatch");

        foreach (XmlNode printBatch in nodeList)//xDoc.SelectNodes("Reports/PrintBatch"))
        {
            PrintBatch batch = new PrintBatch();
            batch.LoadBatch(printBatch, File.Extension);
            this.AddBatch(batch);
        }
    }
    catch (XmlException e)
    {
        //this report had an error loading!
        Console.WriteLine(e.Message);
    }
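For comparison, here is a minimal sketch of the behavior I see in my own test. It is an assumption-laden stand-in, not the original poster's code: it loads the XML from an in-memory string instead of a file, and the class and helper names (NullCharRepro, LoadAndCapture) are made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class NullCharRepro
{
    // Hypothetical helper: returns the XmlException raised by Load,
    // or null if the document parses without error.
    public static XmlException LoadAndCapture(string xml)
    {
        XmlDocument doc = new XmlDocument();
        try
        {
            // Load parses the whole document up front.
            doc.Load(new StringReader(xml));
            return null;
        }
        catch (XmlException e)
        {
            return e;
        }
    }

    public static void Main()
    {
        // A NULL character (U+0000) embedded in element text.
        string xml = "<Reports><PrintBatch>a\0b</PrintBatch></Reports>";
        XmlException error = LoadAndCapture(xml);
        if (error != null)
        {
            Console.WriteLine("Load threw: " + error.Message);
            Console.WriteLine("Line " + error.LineNumber + ", position " + error.LinePosition);
        }
    }
}
```

With the NULL character inside the element text, the exception comes out of Load itself, before SelectNodes() is ever reached.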
+1  A: 

There could be many reasons why you get an exception while he doesn't, and they are most likely related to the location of the NULL character. According to his stack trace, his NULL character appears to be at the end of the XML, at position 115227. It could be that the text before it is simply valid XML and that an additional NULL character was added to the end of the file by accident. Where do you have your NULL character?

Or, his NULL character is located inside an attribute or element and is considered to be part of the text. It might also depend on the XML being UTF-8, UTF-16 or another encoding type. There are too many variables to consider.


When the NULL character is at the end, the whole file just happens to be a nice, null-terminated string. Still, as you say, it's weird that it's considered to be an unhandled exception while it's inside a try-catch block...

There's some interesting reading here about catching unhandled exceptions, but it doesn't explain why they happen.

But if I have to guess... Behind the XML class there's a bunch of unmanaged code. Because of the NULL character, this unmanaged code becomes confused and will produce an error when it's released. The call to SelectNodes() triggers a validation, which discovers the error and raises it. The system starts to process the exception handler, but it tries to free xDoc first, because xDoc is not used inside or after the catch block. This frees the unmanaged code, but the unmanaged code is still confused, so it raises another exception. That would prevent the catch from handling the exception. You could test this by adding a second xDoc.Load() after the catch block, which would prevent xDoc from being freed before the catch runs.

Still, this is just a guess... Seems like a .NET bug to me.

Workshop Alex
Tested that on a 120K file. A NULL inside the XML throws, NULL at the end of the document does not cause any problems in Load() or SelectNodes(). But the real question is why XmlException is considered unhandled when SelectNodes() is called **within** a try block that explicitly handles XmlException?
Ash
+2  A: 

The exception is always thrown by XmlDocument.Load as expected.

It's just that sometimes the debugger gets the line number wrong. In my experience, the next line of code incorrectly being highlighted as the thrower of an exception is not uncommon.

You can see this in the screenshots: the ASP error page correctly shows that XmlDocument.Load is the thrower, NOT the SelectNodes statement.
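If you don't want to trust the highlighted line, one way to check is to put both calls inside the try block and look at the exception's own stack trace. This is only a sketch under assumptions: it parses an in-memory string rather than the original file, and ThrowerCheck, TraceMentions and FindThrower are hypothetical names used for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ThrowerCheck
{
    // Hypothetical helper: true if the exception's stack trace mentions the given method name.
    public static bool TraceMentions(Exception e, string methodName)
    {
        return e.StackTrace != null && e.StackTrace.Contains(methodName);
    }

    // Runs Load and SelectNodes inside one try block and reports which call,
    // according to the stack trace, raised the XmlException.
    public static string FindThrower(string xml)
    {
        XmlDocument doc = new XmlDocument();
        try
        {
            doc.Load(new StringReader(xml));
            doc.SelectNodes("Reports/PrintBatch");
            return "none";
        }
        catch (XmlException e)
        {
            if (TraceMentions(e, "XmlDocument.Load")) return "Load";
            if (TraceMentions(e, "SelectNodes")) return "SelectNodes";
            return "unknown";
        }
    }

    public static void Main()
    {
        // Same kind of input as in the question: a NULL character inside the XML.
        Console.WriteLine(FindThrower("<Reports><PrintBatch>a\0b</PrintBatch></Reports>"));
    }
}
```

The stack trace is recorded when the exception is thrown, so it does not depend on whether the debugger's symbols line up with the source being displayed.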

Wim Coenen
You look to be correct. I checked the implementation of Load and SelectNodes in Reflector and it appears only Load calls XmlLoader.LoadNode() (as shown in the stack trace). Therefore the debugging symbols may be out of sync with the code shown.
Ash