Is there a way to get more useful information on a validation error? XmlSchemaException provides the line number and position of the error, which makes little sense to me. An XML document, after all, is not about its transient textual representation. I'd like to get an enumerated error (or an error code) specifying what went wrong, plus the node name (or an XPath) locating the source of the problem, so that perhaps I can try to fix it.

Edit: I'm talking about well-formed XML documents that are just not valid against a particular schema!

A: 

Personally, I'm not sure how to get a more detailed error. Typically, if you open the document and go to the location mentioned, you can easily find the error.

If the code isn't able to parse the file as valid XML, it is pretty hard for it to give an XPath or other named XML detail.

Mitchel Sellers
+1  A: 

In my experience, you are lucky to get a line number and parse position.

davetron5000
+1  A: 

You might consider validating via a DTD, which can sometimes give slightly more interesting errors. However, on a project I currently work on, we validate using XSLTs. The transform checks the syntax and reports errors as text in the transform output. I would consider that route if you want friendlier error checking. For us, an empty output means no errors; otherwise we get some nice detail from the XSLT processing on what the error was and where.
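
As a rough illustration of the XSLT approach described above (not the stylesheets from that project), here is a minimal C# sketch. The `XsltRuleChecker` class, the `order`/`item`/`qty` vocabulary, and the two rules are all hypothetical; only `XslCompiledTransform` and the other `System.Xml` types are real .NET APIs. Empty output means no rule was violated.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XsltRuleChecker
{
    // Hypothetical rules for an <order> document, invented for this example.
    // Each violated rule writes one line of text that includes an XPath-like location.
    private const string Rules = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='text'/>
  <xsl:template match='/'>
    <xsl:if test='not(order)'>
      <xsl:text>/: root element must be 'order'.&#10;</xsl:text>
    </xsl:if>
    <xsl:for-each select='order/item[not(@qty)]'>
      <xsl:text>/order/item[</xsl:text>
      <xsl:value-of select='count(preceding-sibling::item) + 1'/>
      <xsl:text>]: missing 'qty' attribute.&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>";

    // Runs the rule stylesheet over the document and returns its text output.
    // An empty string means none of the rules fired.
    public static string Check(string xml)
    {
        var xslt = new XslCompiledTransform();
        using (var rulesReader = XmlReader.Create(new StringReader(Rules)))
        {
            xslt.Load(rulesReader);
        }

        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var output = new StringWriter())
        {
            xslt.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

Calling `XsltRuleChecker.Check("<order><item qty='1'/><item/></order>")` should report the second `item`, because the stylesheet computes the sibling index itself. The trade-off is that every rule has to be written by hand instead of being derived from a schema.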

Jeff Yates
Thanks! You are talking about something like schematron. I am currently doing the same thing but it does not help me with the task of trying to repair the offending xml document at runtime.
Goran
If you write your own XSLT, you will have better luck at repairing. Our XSLTs are two-fold: one for syntax and one for validation (our XML is kind of like scripting). Both parts provide meaningful information that can be used, to some extent, to rebuild broken information.
Jeff Yates
A: 

You can accomplish this, sort of, by setting up an XmlReader whose XmlReaderSettings contain the schema and then using it to read through the input stream node by node. You can keep track of the last node read and have a pretty good idea of where you are in the document when a validation error happens.

I think that if you try this exercise, you'll discover that there are a lot of validation errors (e.g. required element missing) where the concept of the error node doesn't make much sense. Yes, the parent element is clearly what's in error in that case, but what really triggered the error was the reader encountering the end tag without ever seeing the required element, which is why the error line and position point at the end tag.
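
The node-by-node approach described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a complete solution: the `PathTrackingValidator` class and its `Validate` method are hypothetical names, and the path is just a list of element names without namespaces or sibling indexes. The validation handler runs inside `reader.Read()`, so the path it sees is the set of elements that were open before the current node was read.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class PathTrackingValidator
{
    // Returns one "path: message" string per validation error.
    // The path names the elements that were open when the error was raised.
    public static List<string> Validate(string xml, string xsd)
    {
        var errors = new List<string>();
        var path = new List<string>();   // names of the currently open elements

        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
        // A null target namespace means "use the one declared in the schema".
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(xsd)));
        settings.ValidationEventHandler += (sender, e) =>
            errors.Add("/" + string.Join("/", path) + ": " + e.Message);

        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    // Empty elements (<a/>) never produce an EndElement node,
                    // so they are not pushed onto the path.
                    if (!reader.IsEmptyElement) path.Add(reader.Name);
                }
                else if (reader.NodeType == XmlNodeType.EndElement)
                {
                    path.RemoveAt(path.Count - 1);
                }
            }
        }
        return errors;
    }
}
```

With a schema that requires an `item` child inside `order`, validating `<order><id>1</id></order>` should yield a single entry beginning with `/order:`, since the error is raised when the reader reaches the `</order>` end tag while `order` is still on the path. An error on an unexpected child element is reported against its parent for the same reason; the message text itself is whatever the framework produces.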

Robert Rossney
That's another thing. The validation messages are unstructured. Is there a comprehensive list of validation messages out there? I'm thinking about writing a wrapper to return something like error enums or error codes.
Goran
I've never found one. In practice, I've never needed one. The real use case for schemas isn't "validate an instance document and tell me what's wrong with it," it's "guide me in developing a process that produces valid XML documents."
Robert Rossney
A: 

It seems this is no easy task. Robert Rossney's answer comes closest to programmatically solving my problem, so I'll accept that for now. I'll continue using the XSLT solution. Anyone who finds a better way to resolve validation errors can respond to this thread.

Goran