You can check an XPath expression by running it against an XML document, but is there an easy way to check the same expression against the schema for that document?

Say I have an XSD schema like this:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" ... etc>
  <xsd:element name="RootData">
    <xsd:complexType>
      <xsd:sequence minOccurs="0">
        <xsd:element name="FirstChild">
          <xsd:complexType>
            <xsd:sequence minOccurs="0">
              <xsd:element name="FirstGrandChild">
... etc etc

Is there an easy or built-in way to verify that the XPath:

/RootData/FirstChild/FirstGrandChild

would be valid against any XML documents that may be based on that schema? (Edit: I guess I mean potentially valid; the actual XML document might not contain those elements, but that XPath could still be considered potentially valid for the schema. Whereas, say, /RootData/ClearlyInvalidChild/ThisElementDoesntExistEither is clearly invalid.)

Of course I could only expect this to work against canonical XPath expressions rather than ones of arbitrary complexity, but that's fine.

I'm specifically thinking of .NET, but I'm curious whether other implementations make it possible. It's not important enough that I'd want to roll my own; for example, I don't really want to write code that transforms that XPath expression into another one like:

/xsd:schema/xsd:element[@name='RootData']/xsd:complexType/xsd:sequence/xsd:element[@name='FirstChild']/...etc...

... though I know I could do that if I really had to.

Cheers!

+1  A: 

At design time, you could use a tool to generate a sample XML document from the schema and execute your XPath against the sample. Altova XMLSpy has this feature, as does SoapUI.

SoapUI is actually open source (Java), so maybe you can take a peek and see how it generates the samples. In a runtime situation (i.e. if the schema and the XPath are both inputs to a running program), you'd have to make sure that enough optional components and sample data are generated to avoid false negatives, and you may need to generate multiple example files.

I wouldn't try to evaluate the XPath against the schema directly, as the various axes would make a complete solution very complicated. I'm pretty sure it could be done, but it strikes me as hard-core mathematics. I propose generating samples as a shortcut.
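
A minimal sketch of the sample-generation idea, in Python with only the standard library. It is not how XMLSpy or SoapUI do it. It assumes the schema uses only inline anonymous complex types with `xsd:sequence` (no `ref`, named types, or `xsd:choice`), and it emits every optional child once so that optional elements do not cause false negatives. The names `build_sample` and `matches_sample` are made up for illustration, and the schema is the question's fragment completed with an assumed `xsd:string` leaf.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# The question's fragment, closed off with an assumed string-typed leaf.
SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="RootData">
    <xsd:complexType>
      <xsd:sequence minOccurs="0">
        <xsd:element name="FirstChild">
          <xsd:complexType>
            <xsd:sequence minOccurs="0">
              <xsd:element name="FirstGrandChild" type="xsd:string"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""


def build_sample(decl):
    """Build an instance element for an xsd:element declaration,
    emitting each declared child once regardless of minOccurs."""
    node = ET.Element(decl.get("name"))
    path = f"{XSD}complexType/{XSD}sequence/{XSD}element"
    for child_decl in decl.findall(path):
        node.append(build_sample(child_decl))
    return node


def matches_sample(schema_text, xpath):
    """Return True if a plain /a/b/c path selects something in a
    generated sample rooted at one of the schema's global elements."""
    schema = ET.fromstring(schema_text)
    steps = xpath.strip("/").split("/")
    for decl in schema.findall(f"{XSD}element"):
        sample = build_sample(decl)
        if sample.tag != steps[0]:
            continue
        # ElementTree evaluates paths relative to the root element,
        # so the first step is checked by hand above.
        if len(steps) == 1 or sample.find("/".join(steps[1:])) is not None:
            return True
    return False


print(matches_sample(SCHEMA, "/RootData/FirstChild/FirstGrandChild"))  # True
print(matches_sample(
    SCHEMA, "/RootData/ClearlyInvalidChild/ThisElementDoesntExistEither"))  # False
```

A real generator would also need to cover `xsd:choice` branches (hence the multiple example files) and stop somewhere on recursive types.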

Simon Gibbs
+1  A: 

We actually did a research project on this, and implemented an XPath verifier, sometime around 2000. This was for XPath 1.0. I am not aware of any currently available libraries that you can use to do this.

If you want to go and implement this yourself, here are some hints:

  • You will not be able to transform a path over an instance document into a path over a schema as you do above. For example, /a//b does not transform into /xsd:element[@name='a']//xsd:element[@name='b'], because element b may be defined at the top level of the schema, not underneath a.

  • Remember that while an XML document is a tree, a schema is a graph. If you search descendant paths like //a, you will have to decide when to terminate the search, or it may continue forever (e.g. imagine an element "a" that contains "b", which in turn contains "a").

  • Some paths will be undecidable, or at least very hard to decide, for example //*[starts-with(@name, 'foo')].

If you're still up for it, I suggest using a library like Eclipse's XSD or the .NET schema loading classes to load the schema into memory and do your checking in code.
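
To illustrate the hints above, here is a rough Python sketch (standard library only, not the verifier from our project) that treats the schema as a graph and walks it. It handles only plain `/name` and `//name` steps, resolves `ref` and named complex types by local name, and ignores predicates, wildcards, namespaces, `xsd:group`, and type extension. `SchemaGraph`, `children`, `closure`, and `can_match` are hypothetical names, and the recursive a/b schema is invented for the example.

```python
import re
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"
GROUPS = {XSD + "sequence", XSD + "choice", XSD + "all"}

# Invented example: "a" contains "b", which contains "a" again.
RECURSIVE_SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="a">
    <xsd:complexType><xsd:sequence>
      <xsd:element ref="b" minOccurs="0"/>
    </xsd:sequence></xsd:complexType>
  </xsd:element>
  <xsd:element name="b">
    <xsd:complexType><xsd:sequence>
      <xsd:element ref="a" minOccurs="0"/>
      <xsd:element name="leaf" type="xsd:string"/>
    </xsd:sequence></xsd:complexType>
  </xsd:element>
</xsd:schema>
"""


class SchemaGraph:
    def __init__(self, schema_text):
        root = ET.fromstring(schema_text)
        self.elements = {e.get("name"): e for e in root.findall(XSD + "element")}
        self.types = {t.get("name"): t for t in root.findall(XSD + "complexType")}

    def _resolve(self, decl):
        # <xsd:element ref="b"/> points at a global declaration.
        ref = decl.get("ref")
        return self.elements[ref.split(":")[-1]] if ref else decl

    def _particles(self, node):
        # Flatten nested sequence/choice/all into child element declarations.
        for child in node:
            if child.tag in GROUPS:
                yield from self._particles(child)
            elif child.tag == XSD + "element":
                yield self._resolve(child)

    def children(self, decl):
        # None stands for the document node: any global element may be the root.
        if decl is None:
            return list(self.elements.values())
        ctype = decl.find(XSD + "complexType")
        if ctype is None and decl.get("type"):
            ctype = self.types.get(decl.get("type").split(":")[-1])
        return [] if ctype is None else list(self._particles(ctype))

    def closure(self, starts):
        # descendant-or-self over the graph; the visited set is what
        # stops the search on cycles such as a -> b -> a.
        seen, order, stack = set(), [], list(starts)
        while stack:
            decl = stack.pop()
            if id(decl) in seen:
                continue
            seen.add(id(decl))
            order.append(decl)
            stack.extend(self.children(decl))
        return order

    def can_match(self, xpath):
        steps = re.findall(r"(//?)([^/]+)", xpath)
        current = [None]
        for axis, name in steps:
            if axis == "//":
                current = self.closure(current)
            found = {id(c): c for d in current
                     for c in self.children(d) if c.get("name") == name}
            current = list(found.values())
            if not current:
                return False
        return bool(steps)


graph = SchemaGraph(RECURSIVE_SCHEMA)
print(graph.can_match("/a/b/a/b"))  # True
print(graph.can_match("/a//leaf"))  # True
print(graph.can_match("/a/leaf"))   # False
```

Note that /a/b is accepted here even though b is declared at the top level rather than underneath a, which is exactly the case a textual rewrite of the path would miss.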

xcut