Hi,

I need to come up with some kind of rules engine, where I can specify validation rules for specific xml files. Then the rules are read in and asserted against the xml.

What's the best way? Is it XPath, XQuery, XSL, or even XSD? Or maybe even XMLUnit?

I'll need to do things like detect when certain attributes differ between similar nodes, e.g.:

<root>
  <customers>
    <customer name="Fred">
      <contact-details email="[email protected]" />
    </customer>
    <customer name="Barney">
      <contact-details email="[email protected]" />
    </customer>
    <customer name="Fred">
      <contact-details email="[email protected]" />
    </customer>
  </customers>
</root>

So one rule would detect that the email addresses for the two "Fred" customers above are different. It also needs to validate the data in the XML against external reference data, e.g. postcode lookups.

Any suggestions as to what I should use to specify rules easily and have them run reasonably quickly? The XML and external data are moderately large: each XML file can be several MB.

Some of the rules would be simple and can be done in XPath - check that fields are a certain length, or check that certain nodes have certain attributes populated - and XSD is suitable here too. But neither covers checking against external data or the cross-node comparison above. Is XQuery what I want?

Thanks if you can help.

-Justin

A: 

Is there an all of the above?

XSD will provide you with basic XML validation. It will allow you to make sure that elements have the right children and the right attributes, are in the right order, etc. If you're comparing XML against external XML data then you'll want to use XMLUnit. If you're comparing fields in the XML against external fields from somewhere else (e.g. making sure a postal code is a valid postal code) then you might want to use XPath to pull out the fields. You'll want to learn the capabilities of each tool, as they can all assist with validation.
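
To make the XPath-plus-external-data idea concrete, here is a minimal sketch in Python using the standard library's ElementTree, which supports a limited XPath subset. The `postcode` attribute, the sample postcodes, and the `VALID_POSTCODES` set are assumptions for illustration only; they are not in the original XML, and real reference data would come from wherever your lookups live.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the `postcode` attribute is an assumption,
# it does not appear in the XML from the question.
SAMPLE = """
<root>
  <customers>
    <customer name="Fred"><contact-details postcode="AB1 2CD" /></customer>
    <customer name="Barney"><contact-details postcode="ZZ9 9ZZ" /></customer>
  </customers>
</root>
"""

# Stand-in for external reference data (e.g. loaded from a file or database).
VALID_POSTCODES = {"AB1 2CD", "EF3 4GH"}


def invalid_postcodes(xml_text, valid):
    """Return (customer name, postcode) pairs whose postcode is not in `valid`."""
    root = ET.fromstring(xml_text)
    failures = []
    # Path expression selecting every customer element under <customers>.
    for customer in root.findall("./customers/customer"):
        details = customer.find("contact-details")
        postcode = details.get("postcode") if details is not None else None
        if postcode not in valid:
            failures.append((customer.get("name"), postcode))
    return failures


print(invalid_postcodes(SAMPLE, VALID_POSTCODES))
```

Holding the reference data in a set keeps each lookup cheap, which should matter once the files reach several MB.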

Pace
A: 

You should take a look at Schematron

The Schematron differs in basic concept from other schema languages in that it is not based on grammars but on finding tree patterns in the parsed document. This approach allows many kinds of structures to be represented which are inconvenient and difficult in grammar-based schema languages. If you know XPath or the XSLT expression language, you can start to use The Schematron immediately.
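
To illustrate the "tree pattern" idea (a context to match, plus an assertion that must hold for each match), here is a rough Python sketch using only the standard library. It is not Schematron syntax and not how Schematron is implemented. The names `RULES`, `validate` and `same_name_same_email` are hypothetical, and the example.com addresses are placeholders, since the addresses in the question are not shown.

```python
import xml.etree.ElementTree as ET

# Placeholder addresses (assumption): stand-ins for the ones in the question.
SAMPLE = """
<root>
  <customers>
    <customer name="Fred"><contact-details email="fred@example.com" /></customer>
    <customer name="Barney"><contact-details email="barney@example.com" /></customer>
    <customer name="Fred"><contact-details email="fred2@example.com" /></customer>
  </customers>
</root>
"""


def email_of(customer):
    """Return the email attribute of a customer's contact-details, or None."""
    details = customer.find("contact-details")
    return details.get("email") if details is not None else None


def same_name_same_email(root, customer):
    """Assertion: every customer sharing this name also shares this email."""
    name = customer.get("name")
    # Names containing a quote would need escaping; ignored in this sketch.
    peers = root.findall(f"./customers/customer[@name='{name}']")
    return all(email_of(peer) == email_of(customer) for peer in peers)


# Each rule: (context path, failure message, assertion taking (root, node)).
RULES = [
    ("./customers/customer",
     "customers with the same name must share one email",
     same_name_same_email),
]


def validate(xml_text, rules):
    """Apply every rule to every node its context matches; collect failures."""
    root = ET.fromstring(xml_text)
    failures = []
    for context, message, assertion in rules:
        for node in root.findall(context):
            if not assertion(root, node):
                failures.append((node.get("name"), message))
    return failures


for name, message in validate(SAMPLE, RULES):
    print(f"{name}: {message}")
```

Because the rules are plain data, they can be maintained separately from the XML they check. Note that this particular assertion rescans the customers for every node, so for multi-MB files you would probably group by name first.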

Mads Hansen
Thanks Mads, I'll take a look! How do I get it to validate against external references? Do I have to merge these into the XML under test? I like that approach: I could maintain the Schematron rules externally, then construct a Schematron document from them that I apply against the XML. Definitely could work! Thanks again, Justin
Justin