Hi,

I've always understood XML Schemas and DTDs to be equivalent, except that DTDs are more cumbersome to use when modeling complex relationships (like inheritance).

Recently I wanted to build a schema to validate documents that have a structure like this:

<data>
 <array>
   <int></int>
   <int></int>
 </array>
 <array>
   <float></float>
   <float></float>
 </array>
 <int></int>
 <float></float>
</data>

The elements inside <data> can appear in any order, and each has cardinality 0..*. Using XML Schema, if I define a complex type using <xs:all>, I can have the elements out of order, but the maximum cardinality is 1. <xs:sequence> and <xs:choice> are the other obvious candidates, but they seem more restrictive than what I want.

Then I noticed that a DTD seems to be able to achieve this like so:

<!ELEMENT data (array | float | int)*>

Is there any way to build an equivalent schema or do I have to use DTDs here?

+1  A: 

This is only doable in XSD if your elements keep a fixed order, so that you can use xs:sequence. That is, any int elements always come after the arrays, and any float elements always come after the ints. Each type can be repeated as many times as you wish, or omitted completely.

The reason is that the XSD xs:all group does not allow maxOccurs="unbounded" on its child elements (in XSD 1.0 each child may occur at most once). Other, more "relaxed" schema languages allow this, such as DTD, as you state, or RelaxNG.
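To see the xs:all restriction in practice, here is a minimal Java sketch that tries to compile a schema giving xs:all children unbounded cardinality. It assumes the XSD 1.0 processor bundled with the JDK (javax.xml.validation); the class name AllLimitDemo and the tiny schema are made up for illustration. With such a processor the schema should be rejected at compile time, so compiles() should return false.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class AllLimitDemo {
    // Hypothetical schema: xs:all whose children ask for maxOccurs="unbounded".
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='data'><xs:complexType><xs:all>"
      + "<xs:element name='int' type='xs:integer' minOccurs='0' maxOccurs='unbounded'/>"
      + "<xs:element name='float' type='xs:float' minOccurs='0' maxOccurs='unbounded'/>"
      + "</xs:all></xs:complexType></xs:element></xs:schema>";

    // Returns true if the schema compiles, false if the processor rejects it.
    static boolean compiles(String xsd) {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            factory.newSchema(new StreamSource(new StringReader(xsd)));
            return true;
        } catch (SAXException e) {
            // An XSD 1.0 processor reports the xs:all occurrence limit here.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("schema accepted: " + compiles(XSD));
    }
}
```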

Here is a sample XSD that fits your XML file:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
    <xs:complexType name="arrayType">
      <xs:sequence>
       <xs:element name="array" type="arrayType" minOccurs="0" maxOccurs="unbounded"/>
       <xs:element name="int" minOccurs="0" maxOccurs="unbounded"/>
       <xs:element name="float" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
    <xs:element name="data" type="arrayType"/>
</xs:schema>
Fernando Miguélez
Thank you Fernando. I can't assume a particular sequence due to the way the data will be generated so I'll look at some alternatives. I don't want to use a DTD because I can't validate the element data except very loosely. That leaves RelaxNG *fingers crossed*.
Daniel
A: 

I thought I'd come back to this, as the previous answer is incorrect. In fact, one can solve the original problem using XML Schema.

The correct approach is to define a group that contains a choice between the various options (ints, floats, arrays), and then give the group as a whole a cardinality of 0..*.

<xs:group name="dataTypesGroup">
    <xs:choice>
        <xs:element name="int" type="intType"/>
        <xs:element name="float" type="floatType"/>
        <xs:element name="array">
            <xs:complexType>
                <xs:choice>
                    <xs:element name="int" type="xs:integer" minOccurs="0" maxOccurs="unbounded"/>
                    <xs:element name="float" type="xs:float" minOccurs="0" maxOccurs="unbounded"/>
                </xs:choice>
                <xs:attribute name="id" use="required"></xs:attribute>
            </xs:complexType>    
        </xs:element>
    </xs:choice>
</xs:group>

From here, it remains to reference the group in a complexType definition and set the cardinality of the group reference to 0..*:

<xs:element name="data">
    <xs:complexType>
        <xs:group ref="dataTypesGroup" minOccurs="0" maxOccurs="unbounded"/>
    </xs:complexType>
</xs:element>
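As a rough check of this approach, the two fragments can be assembled into one schema and run through the validator that ships with the JDK. This is only a sketch under stated assumptions: intType and floatType are not shown in this answer, so xs:integer and xs:float stand in for them; data is declared as a global element without occurrence attributes; and the class name ChoiceGroupDemo and the sample documents (including the id values) are invented for the test.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class ChoiceGroupDemo {
    // Group plus referencing element, assembled from the fragments above.
    // Assumption: xs:integer / xs:float replace the undefined intType / floatType.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:group name='dataTypesGroup'><xs:choice>"
      + "<xs:element name='int' type='xs:integer'/>"
      + "<xs:element name='float' type='xs:float'/>"
      + "<xs:element name='array'><xs:complexType><xs:choice>"
      + "<xs:element name='int' type='xs:integer' minOccurs='0' maxOccurs='unbounded'/>"
      + "<xs:element name='float' type='xs:float' minOccurs='0' maxOccurs='unbounded'/>"
      + "</xs:choice><xs:attribute name='id' use='required'/>"
      + "</xs:complexType></xs:element>"
      + "</xs:choice></xs:group>"
      + "<xs:element name='data'><xs:complexType>"
      + "<xs:group ref='dataTypesGroup' minOccurs='0' maxOccurs='unbounded'/>"
      + "</xs:complexType></xs:element>"
      + "</xs:schema>";

    // Made-up document: children of <data> in arbitrary order, each repeated freely.
    static final String MIXED_ORDER =
        "<data><float>1.5</float><array id='a'><int>1</int><int>2</int></array>"
      + "<int>3</int><array id='b'><float>2.5</float></array><float>4.0</float></data>";

    // Made-up document containing an element the schema does not declare.
    static final String UNKNOWN_CHILD = "<data><string>x</string></data>";

    // Returns true if the XML validates against XSD, false on a validation error.
    static boolean isValid(String xml) {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema;
        try {
            schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // document violates the schema
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("mixed order valid: " + isValid(MIXED_ORDER));
        System.out.println("unknown child valid: " + isValid(UNKNOWN_CHILD));
    }
}
```

Note that the inner xs:choice inside array picks a single branch, so an array holds either ints or floats but not a mix. Note also that empty elements such as <int></int> would not pass once int is typed as xs:integer, which is why the sample documents carry values.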

Et voilà. It is a bit verbose (especially compared to RelaxNG's syntax), but the upside is that XML Schema is much better supported. I had crafted a RelaxNG-based parser to solve the original problem, but the available validators (like Jing) are rather clunkier than the XML Schema tools that ship with Java et al.

Daniel