I've been looking at this issue for too long. I suspect I'm missing something obvious because I'm overfamiliar with it.

I have a schema that suffers from a Unique Particle Attribution (UPA) violation error. I can see why, but I've spent too long fiddling with it to be able to step back and solve the problem.

How do I phrase this schema so that it can validate the content I need to model?

The content model looks something like:

<document>
    <extract>...</extract>
    <structure>...</structure>
    <structure>...</structure>
</document>

OR

<document>
    <structure>...</structure>
    <structure>...</structure>
</document>

OR

<document>
    <extract>...</extract>
    <extract>...</extract>
</document>

That is, a document element can contain one or more extract elements, one or more structure elements, or a single extract element followed by one or more structure elements.

I have an (incorrect) schema that looks like:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="document" type="Document"/>
    <xs:complexType name="Document">
        <xs:choice>
            <xs:sequence>
                <xs:element ref="extract" minOccurs="0"/>
                <xs:element ref="structure" minOccurs="1" maxOccurs="unbounded"/>
            </xs:sequence>
            <xs:element maxOccurs="unbounded" ref='extract'/>
        </xs:choice>
    </xs:complexType>

    <xs:element name="extract" type="xs:string"/>
    <xs:element name="structure" type="xs:string"/>

</xs:schema>

(This is a stripped-down version of a much more complex schema.)

cheers

nic

+1  A: 

So you need a DTD-style content model of:

extract+|structure+|extract,structure+

The issue is that a leading extract doesn't determine which branch is being taken: it could begin either extract+ or extract,structure+, and the validator has to pick a branch without looking ahead. But we can rewrite the content model like this:

extract,(structure+|extract*)|structure+

You can see this is the same if you "expand out" the inner choice as if this were algebra:

extract,structure+|extract,extract*|structure+
extract,structure+|extract+|structure+     [[ extract,extract* === extract+ ]]
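
If you'd rather not trust the algebra, the equivalence can be brute-forced. This is only a sketch: it assumes each content model can be treated as a regular expression over one-letter codes ('e' for extract, 's' for structure), and the names ORIGINAL, REWRITTEN and accepted are made up for this illustration. It compares what the two models accept for every child sequence up to a fixed length, so it is a sanity check rather than a proof.

```python
import re
from itertools import product

# 'e' stands for an <extract> child, 's' for a <structure> child.
ORIGINAL = re.compile(r"e+|s+|es+")        # extract+|structure+|extract,structure+
REWRITTEN = re.compile(r"e(s+|e*)|s+")     # extract,(structure+|extract*)|structure+

def accepted(pattern, max_len=8):
    """Return every child sequence of length <= max_len that the model accepts."""
    return {
        "".join(seq)
        for n in range(max_len + 1)
        for seq in product("es", repeat=n)
        if pattern.fullmatch("".join(seq))
    }

# True when both models accept exactly the same sequences up to max_len.
same = accepted(ORIGINAL) == accepted(REWRITTEN)
print(same)
```

Note that Python's regex engine backtracks, so it accepts the ambiguous original form without complaint; an XSD validator isn't allowed to, which is why only the rewritten form is usable in the schema.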

And this content model can be translated back to XSD:

<xs:complexType name="Document">
 <xs:choice>
  <xs:sequence>
   <xs:element ref="extract"/>
   <xs:choice>
    <xs:element ref="structure" maxOccurs="unbounded"/>
    <xs:element ref="extract" minOccurs="0" maxOccurs="unbounded"/>
   </xs:choice>
  </xs:sequence>
  <xs:element ref="structure" maxOccurs="unbounded"/>
 </xs:choice>
</xs:complexType>
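
To try the rewritten model against the sample documents from the question without a schema validator to hand, something like the following may help. It is a rough stand-in, not real XSD validation: it assumes only the order of the child element names matters, and MODEL, CODES and children_ok are hypothetical names used just for this sketch.

```python
import re
import xml.etree.ElementTree as ET

# Regex form of the rewritten model: extract,(structure+|extract*)|structure+
MODEL = re.compile(r"e(s+|e*)|s+")
CODES = {"extract": "e", "structure": "s"}

def children_ok(xml_text):
    """Check the order of <document>'s children against MODEL."""
    root = ET.fromstring(xml_text)
    # Any unexpected child name maps to '?', which MODEL never matches.
    shape = "".join(CODES.get(child.tag, "?") for child in root)
    return root.tag == "document" and MODEL.fullmatch(shape) is not None

# The three shapes from the question, plus one that should be rejected.
print(children_ok("<document><extract/><structure/><structure/></document>"))
print(children_ok("<document><structure/><structure/></document>"))
print(children_ok("<document><extract/><extract/></document>"))
print(children_ok("<document><extract/><extract/><structure/></document>"))
```

The first three calls should report the document as acceptable and the last should not, since two extract elements followed by a structure isn't one of the allowed shapes.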
araqnid
You're right - there's a typo in the example. Editing that.
Nic Gibson
That's exactly it - I knew I'd been looking at this for too long. Thank you.
Nic Gibson