How can I specify an XML schema for an instance document like this:

<productinfo>
  <!-- other stuff -->
  <informationset type="Manufacturer">
    <!-- content not relevant -->
  </informationset>
  <informationset type="Ingredients">
    <!-- content not relevant -->
  </informationset>
</productinfo>

that is, a "productinfo" element containing a sequence of two "informationset" children, the first having @type="Manufacturer" and the second having @type="Ingredients"?

Thanks in advance.

A: 

You can, with XML Schema's xsi:type attribute, as in:

<productinfo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <informationset xsi:type="Manufacturer"></informationset>
  <informationset xsi:type="Ingredients"></informationset>
</productinfo>

And the XSD defines separate complex types for each one:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="productinfo">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="informationset" type="Manufacturer"/>
        <xs:element name="informationset" type="Ingredients"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:complexType name="Manufacturer">
  </xs:complexType>
  <xs:complexType name="Ingredients">
  </xs:complexType>
</xs:schema>

This is a special case for xsi:type. In general, I don't think you can require an attribute to have different values in different elements of the same name, because those would be different definitions of the same element.

I'm not 100% clear on the precise reason - anyone know the relevant part of the spec?

13ren
A: 

You could try something like this - create a separate complexType for your "informationSet" elements, and limit the attribute to a list of valid strings:

<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" 
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="productinfo">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" 
                    name="informationset" type="informationSetType" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:complexType name="informationSetType">
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="type" type="validAttributeType" use="required" />
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <xs:simpleType name="validAttributeType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Manufacturer" />
      <xs:enumeration value="Ingredients" />
    </xs:restriction>
  </xs:simpleType>
</xs:schema>

Of course, you can extend that list of valid attribute values if you wish - just add more xs:enumeration elements to the restriction there.
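
To make the behaviour concrete, here is a small instance document that I would expect the schema above to accept. The element text ("flour, water" and "Example Co.") is made up for illustration. Note that the enumeration constrains each "type" value on its own, not the order in which the values appear, so this reversed order should pass as well:

```xml
<productinfo>
  <!-- Each type value is checked against the enumeration individually -->
  <informationset type="Ingredients">flour, water</informationset>
  <informationset type="Manufacturer">Example Co.</informationset>
</productinfo>
```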

Marc

marc_s
A: 

(I'm the guy who asked the question initially. I don't seem to be able to modify it or comment on the other answers since I wasn't registered when I asked it, so posting this as an "answer".)

Marc: The problem is not to restrict the possible attribute values (I already know how to do that), but to require that the first "informationset" element have one specific value for its "type" attribute and that the next one have another value, etc.

13ren: I'm not sure I understand what you're suggesting. It looks like you're putting a reference to the schema in the instance document. If so, I don't see how that will help.

I think the basic problem is as you described: the W3C schema language doesn't allow two elements with the same name in the same scope to have different types (as in your suggested XSD).

Perhaps Relax-NG might be a better choice for validating documents whose sections are differentiated by attribute values as opposed to element names.
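
As a rough sketch of what I have in mind, a RELAX NG schema (XML syntax) for this could look something like the following. The `<text/>` patterns are only placeholders for the real content models, and the "other stuff" in productinfo is left out:

```xml
<element name="productinfo" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- First child: must carry type="Manufacturer" -->
  <element name="informationset">
    <attribute name="type">
      <value>Manufacturer</value>
    </attribute>
    <text/> <!-- placeholder content model -->
  </element>
  <!-- Second child: same element name, but type="Ingredients" -->
  <element name="informationset">
    <attribute name="type">
      <value>Ingredients</value>
    </attribute>
    <text/> <!-- placeholder content model -->
  </element>
</element>
```

Since RELAX NG treats attributes as part of the pattern, two elements with the same name can have different patterns in the same scope.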

Does anybody else think so?

It's XML Schema's version of polymorphism, and it allows the same element to have different types. The XML Schema Primer covers this in detail. Here's a section of it that uses xsi:type as in my answer: http://www.w3.org/TR/2000/WD-xmlschema-0-20000407/#UseDerivInInstDocs
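
A minimal sketch of that derivation-based approach, assuming a hypothetical abstract base type that I've called InformationSetBase (the name is not from the Primer or the question). The element is declared once with the base type, and each instance element picks a derived type through xsi:type:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="productinfo">
    <xs:complexType>
      <xs:sequence>
        <!-- One declaration with one declared type; xsi:type selects the derived type -->
        <xs:element name="informationset" type="InformationSetBase"
                    minOccurs="2" maxOccurs="2"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Hypothetical base type; abstract, so instances must supply xsi:type -->
  <xs:complexType name="InformationSetBase" abstract="true"/>

  <xs:complexType name="Manufacturer">
    <xs:complexContent>
      <xs:extension base="InformationSetBase"/>
    </xs:complexContent>
  </xs:complexType>

  <xs:complexType name="Ingredients">
    <xs:complexContent>
      <xs:extension base="InformationSetBase"/>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```

As far as I can tell, this lets each informationset use a different type, but it does not by itself force the first one to be Manufacturer and the second to be Ingredients.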
13ren