I'm trying to wrap my head around XML schemas, and one thing I'm trying to figure out is how to do relational-type schemas where one element refers to another, possibly in another schema altogether. I've looked at xsd:key and xsd:keyref and they seem like the sort of thing I'm interested in, but I'm not sure. Initially I just set attributes with the types xs:ID and xs:IDREF, which, as far as I could tell, don't necessarily refer to a specific element.

Basically, I have several different XML files where elements refer to other elements, either in the same file or in other files. It looks a lot like a relational database and I would love to use one, but the requirement is to only use XML files, so I'm at least trying to establish some sanity instead of just having seemingly random strings that rely on XML comments to define the relationships. That works for smaller projects, but it's certainly not scalable.

Any thoughts?

A: 

If I remember correctly, xs:ID has to be globally unique within the whole document, while xs:key only has to be unique within the scope of the element on which it is defined. So key/keyref is actually more like PK/FK: a PK only has to be unique within one table.
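
As a rough illustration of the PK/FK analogy, here is a minimal schema sketch. The element and attribute names (catalog, item, order, code) are made up for this example and do not come from the question; only the key/keyref pattern matters.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical names throughout; this is a sketch, not the asker's schema. -->
  <xs:element name="catalog">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="code" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
        <xs:element name="order" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="item" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <!-- "Primary key": item/@code must be present and unique within this catalog. -->
    <xs:key name="itemKey">
      <xs:selector xpath="item"/>
      <xs:field xpath="@code"/>
    </xs:key>
    <!-- "Foreign key": every order/@item must match some item/@code. -->
    <xs:keyref name="orderItemRef" refer="itemKey">
      <xs:selector xpath="order"/>
      <xs:field xpath="@item"/>
    </xs:keyref>
  </xs:element>
</xs:schema>
```

Key values are compared only among the nodes the selector picks, so a different kind of element could reuse the same value without a clash, which is not the case for xs:ID.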

vartec
Well, that sounds like what I want then. Is it possible to create several schemas that work together to validate several XML files as a whole? I.e., if I have a number of elements in one document that have keys, can I reference them from another document and still have it validate?
macke
I think this answers your question: "refer=QName; Required. Specifies the name of a key or unique element defined in this _or another schema_".
vartec
A: 

I'm not aware of anything within XML Schema that will allow you to validate multiple XML documents against one another. In the xs:key, xs:keyref, and xs:unique constraints, you use XPath to select the nodes the constraint applies to, and those paths can only reach nodes inside the document being validated. You can go to XML Schema Part 1: Structures and scroll down a little bit for the example to see these constraints in action.

If you have the ability to define a meta-XML file that includes your others (perhaps by entity references if by no other way) and then use a schema for that meta file, then you should be able to use XML Schema to apply your constraints. If you define a schema for each of your XML file types, you should be able to trivially (by xs:import or xs:include) define a meta-schema for an XML file that includes all of your XML content in one XML file. This meta-schema could successfully apply the constraints you want.

Let's say you have to validate a wiki that has many posts. Each post has an author and possibly many comments, and each comment also has an author. You have one XML file for all posts, one for all comments, and one for all authors, and you want to validate constraints across these files: each post uses authors and comments that exist, each comment uses an author that exists, and so on. Let's say you have the following three files:

The file /home/username/posts.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<posts>
  <post>
    <author name="author1"/>
    <comment id="12345" pos="1"/>
    <comment id="12346" pos="2"/>
    <body>I really like my camera...</body>
  </post>
   ...
</posts>

The file /home/username/comments.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<comments>
  <comment id="12345" author="kindguy">
    That was a very good post
  </comment>
   ...
</comments>

The file /home/username/authors.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<authors>
  <author name="kindguy" id="1"/>
  <author name="author1" id="2"/>
   ...
</authors>

What I am suggesting is that you make a meta-XML file by using entity references. The entity declarations go in the internal subset of a DOCTYPE declaration. For example, you could create the following XML file:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE root [
  <!ENTITY postfile    SYSTEM "file:///home/username/posts.xml">
  <!ENTITY commentfile SYSTEM "file:///home/username/comments.xml">
  <!ENTITY authorfile  SYSTEM "file:///home/username/authors.xml">
]>
<root>
  &postfile;
  &commentfile;
  &authorfile;
</root>

This meta-XML file (actually, a plain old XML file ... the "meta" is only from the perspective of your three defined XML files, and not in any XML sense) is the exact equivalent of the following file, and XML parsers that are configured to resolve external entities will act as if you truly had the following file:

<?xml version="1.0" encoding="UTF-8" ?>
<root>
  <posts>
    <post>
      <author name="author1"/>
      <comment id="12345" pos="1"/>
      <comment id="12346" pos="2"/>
      <body>I really like my camera...</body>
    </post>
     ...
  </posts>
  <comments>
    <comment id="12345" author="kindguy">
      That was a very good post
    </comment>
     ...
  </comments>
  <authors>
    <author name="kindguy" id="1"/>
    <author name="author1" id="2"/>
     ...
  </authors>
</root>

From this file, you can define an XML schema that applies the desired constraints, even though there is no way to apply them to the individual files. Because the entity references have "included" all the XML in one document, you can use XPath in the constraint definitions.
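
To make that concrete, here is one way such a schema could look for the combined file. Treat it as a sketch under assumptions rather than a definitive schema: the constraint names (authorKey, commentKey and the three keyrefs), the named types, and every attribute type are my own choices, and it assumes no namespaces are in use.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- The constraints sit on root because it is the only element whose
       subtree spans the content of all three included files. -->
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="posts" type="postsType"/>
        <xs:element name="comments" type="commentsType"/>
        <xs:element name="authors" type="authorsType"/>
      </xs:sequence>
    </xs:complexType>

    <!-- "Primary keys" (names are assumptions) -->
    <xs:key name="authorKey">
      <xs:selector xpath="authors/author"/>
      <xs:field xpath="@name"/>
    </xs:key>
    <xs:key name="commentKey">
      <xs:selector xpath="comments/comment"/>
      <xs:field xpath="@id"/>
    </xs:key>

    <!-- "Foreign keys": each selected value must match a key value above -->
    <xs:keyref name="postAuthorRef" refer="authorKey">
      <xs:selector xpath="posts/post/author"/>
      <xs:field xpath="@name"/>
    </xs:keyref>
    <xs:keyref name="postCommentRef" refer="commentKey">
      <xs:selector xpath="posts/post/comment"/>
      <xs:field xpath="@id"/>
    </xs:keyref>
    <xs:keyref name="commentAuthorRef" refer="authorKey">
      <xs:selector xpath="comments/comment"/>
      <xs:field xpath="@author"/>
    </xs:keyref>
  </xs:element>

  <xs:complexType name="postsType">
    <xs:sequence>
      <xs:element name="post" minOccurs="0" maxOccurs="unbounded">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="author">
              <xs:complexType>
                <xs:attribute name="name" type="xs:string" use="required"/>
              </xs:complexType>
            </xs:element>
            <xs:element name="comment" minOccurs="0" maxOccurs="unbounded">
              <xs:complexType>
                <xs:attribute name="id" type="xs:string" use="required"/>
                <xs:attribute name="pos" type="xs:positiveInteger"/>
              </xs:complexType>
            </xs:element>
            <xs:element name="body" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="commentsType">
    <xs:sequence>
      <xs:element name="comment" minOccurs="0" maxOccurs="unbounded">
        <xs:complexType>
          <!-- Text content plus the id and author attributes -->
          <xs:simpleContent>
            <xs:extension base="xs:string">
              <xs:attribute name="id" type="xs:string" use="required"/>
              <xs:attribute name="author" type="xs:string" use="required"/>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="authorsType">
    <xs:sequence>
      <xs:element name="author" minOccurs="0" maxOccurs="unbounded">
        <xs:complexType>
          <xs:attribute name="name" type="xs:string" use="required"/>
          <xs:attribute name="id" type="xs:string" use="required"/>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

With this in place, a post that names an author missing from authors.xml, or a comment id missing from comments.xml (such as 12346, unless it appears in the elided part of that file), should be reported as a keyref violation. The id attributes are typed xs:string here rather than xs:ID because xs:ID values cannot start with a digit, so values like 12345 would be rejected.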

Eddie
Unfortunately, using several XML documents is a requirement. There is a publishing system in place that overwrites any local changes. There is no way to merge, and therefore it is necessary to minimize the damage done by breaking the data up into several documents. It's unfortunate, but reality =/ Thanks!
macke
But for validation purposes, can't you create an *additional* XML file that uses entity references to include the canonical XML files that you cannot modify? You could apply the constraints within this "temporary" XML file that does nothing but include the other XML files.
Eddie
I think I understand what you mean in an abstract sense, but I think I'd need a concrete example to fully grok it. Will the site you linked to explain this kind of relationship?
macke
I'll edit my post later today to add a small example. The XML schema standard will unfortunately not explain this.
Eddie
Thanks Eddie, I really appreciate this! Excellent mini-tutorial!
macke
Your example will work perfectly for my needs, thanks again Eddie, you're a hero!
macke
I wish I could upvote this answer again, perfect reference material!
macke