Is there a built-in schema datatype for XHTML data? Suppose I want to specify a "boozle" element that contains two "woozles", each of which is arbitrary XHTML. I want to write something like this, using the RELAX NG compact syntax:

namespace nifty = "http://brinckerhoff.org/nifty/"

start = element nifty:boozle {woozle, woozle}

woozle = element nifty:woozle {xhtml}

Unfortunately, xmllint then signals this error:

./lab.rng:43: element ref: Relax-NG parser error : Reference xhtml has no matching definition
./lab.rng:43: element ref: Relax-NG parser error : Internal found no define for ref xhtml

So my question is this: is there something sensible that I should put in place of "xhtml" above?

A: 

Your woozles and boozles are in your namespace, while the XHTML elements are in the XHTML namespace. A schema validates a namespace: your schema validates your namespace and the XHTML schema validates the XHTML namespace. You can restrict an element to contain XHTML by mandating that all its child elements are in the XHTML namespace, but your schema should not be validating the XHTML namespace itself.

You can use the xhtml schema to validate any xhtml namespace nodes in your document. You add this schema to your processing pipeline, that is, a second validation step.

mdma
We're not *quite* connecting here. If I understand you correctly, you're suggesting that it's not possible to specify a RELAX NG schema that validates both the elements of the nifty namespace and those of the xhtml namespace, and that strikes me as unfortunate but plausible. Are you also suggesting, though, that there's *nothing* I can put in place of the "xhtml" above, and that I must fall back to checking the schema by manually writing validation code? That would be unfortunate.
John Clements
The validation is in two parts: in the nifty schema, you say that woozle children are in the xhtml namespace. You then also use the xhtml schema (http://www.w3.org/TR/xhtml1-schema/) to validate xhtml nodes. I imagine the simplest way to do this is to use an XML pipeline with two schema validation passes: one for the nifty RELAX NG schema and one for the XHTML W3C schema.
mdma
Urg... so close. To return to my original question: *what* do I put in the Relax NG schema to indicate that the woozle children are in the xhtml namespace? I just tried several reasonable guesses, with no success. In particular, what I believe I really need here is a reference to a datatype, a la xsd.
John Clements
Okay, I'm edging away from agreement here. Specifically, I think I now believe that you're mistaken when you suggest that a schema validates a namespace. In particular, it appears to me that schemas and namespaces are essentially orthogonal.
John Clements
A: 

Ahhh..... okay, more quality time with the Relax NG documentation suggests two possible solutions to this problem.

1) Use name classes to specify an "anyElement" that matches everything, like this:

anyElement =
  element * {
    (attribute * { text }
     | text
     | anyElement)*
  }

This is moderately horrible, because it simply disables checking for these elements. With this definition, though, I could put "anyElement" in place of "xhtml", above.
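A somewhat narrower variant of the same idea, as a sketch only: restrict the wildcard to the XHTML namespace with a namespace name class (`xhtml:*`) instead of `*`. This still does no structural checking of the XHTML, but it does reject elements from any other namespace inside a woozle. The pattern names `xhtmlContent` and `anyXhtmlElement` are made up for illustration; the namespace URI is the standard XHTML one.

```rnc
namespace nifty = "http://brinckerhoff.org/nifty/"
namespace xhtml = "http://www.w3.org/1999/xhtml"

start = element nifty:boozle { woozle, woozle }

woozle = element nifty:woozle { xhtmlContent }

# Hypothetical helper: text mixed with any number of XHTML-namespace elements
xhtmlContent = (text | anyXhtmlElement)*

# Hypothetical helper: any element whose name is in the XHTML namespace,
# with arbitrary attributes and, recursively, the same kind of content
anyXhtmlElement =
  element xhtml:* {
    attribute * { text }*,
    xhtmlContent
  }
```

With this in place, `xhtmlContent` plays the role of "xhtml" in the original schema.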

2) It appears to me that a better solution would involve using Relax NG's "include" directive to include a full specification of xhtml, assuming I could find one.
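A minimal sketch of what option 2 might look like, under stated assumptions: the file name `xhtml.rnc` is a placeholder for whatever RELAX NG schema for XHTML you obtain, and `Flow.model` is assumed to be a pattern defined by that schema for "block or inline content". Both need to be checked against the actual file.

```rnc
namespace nifty = "http://brinckerhoff.org/nifty/"

# "xhtml.rnc" is a placeholder path, not a file known to exist on your system.
# A definition inside the braces replaces the included schema's own `start`,
# so the document root becomes nifty:boozle rather than html. If the included
# schema defines no `start`, move this definition outside the braces.
include "xhtml.rnc" {
  start = element nifty:boozle { woozle, woozle }
}

# Flow.model is an ASSUMED pattern name from the included schema;
# substitute whichever pattern your XHTML schema uses for general content.
woozle = element nifty:woozle { Flow.model }
```

The trade-off against option 1 is that the woozle contents are actually validated as XHTML, at the cost of depending on the pattern names of an external schema.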

John Clements
Is it considered very bad form to give myself credit for the answer here?
John Clements
A better solution is to use an XML pipeline with several schema validators, one for your namespace and one for the xhtml namespace. I link to the xhtml schema in my answer comments.
mdma
This may come across as ignorance or obstinacy, but it's not clear to me how to translate your suggestion into a declarative solution. Put simply: are you suggesting a way of writing a Relax NG specification in such a way that I can use jing or xmllint or another similar tool to validate my XML, or are you suggesting something more complex? BTW: Many thanks for your time.
John Clements