A good XSLT solution will map your human-readable rules to simple template rules. Here are the rules, in your words:
- <SmallWidget> in specA means the same thing as <Atom> in specB, so just rename the element.
- <Widgets> in specA means the same thing as <Molecule> in specB, so just rename the element.
- Wrap <Atom> and <Molecule> in an element named <Widgets>, which means something different from specA's <Widgets>.
- Everything else gets copied as is, but in the new namespace.
Let's give it a go:
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:in="http://widgetspecA.com/ns"
  xmlns="http://widgetspecB.com/ns"
  exclude-result-prefixes="in">

  <!-- 1. Rename <SmallWidget> -->
  <xsl:template mode="rename" match="in:SmallWidget">Atom</xsl:template>

  <!-- 2. Rename <Widgets> -->
  <xsl:template mode="rename" match="in:Widgets">Molecule</xsl:template>

  <!-- 3. Wrap <Atom> & <Molecule> with <Widgets> -->
  <xsl:template match="in:SmallWidget">
    <!-- ASSUMPTION: in:Widgets immediately follows in:SmallWidget -->
    <Widgets>
      <xsl:apply-templates mode="convert" select="."/>
      <xsl:apply-templates mode="convert" select="following-sibling::in:Widgets"/>
    </Widgets>
  </xsl:template>

  <!-- Skip this in regular processing;
       it gets explicitly converted inside <Widgets> (see above) -->
  <xsl:template match="in:Widgets"/>

  <!-- Also, don't copy whitespace appearing
       immediately before in:Widgets -->
  <xsl:template match="text()[following-sibling::node()[1][self::in:Widgets]]"/>

  <!-- 4. Everything else copied as is, but in the new namespace -->

  <!-- Copy non-element nodes as is -->
  <xsl:template match="@* | text() | comment() | processing-instruction()">
    <xsl:copy/>
  </xsl:template>

  <!-- By default, just convert elements to new namespace
       (exceptions under #3 above) -->
  <xsl:template match="*">
    <xsl:apply-templates mode="convert" select="."/>
  </xsl:template>

  <xsl:template mode="convert" match="*">
    <!-- Optionally rename the element -->
    <xsl:variable name="name">
      <xsl:apply-templates mode="rename" select="."/>
    </xsl:variable>
    <xsl:element name="{$name}">
      <xsl:apply-templates select="@* | node()"/>
    </xsl:element>
  </xsl:template>

  <!-- By default, just use the same local
       name as in the input document -->
  <xsl:template mode="rename" match="*">
    <xsl:value-of select="local-name()"/>
  </xsl:template>

</xsl:stylesheet>
Note that it's important that you use the local-name() function and not the name() function. If you use name(), your stylesheet will break if your input document starts using a namespace prefix that isn't explicitly declared in your stylesheet (unless you add the namespace attribute to <xsl:element> to enforce the namespace even when a prefix appears). However, if we use local-name(), we're safe; it won't ever include the prefix, so the result element will adopt our stylesheet's default namespace.
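For completeness, here is a minimal sketch of the other option mentioned above: keeping name() but pinning the result namespace with the namespace attribute on <xsl:element>. It is a stripped-down illustration rather than a replacement for the stylesheet above; the attribute copying and the bare apply-templates are only there to make the sketch self-contained.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Rebuild every element in the specB namespace.
       Because the namespace attribute is given explicitly, the prefix
       returned by name() does not need to be declared in this stylesheet. -->
  <xsl:template match="*">
    <xsl:element name="{name()}" namespace="http://widgetspecB.com/ns">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

The trade-off is that a processor is allowed to reuse the input's prefix on the result element, so the output may come out prefixed. With local-name() and a default namespace in the stylesheet, the result elements are simply unprefixed.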
Running the above stylesheet against your sample input document yields exactly what you requested:
<Root xmlns="http://widgetspecB.com/ns">...any...<WidgetBox>...any...
<Widgets><Atom>
...any...
</Atom><Molecule>
...any...
</Molecule></Widgets>...any...
</WidgetBox>...any...</Root>
Let me know if you have any questions. Ain't XSLT powerful!
P.S. If I wanted to replicate the whitespace in your example exactly, I could have used step-wise, "chain" processing, where I apply templates to just one node at a time and each template rule is responsible for continuing processing with its following sibling. But that seemed like overkill for this situation.
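In case it helps to picture it, here is a rough skeleton of what that step-wise processing could look like. This is only a sketch under my own assumptions: the mode name "chain" is arbitrary, and it covers just the generic "copy into the new namespace" case. The wrapping rule for in:SmallWidget is left out; it would be one more rule in the same mode that decides where the chain resumes.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:in="http://widgetspecA.com/ns"
  xmlns="http://widgetspecB.com/ns"
  exclude-result-prefixes="in">

  <!-- Start the chain at the first child of the document node -->
  <xsl:template match="/">
    <xsl:apply-templates mode="chain" select="node()[1]"/>
  </xsl:template>

  <!-- Element step: rebuild the element in the new namespace,
       start a new chain at its first child, then hand over
       to the next sibling -->
  <xsl:template mode="chain" match="*">
    <xsl:element name="{local-name()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates mode="chain" select="node()[1]"/>
    </xsl:element>
    <xsl:apply-templates mode="chain" select="following-sibling::node()[1]"/>
  </xsl:template>

  <!-- Non-element step: copy the node, then hand over to the next sibling -->
  <xsl:template mode="chain" match="text() | comment() | processing-instruction()">
    <xsl:copy/>
    <xsl:apply-templates mode="chain" select="following-sibling::node()[1]"/>
  </xsl:template>

</xsl:stylesheet>
```

Because every rule explicitly chooses the next node to visit, a special-case rule can skip, reorder, or keep individual whitespace nodes, which is what makes exact whitespace control possible. The cost is that each rule has to remember to continue the chain.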
UPDATE:
The new solution you posted is very reasonable, though it can be simplified a bit. I've taken your new solution and made some recommended changes below, with comments indicating what I changed and why.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:old="http://widgetspecA.com/ns"
  xmlns="http://widgetspecB.com/ns"
  exclude-result-prefixes="old">

  <!-- "xml" is the default; no real need for this
  <xsl:output method="xml"/>
  -->

  <!-- This works fine if you only want to copy elements, attributes,
       and text. Just be aware that comments and PIs will get
       effectively stripped out, because the default template rule
       for those is to do nothing.
  -->
  <xsl:template match="*">
    <!-- local-name() rather than name(), so that a prefixed
         input element can't break xsl:element (see the note above) -->
    <xsl:element name="{local-name()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="old:SmallWidget" mode="single">
    <Atom>
      <xsl:apply-templates/>
    </Atom>
  </xsl:template>

  <xsl:template match="old:Widgets" mode="single">
    <Molecule>
      <xsl:apply-templates/>
    </Molecule>
  </xsl:template>

  <!-- You actually only need one rule for <old:SmallWidget>.
       Why? Because the behavior of this rule will always
       be exactly the same as the behavior of the other rule
       you supplied below.
  -->
  <xsl:template match="old:SmallWidget"> <!--[following-sibling::old:Widgets]">-->
    <Widgets>
      <!-- "." means exactly the same thing as "self::node()" -->
      <xsl:apply-templates select="." mode="single"/>
      <!-- If the node-set is empty, then this will be a no-op anyway,
           so it's safe to have it here even for the case when
           <old:Widgets> is not present in the source tree. -->
      <!-- This XPath expression ensures
           that you only process the next
           sibling element, and then only
           if its name is <old:Widgets>.
           Your schema might not allow it,
           but this is a clearer communication
           of your intention, and it will also
           work correctly if another
           old:SmallWidget/old:Widgets pair
           appeared later in the document.
      -->
      <xsl:apply-templates select="following-sibling::*[1][self::old:Widgets]"
                           mode="single"/>
    </Widgets>
  </xsl:template>

  <!-- Updated this predicate for the
       same reason as above. Answers the
       question: Is the element right before
       this one a SmallWidget? (as opposed to:
       Are there any SmallWidget elements
       before this one?) -->
  <xsl:template match="old:Widgets[preceding-sibling::*[1][self::old:SmallWidget]]"/>

  <!-- Removed, because this rule effectively has the same behavior as the other one above
  <xsl:template match="old:SmallWidget[not(following-sibling::old:Widgets)]">
    <Widgets>
      <xsl:apply-templates select="self::node()" mode="single"/>
    </Widgets>
  </xsl:template>
  -->

  <!-- No need for the predicate. The format of this pattern (just a name)
       causes this template rule's priority to be 0. Your other rule
       for <old:Widgets> above has priority of .5, which means that it
       will override this one automatically. You don't need to repeat
       the constraint. Alternatively, you could keep this predicate
       and remove the other one. Either way it will work. (It's probably
       a good idea to place these rules next to each other though,
       so you can read it like an if/else statement) -->
  <xsl:template match="old:Widgets"> <!--[not(preceding-sibling::*[1][self::old:SmallWidget])]">-->
    <Widgets>
      <xsl:apply-templates select="." mode="single"/>
    </Widgets>
  </xsl:template>

</xsl:stylesheet>
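If you do want comments and processing instructions to survive (see the caveat in the comment on the match="*" rule above), one extra rule is enough. A minimal sketch, shown as a small standalone stylesheet so it can be tried on its own; in practice you would just add the second rule to the stylesheet above:

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns="http://widgetspecB.com/ns">

  <!-- Elements: rebuild in the new namespace, keep attributes -->
  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <!-- Comments and PIs: copy through unchanged instead of
       falling into the built-in rule, which outputs nothing for them -->
  <xsl:template match="comment() | processing-instruction()">
    <xsl:copy/>
  </xsl:template>

</xsl:stylesheet>
```

Text nodes need no rule of their own here, since the built-in rule for text already copies them to the output.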