I'm writing an XSLT 1.0 stylesheet to transform multi-namespace XML documents to HTML. Somewhere in the resulting HTML I want to list all the namespaces that occurred in the document.

Is this possible?

I thought about something like

<xsl:for-each select="//*|//@*">
  <xsl:value-of select="namespace-uri(.)" />
</xsl:for-each>

but of course I'd get gazillions of duplicates, so I'd somehow have to filter out what I have already printed.

Recursively calling templates would work, but I can't wrap my head around how to reach all elements.

Accessing //@xmlns:* directly doesn't work, because one can't access this via XPath (one isn't allowed to bind any prefix to the xmlns: namespace).

+4  A: 

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>
 <xsl:template match="/">
   <xsl:for-each select=
    "//namespace::*[not(. = ../../namespace::*)]">
     <xsl:value-of select="concat(.,'&#xA;')"/>
   </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>

when applied to this XML document:

<authors xmlns:user="mynamespace">
  <?ttt This is a PI ?>
  <author xmlns:user2="mynamespace2">
    <name idd="VH">Victor Hugo</name>
    <user2:name idd="VH">Victor Hugo</user2:name>
    <nationality xmlns:user3="mynamespace3">French</nationality>
  </author>
</authors>

produces the wanted, correct result:

http://www.w3.org/XML/1998/namespace
mynamespace
mynamespace2
mynamespace3

Update:

As @svick has commented, the above solution can still produce duplicate namespaces, for example with the following XML document:

<authors xmlns:user="mynamespace">
  <?ttt This is a PI ?>
  <author xmlns:user2="mynamespace2">
    <name idd="VH">Victor Hugo</name>
    <user2:name idd="VH">Victor Hugo</user2:name>
    <nationality xmlns:user3="mynamespace3">French</nationality>
  </author>
  <t xmlns:user2="mynamespace2"/>
</authors>

the namespace "mynamespace2" will be produced twice in the output.

The following transformation fixes this issue:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:ext="http://exslt.org/common"
 exclude-result-prefixes="ext">
 <xsl:output method="text"/>

 <xsl:key name="kNSbyURI" match="n" use="."/>

 <xsl:template match="/">
   <xsl:variable name="vrtfNS">
       <xsl:for-each select=
        "//namespace::*[not(. = ../../namespace::*)]">
         <n><xsl:value-of select="."/></n>
       </xsl:for-each>
   </xsl:variable>

   <xsl:variable name="vNS" select="ext:node-set($vrtfNS)/*"/>

   <xsl:for-each select=
    "$vNS[generate-id()
         =
          generate-id(key('kNSbyURI',.)[1])
         ]">
     <xsl:value-of select="concat(., '&#xA;')"/>
   </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the above XML document, it outputs each unique namespace in the document exactly once:

http://www.w3.org/XML/1998/namespace
mynamespace
mynamespace2
mynamespace3

Part II: An XSLT 2.0 solution.

The XSLT 2.0 solution is a simple XPath 2.0 one-liner:

distinct-values(//namespace::*)
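
As a minimal sketch, the one-liner could be wrapped in a complete stylesheet like this, assuming an XSLT 2.0 processor that supports the namespace axis. The `separator` attribute puts each URI on its own line; the order of the values returned by `distinct-values()` is implementation-dependent.

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>
 <xsl:template match="/">
   <!-- distinct-values() atomizes the namespace nodes to their URI strings
        and drops the repeats; separator joins them with newlines -->
   <xsl:value-of select="distinct-values(//namespace::*)"
                 separator="&#xA;"/>
 </xsl:template>
</xsl:stylesheet>
```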
Dimitre Novatchev
This doesn't work correctly when one namespace is defined on two places: `<root><node xmlns="ns" /><node xmlns="ns" /></root>`.
svick
@svick: Good catch! I have fixed this issue now.
Dimitre Novatchev
Thanks for the answer! I was looking for a non-EXSLT solution, since I need the template to be executable across engines. But if I rewrite it in XSLT 2.0 sometime in the future, I'll come back for the node-set solution.
Boldewyn
Thanks. I hadn't realized that it would be that easy in XSLT 2.0.
Boldewyn
+3  A: 

Another solution, without extension functions:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="*">
        <xsl:param name="pNamespaces" select="'&#xA;'"/>
        <xsl:variable name="vNamespaces">
            <xsl:variable name="vMyNamespaces">
                <xsl:value-of select="$pNamespaces"/>
                <xsl:for-each select="namespace::*
                                        [not(contains(
                                                 $pNamespaces,
                                                 concat('&#xA;',.,'&#xA;')))]">
                    <xsl:value-of select="concat(.,'&#xA;')"/>
                </xsl:for-each>
            </xsl:variable>
            <xsl:variable name="vChildsNamespaces">
                <xsl:apply-templates select="*[1]">
                    <xsl:with-param name="pNamespaces"
                                        select="$vMyNamespaces"/>
                </xsl:apply-templates>
            </xsl:variable>
            <xsl:value-of select="concat(substring($vMyNamespaces,
                                                   1 div not(*)),
                                         substring($vChildsNamespaces,
                                                   1 div boolean(*)))"/>
        </xsl:variable>
        <xsl:variable name="vFollowNamespaces">
            <xsl:apply-templates select="following-sibling::*[1]">
                <xsl:with-param name="pNamespaces" select="$vNamespaces"/>
            </xsl:apply-templates>
        </xsl:variable>
        <xsl:value-of
             select="concat(substring($vNamespaces,
                                      1 div not(following-sibling::*)),
                            substring($vFollowNamespaces,
                                      1 div boolean(following-sibling::*)))"/>
    </xsl:template>
</xsl:stylesheet>

Output (with Dimitre's input sample):

http://www.w3.org/XML/1998/namespace
mynamespace
mynamespace2
mynamespace3

EDIT: Also this XPath expression:

//*/namespace::*[not(. = ../../namespace::*|preceding::*/namespace::*)]

As proof, this stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:template match="/">
        <xsl:for-each select="//*/namespace::*
                                     [not(. = ../../namespace::*|
                                              preceding::*/namespace::*)]">
            <xsl:value-of select="concat(.,'&#xA;')"/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

Output:

http://www.w3.org/XML/1998/namespace
mynamespace
mynamespace2
mynamespace3

EDIT 4: As efficient as the two-pass transformation.

This stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:key name="kElemByNSURI"
             match="*[namespace::*[not(. = ../../namespace::*)]]"
              use="namespace::*[not(. = ../../namespace::*)]"/>
    <xsl:template match="/">
        <xsl:for-each select=
            "//namespace::*[not(. = ../../namespace::*)]
                           [count(..|key('kElemByNSURI',.)[1])=1]">
            <xsl:value-of select="concat(.,'&#xA;')"/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

Output:

http://www.w3.org/XML/1998/namespace
mynamespace
mynamespace2
mynamespace3
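
Since the question asks for HTML output, here is a hedged sketch of the same key-based grouping that emits a list instead of plain text. The `ul`/`li` markup is only an assumption about the desired page structure; the key and the select expression are unchanged from the stylesheet above.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="html"/>
    <!-- Index each element by the namespace URIs it introduces,
         i.e. those not already in scope on its parent -->
    <xsl:key name="kElemByNSURI"
             match="*[namespace::*[not(. = ../../namespace::*)]]"
              use="namespace::*[not(. = ../../namespace::*)]"/>
    <xsl:template match="/">
        <!-- Assumed markup: one list item per distinct namespace URI -->
        <ul>
            <xsl:for-each select=
                "//namespace::*[not(. = ../../namespace::*)]
                               [count(..|key('kElemByNSURI',.)[1])=1]">
                <li><xsl:value-of select="."/></li>
            </xsl:for-each>
        </ul>
    </xsl:template>
</xsl:stylesheet>
```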

EDIT 5: When you are dealing with an XSLT processor that does not implement the namespace axis (like TransforMiiX), you can extract only the namespaces that are actually used, with this stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:key name="kElemByNSURI" match="*|@*" use="namespace-uri()"/>
    <xsl:template match="/">
        <xsl:for-each select=
            "(//*|//@*)[namespace-uri()!='']
                       [count(.|key('kElemByNSURI',namespace-uri())[1])=1]">
            <xsl:value-of select="concat(namespace-uri(),'&#xA;')"/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

TransforMiiX output:

mynamespace2
Alejandro
@Alejandro: +1 for the correct answer. We both know that not using `xxx:node-set()` results in a less efficient solution :)
Dimitre Novatchev
+1 for a solution that avoids extensions. I was working on one that used a string to accumulate namespace URIs, but yours is probably better.
LarsH
Thanks for this solution! I missed the `namespace::` axis, and I *always* forget about `<xsl:key/>` (especially when its use would be really valuable).
Boldewyn
@Boldewyn: You are welcome! I'm glad it helped you.
Alejandro
@Alejandro: Good solutions, but checking set membership with count is still quite inefficient. In case someone needs the most efficient solution (and for finding unique namespaces this is a must, because there are too many namespace nodes in a document), the one using two passes is the answer.
Dimitre Novatchev
@Dimitre: Take a closer look at my last edit. I'm just grouping over the same nodes you output for the second pass.
Alejandro
@Alejandro: I was objecting to the `count(..|key('kElemByNSURI',.)[1])=1`, but OK, it involves just two nodes.
Dimitre Novatchev
Grrr. I tried to find out why the solution doesn't work in Firefox. I wasted an hour before I discovered that TransforMiiX doesn't implement the namespace axis. OK, it's server-side, then.
Boldewyn
@Boldewyn: Check my edit for a partial workaround.
Alejandro
Cool, thanks. I'd give you another +1 if I didn't have to set up a fake account for it ;-)
Boldewyn
@Boldewyn: Thanks! Ask any time.
Alejandro