Hi,

I'm trying to use XSLT to transform a document by tagging a group of XML nodes with integer ids, starting at 0, and increasing by one for each node in the group. The XML passed into the stylesheet should be echoed out, but augmented to include this extra information.

Just to be clear about what I am talking about, here is how this transformation would be expressed using DOM:

var states = document.getElementsByTagName("state");
for (var i = 0; i < states.length; i++) {
    // number each state element by its index in document order
    states[i].stateNum = i;
}

This is very simple with DOM, but I'm having much more trouble doing this with XSLT. The current strategy I've devised has been to start with the identity transformation, then create a global variable which selects and stores all of the nodes that I wish to number. I then create a template that matches that kind of node. The idea, then, is that in the template, I would look up the matched node's position in the global variable nodelist, which would give me a unique number that I could then set as an attribute.

The problem with this approach is that the position function can only be used with the context node, so something like the following is illegal:

<template match="state">
    <variable name="stateId" select="@id"/>
    <variable name="uniqueStateNum" select="$globalVariable[@id = $stateId]/position()"/>
</template>

The same is true for the following:

<template match="state">
    <variable name="stateId" select="@id"/>
    <variable name="stateNum" select="position($globalVariable[@id = $stateId])"/>
</template>

In order to use position() to look up the position of an element in $globalVariable, the context node must be changed.

I have found a solution, but it is highly suboptimal. In the template, I use for-each to iterate through the global variable. For-each changes the context node, which lets me use position() in the way I described. The problem is that this turns what would normally be an O(n) operation into an O(n^2) operation, where n is the length of the nodelist, because it requires iterating through the whole list every time the template is matched. I think that there must be a more elegant solution.

Altogether, here is my current (slightly simplified) XSLT stylesheet:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:s="http://www.w3.org/2005/07/scxml"
    xmlns="http://www.w3.org/2005/07/scxml"
    xmlns:c="http://msdl.cs.mcgill.ca/"
    version="1.0">
    <xsl:output method="xml"/>

    <!-- we copy them, so that we can use their positions as identifiers -->
    <xsl:variable name="states" select="//s:state" />


    <!-- identity transform -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="s:state">

        <xsl:variable name="stateId">
            <xsl:value-of select="@id"/>
        </xsl:variable>

        <xsl:copy>
            <xsl:apply-templates select="@*"/>

            <xsl:for-each select="$states">
                <xsl:if test="@id = $stateId">
                    <xsl:attribute name="stateNum" namespace="http://msdl.cs.mcgill.ca/">
                        <xsl:value-of select="position()"/>
                    </xsl:attribute>
                </xsl:if>
            </xsl:for-each>

            <xsl:apply-templates select="node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

I'd appreciate any advice anyone can offer. Thanks.

A: 

Simplest approach:

<xsl:template match="s:state">
  <xsl:copy>
    <xsl:apply-templates select="@*"/>
    <xsl:attribute name="stateNum" namespace="http://msdl.cs.mcgill.ca/">
      <xsl:value-of select="count(preceding::s:state)" />
    </xsl:attribute>
    <xsl:apply-templates select="node()"/>
  </xsl:copy>
</xsl:template>

Not sure how your XSLT processor handles the preceding axis, so this is something to benchmark in any case.

Tomalak
Tomalak, based on my understanding of the preceding axis, this seemed like a reasonable approach. Unfortunately, however, it doesn't seem to work. Please see the following XML document:

<scxml xmlns="http://www.w3.org/2005/07/scxml">
    <state id="Compound1">
        <state id="Basic1"/>
        <state id="Basic2"/>
        <state id="Basic3"/>
    </state>
</scxml>

This will create the following state numbers:

stateId: Compound1, stateNum: 0; stateId: Basic1, stateNum: 0; stateId: Basic2, stateNum: 1; stateId: Basic3, stateNum: 2

Checked with Xalan, 4xslt and xsltproc. Any ideas?
echo-flow
Hm. I see what you mean. Try adding `count(ancestor::s:state)` and `count(preceding::s:state)`, but it could be that this is wrong as well. No chance to test right now, I'm on my mobile phone currently. ;-)
Tomalak
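
A minimal sketch of the adjustment suggested in the comment above, assuming "adding" means summing the two counts. The preceding axis excludes ancestors, which is why the nested states in the sample document restart at 0; counting the `s:state` ancestors separately accounts for them. The identity template and the `stateNum` attribute are taken from the question's stylesheet. This is an illustration, not something the commenters tested.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:s="http://www.w3.org/2005/07/scxml">
  <xsl:output method="xml"/>

  <!-- identity transform: copy everything not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="s:state">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- states that start before this one = state ancestors + preceding states -->
      <xsl:attribute name="stateNum" namespace="http://msdl.cs.mcgill.ca/">
        <xsl:value-of select="count(ancestor::s:state) + count(preceding::s:state)"/>
      </xsl:attribute>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

For the four-state sample document in the comment above, the sum evaluates to 0 for Compound1 and to 1, 2 and 3 for Basic1, Basic2 and Basic3. Whether a given processor evaluates these axes efficiently is still something to benchmark.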
+2  A: 

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:s="http://www.w3.org/2005/07/scxml"
 >
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*" name="identity">
  <xsl:copy>
    <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="s:state">
  <xsl:variable name="vNum">
   <xsl:number level="any" count="s:state"/>
  </xsl:variable>

  <xsl:copy>
   <xsl:copy-of select="@*"/>

   <xsl:attribute name="stateId">
    <xsl:value-of select="@id"/>
   </xsl:attribute>

   <xsl:attribute name="id">
     <xsl:value-of select="$vNum - 1"/>
   </xsl:attribute>

   <xsl:apply-templates/>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

when applied on the provided XML document:

<scxml xmlns="http://www.w3.org/2005/07/scxml">
    <state id="Compound1">
        <state id="Basic1"/>
        <state id="Basic2"/>
        <state id="Basic3"/>
    </state>
</scxml>

produces the wanted, correct output:

<scxml xmlns="http://www.w3.org/2005/07/scxml">
    <state stateId="Compound1" id="0">
        <state stateId="Basic1" id="1"/>
        <state stateId="Basic2" id="2"/>
        <state stateId="Basic3" id="3"/>
    </state>
</scxml>
Dimitre Novatchev
Works perfectly, thanks!
echo-flow
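
For reference, a hedged variant of the accepted approach that stays closer to the question: it leaves the original `id` attribute untouched and writes the number into a `stateNum` attribute in the `http://msdl.cs.mcgill.ca/` namespace, as the question's stylesheet does. Combining the two in this way is an assumption about what the asker ultimately wanted, not something stated in the thread.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:s="http://www.w3.org/2005/07/scxml">
  <xsl:output method="xml"/>

  <!-- identity transform: copy everything not handled below -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="s:state">
    <!-- 1-based number of this state among all s:state elements, in document order -->
    <xsl:variable name="vNum">
      <xsl:number level="any" count="s:state"/>
    </xsl:variable>

    <xsl:copy>
      <!-- keep the existing attributes, including the original id -->
      <xsl:copy-of select="@*"/>
      <!-- subtract 1 so that numbering starts at 0 -->
      <xsl:attribute name="stateNum" namespace="http://msdl.cs.mcgill.ca/">
        <xsl:value-of select="$vNum - 1"/>
      </xsl:attribute>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Because the attribute is created with a `namespace` value and an unprefixed name, the processor chooses the prefix that appears in the output, so the serialized prefix may differ between processors.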