I'm developing an XSLT 1.0 stylesheet (and apply it using xsltproc). One of the templates in my script should perform some special handling for the first <sect1> element in a given parent node and some for the last <sect1> element. Right now this special handling is implemented like this:

<xsl:template match="sect1">
  <xsl:if test="not(preceding-sibling::sect1)">
    <!-- Special handling for first sect1 element goes here. -->
  </xsl:if>
  <!-- Common handling for all sect1 elements goes here. -->
  <xsl:if test="not(following-sibling::sect1)">
    <!-- Special handling for last sect1 element goes here. -->
  </xsl:if>
</xsl:template>

I was wondering (just out of curiosity; the runtime speed of the script is fine for me): is there a more efficient way to do this? Is it likely that the XSLT processor will stop assembling the preceding-sibling::sect1 node-set after the first match, because it knows that it only needs to find out whether there are zero elements or at least one?

A: 

I think you should just be able to do

  <xsl:if test="position() = 1">
    <!-- Special handling for first sect1 element goes here. -->
  </xsl:if>
  <!-- Common handling for all sect1 elements goes here. -->
  <xsl:if test="position() = last()">
    <!-- Special handling for last sect1 element goes here. -->
  </xsl:if>

since position() and last() are context-sensitive.
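
Whether position() = 1 actually picks out the first sect1 depends on how the template is invoked. The following is a minimal sketch, not taken from the original stylesheet: it assumes a hypothetical parent element named chapter and a title child in each sect1, and it applies templates with select="sect1" so that the current node list contains only sect1 elements.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- "chapter" is a hypothetical parent element used for illustration. -->
  <xsl:template match="chapter">
    <!-- Selecting only the sect1 children makes them the current node list,
         so position() and last() count sect1 elements only. -->
    <xsl:apply-templates select="sect1"/>
  </xsl:template>

  <xsl:template match="sect1">
    <xsl:if test="position() = 1">[first] </xsl:if>
    <!-- "title" is an assumed child element of sect1. -->
    <xsl:value-of select="title"/>
    <xsl:if test="position() = last()"> [last]</xsl:if>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With a plain `<xsl:apply-templates/>` instead, the current node list would also include the other children of the parent (including whitespace text nodes unless they are stripped), and the two tests would no longer mean "first sect1" and "last sect1".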

AakashM
Doesn't this have quadratic complexity since position() is (probably?) an `O(n)` function?
Frerich Raabe
I just realized: this is not equivalent: position() returns the position among *all* children. I just want to know whether it's the first `sect1` element, not whether it's the first child element.
Frerich Raabe
@AakashM, it depends on how it's used. I think in a match pattern or just inside a for-each loop, yes, position() and last() are relative to the context list. However, in other places, such as after "/" in an XPath expression, the context is different.
LarsH
+1  A: 

Assuming that the template is invoked for a child-node selection, I offer the suggestions below. If it is invoked via a different axis (say preceding-sibling or ancestor), then the way you approach it is best.

Two possibilities are to simplify the tests, or to replace them with different templates:

Simpler tests:

<xsl:template match="sect1">
  <xsl:if test="position() = 1">
    <!-- Special handling for first sect1 element goes here. -->
  </xsl:if>
  <!-- Common handling for all sect1 elements goes here. -->
  <xsl:if test="position() = last()">
    <!-- Special handling for last sect1 element goes here. -->
  </xsl:if>
</xsl:template>

Different templates:

<xsl:template name="handleSect1">
  <!-- Common handling for all sect1 elements goes here. -->
</xsl:template>
<xsl:template match="sect1">
  <xsl:call-template name="handleSect1"/>
</xsl:template>
<xsl:template match="sect1[1]">
  <!-- Special handling for first sect1 element goes here. -->
  <xsl:call-template name="handleSect1"/>
</xsl:template>
<xsl:template match="sect1[last()]">
  <xsl:call-template name="handleSect1"/>
  <!-- Special handling for last sect1 element goes here. -->
</xsl:template>
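<!-- An explicit priority avoids a conflict with the two templates above when a sect1 is both first and last. -->
<xsl:template match="sect1[position() = 1 and position() = last()]" priority="1">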
  <!-- Special handling for first sect1 element goes here. -->
  <xsl:call-template name="handleSect1"/>
  <!-- Special handling for last sect1 element goes here. -->
</xsl:template>

Since you say "optimise" I assume you care about which will process faster. It will vary according to the XSLT processor, the processing mode (some processors have a "compile" option, which will affect which approach is more efficient) and the input XML. The fastest may be either of these or your original.

Really, every one of these should be as efficient as the others; the difference is in the optimisations that the processor manages to make.

I would favour the first approach in my answer in this case, as it's the most concise, but if there were no common handling shared between all 4 cases, I would favour the second approach, which clearly marks out the different handling for each case.
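
If the template might be invoked with a current node list that contains other nodes, a node-identity test does not depend on that list. The following is a minimal sketch using generate-id(); the title child and the text() suppression template are assumptions added only to make the example self-contained. It is not claimed to be faster than the sibling-axis tests, since ../sect1 is evaluated again for every sect1.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="sect1">
    <!-- generate-id() without arguments refers to the context node. -->
    <xsl:if test="generate-id() = generate-id(../sect1[1])">[first] </xsl:if>
    <!-- "title" is an assumed child element of sect1. -->
    <xsl:value-of select="title"/>
    <xsl:if test="generate-id() = generate-id(../sect1[last()])"> [last]</xsl:if>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Suppress text reached through the built-in templates (illustration only). -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

Because the tests compare node identity rather than position(), they give the same result whether the template is reached via `<xsl:apply-templates/>` or via `<xsl:apply-templates select="sect1"/>`.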

Jon Hanna
Is the first suggestion really correct? Doesn't position() return 1 only if the `sect1` element is the first child element? I want the code to be executed for the first `sect1` element, even if it's not the first child element.
Frerich Raabe
It returns the position in the given context. If the context is such that other nodes could precede it, then the test could be "../sect1[1] = current()".
Jon Hanna
@Jon Hanna: I think you really don't need the separate named template; you could just add @name to the rule matching "sect1". You could also add it to the rules matching first and last, to avoid repeated code in the rule matching a single occurrence (which is both first and last). About your last comment: to test identity you should use `generate-id(../sect1[1]) = generate-id(current())`.
Alejandro
@Jon Hanna: And lastly, the first suggestion requires not only that templates are applied along the child axis, but also an element test, as in `apply-templates select="sect1"`.
Alejandro
+1  A: 

Is it likely that the XSLT processor will stop assembling the preceding-sibling::sect1 node-set after the first found match because it knows that it just needs to find one or zero elements?

I don't know about xsltproc, but Saxon is very good at these sorts of optimizations. I believe it would only check for the first found match because it only needs to know whether the node-set is empty or not.

However you could always make sure by changing your tests as follows:

  <xsl:if test="not(preceding-sibling::sect1[1])">

and

  <xsl:if test="not(following-sibling::sect1[1])">

as this will only test for the first sibling along each axis. Note that the [1] in each case refers to the order of the XPath step, which is the order of the axis, not necessarily document order. So preceding-sibling::sect1[1] refers to the sect1 sibling immediately preceding the current element, not the first sect1 sibling in document order, because preceding-sibling is a reverse axis.
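
To see the axis direction in practice, here is a small sketch, assuming each sect1 has a title child (an assumption made for illustration). It prints, for every sect1, the title selected by preceding-sibling::sect1[1] next to the title selected by the parenthesised form (preceding-sibling::sect1)[1], which in XPath 1.0 is filtered in document order.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="sect1">
    <xsl:value-of select="title"/>
    <!-- [1] on the step itself counts along the reverse axis:
         this is the nearest preceding sect1 sibling. -->
    <xsl:text>: nearest=</xsl:text>
    <xsl:value-of select="preceding-sibling::sect1[1]/title"/>
    <!-- [1] on the parenthesised node-set counts in document order:
         this is the earliest preceding sect1 sibling. -->
    <xsl:text>, earliest=</xsl:text>
    <xsl:value-of select="(preceding-sibling::sect1)[1]/title"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Suppress text reached through the built-in templates (illustration only). -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

For three sibling sect1 elements titled A, B and C, the third one would report B as nearest and A as earliest; for the emptiness test in not(...), either form gives the same answer.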

LarsH
@LarsH: Yes, a specific XSLT processor's optimization could catch this as a shortcut, but from http://www.w3.org/TR/xpath/#predicates `A predicate filters a node-set with respect to an axis to produce a new node-set` and from http://www.w3.org/TR/xpath/#axes `the following-sibling axis contains all the following siblings of the context node`. It's not clear that this is an optimization you can rely on. In this case, as in others, pattern matching is the best approach.
Alejandro
@Alejandro: I agree that one can't guarantee this optimization will occur without knowing which processor is being used. However, the OP was asking about the likelihood of optimization, and I believe the likelihood is high with Saxon. The pieces you quote from the spec specify the semantics rather than the implementation. Even though "the following-sibling axis contains all the following siblings of the context node", that does not mean a processor must instantiate or process all of them in order to be compliant.
LarsH