Given this XML:

<root>
    <list>
        <!-- foo's comment -->
        <item name="foo" />
        <item name="bar" />
        <!-- another foo's comment -->
        <item name="another foo" />
    </list>
</root>

I'd like to use an XPath expression to select all item nodes that have a comment immediately preceding them. That is, I'd like to select the "foo" and "another foo" items, but not the "bar" item.

I have already fiddled about with the preceding-sibling axis and the comment() node test, but to no avail.

A: 

As mentioned in this thread, introducing a test (<xsl:if test="..."></xsl:if>) like:

preceding-sibling::comment()

would only test whether the node has a preceding sibling that's a comment.

If you want to know, of the preceding siblings that are elements or comments, whether the nearest one is a comment, you could try:

(preceding-sibling::*|preceding-sibling::comment())[1][self::comment()] # WRONG

BUT: that won't work, because although "[1]" means the first node in the backwards direction (the nearest one) on the preceding-sibling axis, it doesn't mean that for a parenthesized expression, where it means the first node in document order.
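To make the ordering issue concrete, here is a rough Python sketch. It does not evaluate XPath; it uses the standard library's xml.dom.minidom to emulate what the parenthesized union yields (elements and comments in document order) and then compares picking the first entry with picking the last one. The helper names `preceding_in_document_order` and `select` are made up for this illustration.

```python
from xml.dom import minidom, Node

# The XML from the question.
XML = """<root>
    <list>
        <!-- foo's comment -->
        <item name="foo" />
        <item name="bar" />
        <!-- another foo's comment -->
        <item name="another foo" />
    </list>
</root>"""

WANTED = (Node.ELEMENT_NODE, Node.COMMENT_NODE)


def preceding_in_document_order(node):
    """Preceding siblings that are elements or comments, in document order.

    This mimics the node order of the parenthesized union
    (preceding-sibling::*|preceding-sibling::comment()).
    """
    found = []
    sibling = node.parentNode.firstChild
    while sibling is not node:
        if sibling.nodeType in WANTED:
            found.append(sibling)
        sibling = sibling.nextSibling
    return found


def select(doc, pick):
    """Names of <item>s whose picked preceding sibling is a comment."""
    names = []
    for item in doc.getElementsByTagName("item"):
        candidates = preceding_in_document_order(item)
        if candidates and pick(candidates).nodeType == Node.COMMENT_NODE:
            names.append(item.getAttribute("name"))
    return names


doc = minidom.parseString(XML)
first = select(doc, lambda nodes: nodes[0])   # emulates (...)[1]
last = select(doc, lambda nodes: nodes[-1])   # emulates (...)[last()]
print(first)  # ['foo', 'bar', 'another foo']
print(last)   # ['foo', 'another foo']
```

With the first entry, "bar" is wrongly selected because the first comment in the list is always the earliest candidate; with the last entry, only the nearest sibling is tested.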

You can try:

(preceding-sibling::*|preceding-sibling::comment())[last()][self::comment()]

or

preceding-sibling::node()[self::*|self::comment()][1][self::comment()]

For instance:

<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
  <xsl:output omit-xml-declaration="no" indent="no"/>

  <xsl:template match="//item">
    <xsl:if test="preceding-sibling::node()[self::*|self::comment()][1][self::comment()]">
      <xsl:value-of select="./@name" />
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>

would only display:

foo
another foo

when typing:

C:\Prog\xslt\preceding-sibling_comment>
  java -cp ..\saxonhe9-2-0-6j\saxon9he.jar net.sf.saxon.Transform -s:test.xml -xsl:t.xslt -o:res.xml

with:

  • test.xml: your file displayed in your question
  • t.xslt: the xslt file above
  • res.xml: the resulting transformed file

Edit: since it doesn't take into account processing instructions, I left that answer as Community Wiki.

VonC
Thanks for your thorough answer VonC! I tried the "if" construct and it works for me. I'd rather use the expression from Kevin though, as it is something I can understand (and hopefully remember).
miasbeck
This doesn't work in the situation described by Martin Honnen.
Dimitre Novatchev
@Dimitre: +1 on your answer, and I leave mine as Community Wiki.
VonC
@VonC: Thanks, you were very close, too.
Dimitre Novatchev
+2  A: 

This seems to work:

//comment()/following-sibling::*[1]/self::item

It looks for immediately following siblings of comments which are also <item> elements. I don't know a better way to express the ::*[1]/self::item part, which is ugly; note that if it were written ::item[1] then it would also find <item>s not immediately preceded by a comment.
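The difference between the two spellings only shows up when some other element sits between a comment and the next <item>. The sketch below is a Python emulation using xml.dom.minidom, not an XPath evaluation. The input is hypothetical (the <note/> element is not in the question's XML), the helper names are invented here, and for brevity it only looks at comments that are children of the document element rather than at //comment().

```python
from xml.dom import minidom, Node

# Hypothetical input: <note/> is added only to separate comment B
# from the <item> that follows it.
XML = """<list>
    <!-- comment A -->
    <item name="foo" />
    <!-- comment B -->
    <note />
    <item name="far" />
</list>"""


def following_elements(node):
    """Element siblings after `node`, nearest first (following-sibling::*)."""
    sibling = node.nextSibling
    while sibling is not None:
        if sibling.nodeType == Node.ELEMENT_NODE:
            yield sibling
        sibling = sibling.nextSibling


def select(doc, strict):
    selected = []
    for node in doc.documentElement.childNodes:
        if node.nodeType != Node.COMMENT_NODE:
            continue
        if strict:
            # ::*[1]/self::item -> nearest element, kept only if it is an <item>
            hit = next(following_elements(node), None)
            if hit is not None and hit.tagName != "item":
                hit = None
        else:
            # ::item[1] -> nearest <item>, skipping over other elements
            hit = next(
                (e for e in following_elements(node) if e.tagName == "item"),
                None,
            )
        # XPath returns a node-set, so avoid listing the same node twice.
        if hit is not None and hit not in selected:
            selected.append(hit)
    return [e.getAttribute("name") for e in selected]


doc = minidom.parseString(XML)
strict_names = select(doc, strict=True)
loose_names = select(doc, strict=False)
print(strict_names)  # ['foo']
print(loose_names)   # ['foo', 'far']
```

The loose variant picks up "far" even though a <note/> element separates it from comment B.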

Kevin Reid
What happens if there is a processing instruction between the comment and the 'item' element? Certainly an edge case but the original poster first needs to clarify whether an 'item' element preceded by a processing instruction preceded by a comment is an element he wants to select.
Martin Honnen
@Kevin: Thanks, Kevin. This X-Path is simple enough for me to grasp, and it works perfectly :) @Martin: Luckily I am the master of the Xml input, so I'm sure there won't be any processing instructions. Thanks anyway for the useful hint.
miasbeck
Martin Honnen's comments are correct.
Dimitre Novatchev
+3  A: 

The currently selected solution:

//comment()/following-sibling::*[1]/self::item

doesn't work in the case where there is a processing instruction (or a whole group of processing instructions) between the comment and the element, as noticed in a comment by Martin Honnen.

The solution below doesn't have such a problem.

The following XPath expression selects only item elements that are either immediately preceded by a comment node, or are immediately preceded by a white-space-only text node, which is immediately preceded by a comment node:

(//comment()
   /following-sibling::node()
     [1]
     [self::text()
    and 
     not(normalize-space())
     ]
      /following-sibling::node()
             [1] [self::item]
 ) 
|
(//comment()
   /following-sibling::node()
     [1]
     [self::item]
 ) 

Here is a complete test:

We use this XML document:

<root>
    <list>
        <!-- foo's comment -->
        <item name="foo" />
        <item name="bar" />
        <!-- another foo's comment -->
        <item name="another foo" />
        <!-- comment 3 --><item name="immed.after 3"/>
        <!-- comment 4 --><?PI ?><item name="after PI"/>
    </list>
</root>

When the following transformation is applied on the above XML document:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>


 <xsl:template match="/">
   <xsl:copy-of select=
    "
    (//comment()
       /following-sibling::node()
         [1]
         [self::text()
        and
         not(normalize-space())
         ]
          /following-sibling::node()
                 [1] [self::item]
     )
    |
    (//comment()
       /following-sibling::node()
         [1]
         [self::item]
     )
    "/>
 </xsl:template>
</xsl:stylesheet>

the wanted, correct result is produced:

<item name="foo"/>
<item name="another foo"/>
<item name="immed.after 3"/>
Dimitre Novatchev
Good and complete answer. +1
VonC