How would I select all the tables between the table whose class is header_completed and the first table after it that has align="center"? Here is the HTML I am selecting from:

<table border="0" cellpadding="0" cellspacing="0" width="920" align="center"></table>
<table border="0" cellpadding="0" cellspacing="0" width="920"></table>
<table border="0" cellpadding="0" cellspacing="0" width="920"></table>
<table border="0" cellpadding="0" cellspacing="0" width="920" align="center" class="header_completed"></table>
<table border="0" cellpadding="0" cellspacing="0" width="920"></table> <--
<table border="0" cellpadding="0" cellspacing="0" width="920"></table> <--
<table border="0" cellpadding="0" cellspacing="0" width="920"></table> <-- these 5
<table border="0" cellpadding="0" cellspacing="0" width="920"></table> <--
<table border="0" cellpadding="0" cellspacing="0" width="920"></table> <--
<table border="0" cellpadding="0" cellspacing="0" width="920" align="center"></table>
<table border="0" cellpadding="0" cellspacing="0" width="920"></table>
<table border="0" cellpadding="0" cellspacing="0" width="920"></table>
<table border="0" cellpadding="0" cellspacing="0" width="920" align="center"></table>

I tried using //table[@id="header_completed"]/following-sibling::node()[following-sibling::table[@align="center"][1]] but it didn't work.

+4  A: 

Use the Kayessian method of node-set intersection:

The intersection of two node-sets $ns1 and $ns2 is evaluated by the following XPath expression:

$ns1[count(.| $ns2)=count($ns2)]
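
To see why the count test acts as a membership check, here is a minimal Python sketch that models node-sets as collections of stand-in labels. The function name kayessian_intersection and the labels t1..t13 are hypothetical and used only for illustration; a real XPath engine works on nodes, not strings.

```python
def kayessian_intersection(ns1, ns2):
    """Mimic $ns1[count(.|$ns2) = count($ns2)] with Python sets."""
    ns2 = set(ns2)
    # A node n belongs to ns2 exactly when its union with ns2
    # is no larger than ns2 itself.
    return [n for n in ns1 if len({n} | ns2) == len(ns2)]


# Hypothetical labels for 13 sibling tables in document order.
# t4 plays the header table; t10 is the first centered table after it.
following_header = ["t%d" % i for i in range(5, 14)]   # siblings after t4
preceding_center = ["t%d" % i for i in range(1, 10)]   # siblings before t10

between = kayessian_intersection(following_header, preceding_center)
print(between)
```

Only the labels present in both lists survive the filter, which is the same effect the XPath predicate has on $ns1.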

If we have the following XML document:

<t>
    <table border="0" cellpadding="0" cellspacing="0" width="920" align="center"></table>
    <table border="0" cellpadding="0" cellspacing="0" width="920"></table>
    <table border="0" cellpadding="0" cellspacing="0" width="920"></table>
    <table border="0" cellpadding="0" cellspacing="0" width="920" align="center" class="header_completed"></table>
    <table border="0" cellpadding="0" cellspacing="1" width="920"></table>
    <table border="0" cellpadding="0" cellspacing="2" width="920"></table>
    <table border="0" cellpadding="0" cellspacing="3" width="920"></table>
    <table border="0" cellpadding="0" cellspacing="4" width="920"></table>
    <table border="0" cellpadding="0" cellspacing="5" width="920"></table>
    <table border="0" cellpadding="0" cellspacing="0" width="920" align="center"></table>
    <table border="0" cellpadding="0" cellspacing="0" width="920"></table>
    <table border="0" cellpadding="0" cellspacing="0" width="920"></table>
    <table border="0" cellpadding="0" cellspacing="0" width="920" align="center"></table>
</t>

then according to the question, we have:

$ns1 is:

/*/*[@class='header_completed'][1]
                     /following-sibling::*

$ns2 is:

/*/*[@class='header_completed'][1]
             /following-sibling::*[@align='center'][1]
                   /preceding-sibling::*

We simply substitute $ns1 and $ns2 in the Kayessian formula and get the following XPath expression, which selects exactly the wanted 5 elements:

/*/*[@class='header_completed'][1]
                         /following-sibling::*
              [count(.|/*/*[@class='header_completed'][1]
                            /following-sibling::*[@align='center'][1]
                               /preceding-sibling::*)
              =
               count(/*/*[@class='header_completed'][1]
                            /following-sibling::*[@align='center'][1]
                                /preceding-sibling::*)
              ]

To verify that this is really the solution, we use this XSLT transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:variable name="ns1" select=
      "/*/*[@class='header_completed'][1]
                         /following-sibling::*
      "/>
    <xsl:variable name="ns2" select=
       "/*/*[@class='header_completed'][1]
                 /following-sibling::*[@align='center'][1]
                       /preceding-sibling::*
       "/>

    <xsl:template match="/">
        <xsl:copy-of select=
       "$ns1[count(.| $ns2)=count($ns2)]
       "/>
        <DELIMITER/>
        <xsl:copy-of select=
       "/*/*[@class='header_completed'][1]
                         /following-sibling::*
              [count(.|/*/*[@class='header_completed'][1]
                            /following-sibling::*[@align='center'][1]
                               /preceding-sibling::*)
              =
               count(/*/*[@class='header_completed'][1]
                            /following-sibling::*[@align='center'][1]
                                /preceding-sibling::*)
              ]
       "/>
    </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the XML document above, it produces the wanted result:

<table border="0" cellpadding="0" cellspacing="1" width="920"/>
<table border="0" cellpadding="0" cellspacing="2" width="920"/>
<table border="0" cellpadding="0" cellspacing="3" width="920"/>
<table border="0" cellpadding="0" cellspacing="4" width="920"/>
<table border="0" cellpadding="0" cellspacing="5" width="920"/>
<DELIMITER/>
<table border="0" cellpadding="0" cellspacing="1" width="920"/>
<table border="0" cellpadding="0" cellspacing="2" width="920"/>
<table border="0" cellpadding="0" cellspacing="3" width="920"/>
<table border="0" cellpadding="0" cellspacing="4" width="920"/>
<table border="0" cellpadding="0" cellspacing="5" width="920"/>

XPath 2.0 solution:

In XPath 2.0 we can use the intersect operator together with the node-order comparison operators >> (follows) and << (precedes).

The XPath 2.0 expression that corresponds to the previously used XPath 1.0 expression is:

     /*/*[ .
        >>
         /*/*[@class='header_completed'][1]
         ]

  intersect

    /*/*[ /*/*[@class='header_completed'][1]
                 /following-sibling::*[@align='center'][1]
             >>
              .
        ]
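
As a rough illustration of what the two order comparisons do, the following Python sketch models each table as a dictionary of attributes and uses its list index as its document-order position. The variable names (tables, header, stop) are assumptions made for this sketch and do not come from any XPath API.

```python
# The 13 sibling tables from the example document, attributes only.
tables = [
    {"align": "center"}, {}, {},
    {"align": "center", "class": "header_completed"},
    {}, {}, {}, {}, {},
    {"align": "center"}, {}, {},
    {"align": "center"},
]

# Position of the first table with class="header_completed".
header = next(i for i, t in enumerate(tables)
              if t.get("class") == "header_completed")

# Position of the first centered table after the header.
stop = next(i for i, t in enumerate(tables)
            if i > header and t.get("align") == "center")

# ". >> header": nodes that come after the header in document order.
after_header = {i for i in range(len(tables)) if i > header}
# "stop >> .": nodes that the closing centered table comes after.
before_stop = {i for i in range(len(tables)) if stop > i}

# "intersect": nodes satisfying both conditions, in document order.
between = sorted(after_header & before_stop)
print(between)  # zero-based positions of the selected tables
```

The sketch compares positions among siblings only, whereas >> and << in XPath 2.0 compare any two nodes in document order.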

Here is an XSLT 2.0 transformation that verifies this XPath 2.0 expression:

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:variable name="ns1" select=
  "/*/*[ .
        >>
         /*/*[@class='header_completed'][1]
       ]
  "/>

    <xsl:variable name="ns2" select=
       "/*/*[ /*/*[@class='header_completed'][1]
                 /following-sibling::*[@align='center'][1]
             >>
              .
             ]
       "/>

 <xsl:template match="/">
   <xsl:sequence select="$ns1 intersect $ns2"/>
  <DELIMITER/>
   <xsl:sequence select=
   "/*/*[ .
        >>
         /*/*[@class='header_completed'][1]
       ]

  intersect

    /*/*[ /*/*[@class='header_completed'][1]
                 /following-sibling::*[@align='center'][1]
             >>
              .
        ]
   "/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the XML document defined earlier, we again get the same correct result:

<table border="0" cellpadding="0" cellspacing="1" width="920"/>
<table border="0" cellpadding="0" cellspacing="2" width="920"/>
<table border="0" cellpadding="0" cellspacing="3" width="920"/>
<table border="0" cellpadding="0" cellspacing="4" width="920"/>
<table border="0" cellpadding="0" cellspacing="5" width="920"/>
<DELIMITER/>
<table border="0" cellpadding="0" cellspacing="1" width="920"/>
<table border="0" cellpadding="0" cellspacing="2" width="920"/>
<table border="0" cellpadding="0" cellspacing="3" width="920"/>
<table border="0" cellpadding="0" cellspacing="4" width="920"/>
<table border="0" cellpadding="0" cellspacing="5" width="920"/>
Dimitre Novatchev
+1 Very elegant solution. One minor issue: you have hard-coded the use of the *third* element with `@align='center'`, whereas OP indicates the *first element after* `@class='header_completed'` that has `@align='center'`.
Niels van der Rest
@Niels-van-der-Rest: Yes, thank you for noticing this. I will update my answer later today.
Dimitre Novatchev
+1 A very insightful answer and one that is very useful. It had great explanations and proofs. Thank you.
Alex Nolan
+2  A: 

I believe this XPath expression selects the nodes you want:

//table[@class="header_completed"]/following-sibling::table[@align="center"][1]/preceding-sibling::table[preceding-sibling::table[@class="header_completed"]]

First I navigate to the table with @class="header_completed". From there I select the first following sibling table with @align="center". From there I select all preceding sibling tables that have a preceding sibling which is the table with @class="header_completed".
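
The navigation described above can be approximated with a plain loop over the sibling tables. This is only a sketch using Python's standard xml.etree.ElementTree module; the helper name tables_between is hypothetical, and it assumes the document contains a single header_completed table whose siblings are all direct children of one parent.

```python
import xml.etree.ElementTree as ET

DOC = """<t>
  <table align="center"/>
  <table/>
  <table/>
  <table align="center" class="header_completed"/>
  <table cellspacing="1"/>
  <table cellspacing="2"/>
  <table cellspacing="3"/>
  <table cellspacing="4"/>
  <table cellspacing="5"/>
  <table align="center"/>
  <table/>
  <table/>
  <table align="center"/>
</t>"""


def tables_between(root):
    """Collect tables after the header table, up to the next centered table."""
    selected = []
    seen_header = False
    for table in root.findall("table"):
        if not seen_header:
            # Skip everything up to and including the header table.
            seen_header = table.get("class") == "header_completed"
            continue
        if table.get("align") == "center":
            # First centered table after the header closes the range.
            return selected
        selected.append(table)
    # No closing centered table was found, so nothing is selected,
    # which matches what the XPath expression would return.
    return []


root = ET.fromstring(DOC)
result = [t.get("cellspacing") for t in tables_between(root)]
print(result)
```

The cellspacing values serve only as markers so the five selected tables can be told apart from the rest.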

mwittrock
Wrikken
+1 A great solution and one that is easy to comprehend. Thank you very much.
Alex Nolan
@Wrikken That is a challenge for sure. I'll have to think about that one.
mwittrock