<html>
    <body>
        <table>
            <tr>
                <th>HeaderA</th>
                <th>HeaderB</th>
                <th>HeaderC</th>
                <th>HeaderD</th>
            </tr>
            <tr>
                <td>ContentA</td>
                <td>ContentB</td>
                <td>ContentC</td>
                <td>ContentD</td>
            </tr>
         </table>
    </body>
</html>

I am looking for the most efficient way to select the content 'td' node based on the heading in the corresponding 'th' node.

My current XPath expression:

/html/body/table/tr/td[count(/html/body/table/tr/th[text() = 'HeaderA']/preceding-sibling::*)+1]

Some questions:

  • Can you use relative paths (../..) inside count()?
  • What other options are there for finding the position of the current node (the td[?] index), or is count(.../preceding-sibling::*)+1 the most efficient?
A: 

I would leave XPath aside. Since I assume the document has already been parsed into a DOM, I'd use a Map data structure and match the nodes manually, on either the client side or the server side (JavaScript / Java).

It seems to me XPath is being stretched beyond its limits here.
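
A minimal sketch of the Map idea, written in Python rather than the JavaScript or Java mentioned above, and assuming the markup is well-formed enough to parse as XML. The helper names header_index_map and cells_for_header are made up for illustration; this is one possible reading of the suggestion, not a definitive implementation.

```python
import xml.etree.ElementTree as ET

# The sample document from the question (assumed to be well-formed XML).
HTML = """<html>
    <body>
        <table>
            <tr><th>HeaderA</th><th>HeaderB</th><th>HeaderC</th><th>HeaderD</th></tr>
            <tr><td>ContentA</td><td>ContentB</td><td>ContentC</td><td>ContentD</td></tr>
        </table>
    </body>
</html>"""


def header_index_map(table):
    """Map each header text to its 0-based column index (hypothetical helper)."""
    header_row = table.find("tr")  # assumes the first tr holds the th cells
    return {th.text: i for i, th in enumerate(header_row.findall("th"))}


def cells_for_header(table, header):
    """Return the text of every td in the column under the given header."""
    col = header_index_map(table)[header]
    cells = []
    for row in table.findall("tr"):
        tds = row.findall("td")
        if col < len(tds):  # the header row has no td cells, so it is skipped
            cells.append(tds[col].text)
    return cells


root = ET.fromstring(HTML)
table = root.find("body/table")
print(cells_for_header(table, "HeaderC"))  # prints ['ContentC']
```

The map is built once per table, so looking up further columns only costs a dictionary access plus one pass over the rows.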

Ehrann Mehdan
I still think XPath is not the best solution here, voting down won't change my mind... or the facts...
Ehrann Mehdan
I understand and appreciate your comment. What I am looking for is the most efficient method using XPath. I can then run real-world benchmarks in my environment using all available options (XPath, Java, JavaScript, etc.) to settle on a final solution. Thanks for your comment.
chameleon95
+2  A: 
  • It is possible to use relative paths inside count()
  • I have never heard of another way to find the node number...

Here is the expression with a relative path inside count():

/html/body/table/tr/td[count(../../tr/th[text()='HeaderC']/preceding-sibling::*)+1]

Admittedly, it is not much shorter. In my opinion it won't get any shorter than this:

//td[count(../..//th[text()='HeaderC']/preceding-sibling::*)+1]
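
To make explicit what the count(...)+1 predicate computes, here is a rough Python sketch using only the standard library. ElementTree's XPath subset does not support count() or the preceding-sibling axis, so the position is worked out in Python and then plugged into a plain positional predicate. The names column_position and select_cells are hypothetical, and the markup is assumed to parse as XML.

```python
import xml.etree.ElementTree as ET

# The sample document from the question (assumed to be well-formed XML).
HTML = """<html>
    <body>
        <table>
            <tr><th>HeaderA</th><th>HeaderB</th><th>HeaderC</th><th>HeaderD</th></tr>
            <tr><td>ContentA</td><td>ContentB</td><td>ContentC</td><td>ContentD</td></tr>
        </table>
    </body>
</html>"""


def column_position(table, header):
    """1-based position of the matching th among its siblings.

    Equivalent in spirit to count(th[text()=header]/preceding-sibling::*) + 1.
    """
    for row in table.findall("tr"):
        for pos, cell in enumerate(list(row), start=1):
            if cell.tag == "th" and cell.text == header:
                return pos
    raise KeyError(header)


def select_cells(table, header):
    """Select the td at the header's position in every row."""
    pos = column_position(table, header)
    # ElementTree does support simple positional predicates such as td[3].
    return table.findall(f"tr/td[{pos}]")


root = ET.fromstring(HTML)
table = root.find("body/table")
print(column_position(table, "HeaderC"))                      # prints 3
print([td.text for td in select_cells(table, "HeaderC")])     # prints ['ContentC']
```

Computing the position once and reusing it means the th lookup is not repeated for every td. Whether a given XPath engine repeats that inner lookup for the original expression depends on the implementation.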
Harmen
Excellent. I am not so much looking for the shortest way to write the expression as for the most efficient one, to minimise the internal lookups.
chameleon95
A: 

Perhaps you want position() and XPath axes?

mst
A: 

Harmen's answer is exactly what you need for a pure XPath solution.

If you are really concerned with performance, then you could define an XSLT key:

<xsl:key name="columns" match="/html/body/table/tr/th" use="text()"/>

and then use the key in your predicate filter:

/html/body/table/tr/td[count(key('columns', 'HeaderC')/preceding-sibling::th)+1]

However, I suspect you won't see a measurable difference in performance unless you filter on columns very often (e.g. for-each loops that check every row of a really large document).

Mads Hansen