I have been experimenting with Watir, Nokogiri and Hpricot. All of them take a top-down approach, which is my problem: they search for an element by its type. I want to find an element by its text, without knowing the element type. For example:

<element1> 
    <element2> Text2 </element2>
    <element3> Text3 </element3>
     text4
</element1>

What I want is to get element2 (and its parent, element1) by searching for Text2, and likewise element3 by searching for Text3.

Please note that I do not know whether the elements are divs, tr/tds, links and so on. I only know the text. The algorithm should be something like: iterate through all the elements, match the inner text, and on a match return the element and its parent element.

Please let me know if this is possible in any way.

+1  A: 

I don't have a complete answer, but you can use the text() functionality, outlined in the wiki (See Searching Inner HTML).

doc.search("*[text()='Text3']")

will return

#<Hpricot::Elements[{elem <element3> " Text3 " </element3>}, " Text3 "]>

You could then iterate through these and check they are actual elements:

doc.search("*[text()='Text3']")[0].elem?

This returns true, whereas index [1] returns false because it is the bare text rather than an element. Where this falls down is when you search for text4, because that returns:

#<Hpricot::Elements["\n     text4\n"]>

i.e. only the text, not the element that contains it. So perhaps in these cases (how you detect them in general I don't know) you could check whether the result is an element and, if it is not, take its parent:

doc.search("*[text()='text4']")[0].parent
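The check-then-fall-back-to-parent idea can also be turned around: walk every element and look only at its own direct text, which covers the text4 case without a special branch. What follows is a minimal sketch, not a tested Hpricot solution. It assumes REXML (shipped with standard Ruby installs) instead of Hpricot, so the markup has to be well-formed, and `elements_with_text` is a hypothetical helper name, not part of any library.

```ruby
require 'rexml/document'

# Sample markup from the question.
MARKUP = <<~XML
  <element1>
      <element2> Text2 </element2>
      <element3> Text3 </element3>
       text4
  </element1>
XML

# Hypothetical helper: return every element that has a direct text-node
# child whose value, with surrounding whitespace stripped, equals `wanted`.
def elements_with_text(doc, wanted)
  found = []
  # each_recursive visits every descendant element, whatever its tag name.
  doc.each_recursive do |element|
    found << element if element.texts.any? { |t| t.value.strip == wanted }
  end
  found
end

doc = REXML::Document.new(MARKUP)

%w[Text2 Text3 text4].each do |wanted|
  elements_with_text(doc, wanted).each do |element|
    parent = element.parent
    # The root element's parent is the document itself, which has no tag name.
    parent_name = parent.is_a?(REXML::Document) ? '(document root)' : parent.name
    puts "#{wanted}: element=#{element.name}, parent=#{parent_name}"
  end
end
```

Because only an element's own text children are compared, searching for text4 yields element1 directly, and searching for Text2 yields element2 with element1 as its parent.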

Sorry I don't have a complete answer, but thought the "text()" thing would be worth mentioning for now.

i5m
+1  A: 

Watir has XPath support. I am not really familiar with XPath but I am pretty sure it would do what you need. Something like:

browser.element_by_xpath("some_xpath_magic").click
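As a guess at what that XPath could look like: XPath can select text nodes regardless of the enclosing tag, and the parent of a matching text node is the element being sought. The sketch below tries the idea with REXML's XPath support rather than a live Watir browser, so it assumes well-formed markup. `parents_of_text` is a hypothetical helper name, and I have not checked how the same idea behaves through `element_by_xpath`.

```ruby
require 'rexml/document'
require 'rexml/xpath'

# Sample markup from the question.
SAMPLE = <<~XML
  <element1>
      <element2> Text2 </element2>
      <element3> Text3 </element3>
       text4
  </element1>
XML

# Hypothetical helper: select every text node with XPath, keep those whose
# stripped value equals `wanted`, and return the elements that contain them.
def parents_of_text(doc, wanted)
  REXML::XPath.match(doc, '//text()')
              .select { |text| text.value.strip == wanted }
              .map(&:parent)
              .uniq # an element with several matching text nodes is listed once
end

doc = REXML::Document.new(SAMPLE)

parents_of_text(doc, 'Text2').each do |element|
  puts "found #{element.name}"
end
```

The whitespace trimming is done in Ruby here to keep the XPath trivial. In plain XPath 1.0 the same query can be written as `//*[text()[normalize-space(.)='Text2']]`, which selects any element that has a direct text child equal to Text2 after whitespace normalization.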

I would also suggest posting your question at watir-general.

Željko Filipin