Dear All:

I would like to evaluate an XPath expression relative to a given element.

I have been reading here: http://www.w3schools.com/xpath/default.asp

It seems that one of the syntaxes below should work (especially the one with no leading slash, or descendant:).

However, none of them seems to work in HtmlUnit. Any help is much appreciated (this is a Groovy script, by the way). Thank you!

http://htmlunit.sourceforge.net/

http://groovy.codehaus.org/

Misha


#!/usr/bin/env groovy

import com.gargoylesoftware.htmlunit.WebClient

def html="""
<html><head><title>Test</title></head>
<body>
<div class='levelone'>
 <div class='leveltwo'>
    <div class='levelthree' />
 </div>
 <div class='leveltwo'>
    <div class='levelthree' />
    <div class='levelthree' />
 </div>
</div>

</body>
</html>
"""

def f=new File('/tmp/test.html')
// Assigning to text overwrites any existing file and closes it afterwards
f.text=html

def webClient=new WebClient()
def page=webClient.getPage('file:///tmp/test.html')

def element=page.getByXPath("//div[@class='levelone']")
assert element.size()==1
element=page.getByXPath("div[@class='levelone']")
assert element.size()==0
element=page.getByXPath("/div[@class='levelone']")
assert element.size()==0
element=page.getByXPath("descendant:div[@class='levelone']") // this gives a namespace error
assert element.size()==0

Thank you!!!

A: 

The question does not say which element the XPath expressions are evaluated relative to. Assuming it is the document node, any of the following XPath expressions will select the desired node:

   */*/div[@class='levelone']

   html/body/div[@class='levelone']

   descendant::div[@class='levelone']
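
To make the role of the context node concrete, here is a minimal sketch that evaluates the same expressions first against the document node and then against the `levelone` element. It is an illustration only: it uses the JDK's `javax.xml.xpath` API rather than HtmlUnit so that it runs without extra libraries, it parses the test page as XML (which works here because the markup is well-formed), and the class and helper names (`RelativeXPathDemo`, `parse`, `first`, `count`) are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of HtmlUnit or of the original script.
final class RelativeXPathDemo {
    // Same structure as the test page in the question, without the whitespace.
    static final String HTML =
        "<html><head><title>Test</title></head><body>"
        + "<div class='levelone'>"
        + "<div class='leveltwo'><div class='levelthree'/></div>"
        + "<div class='leveltwo'><div class='levelthree'/><div class='levelthree'/></div>"
        + "</div></body></html>";

    // Parses the markup with the JDK XML parser (assumption: input is well-formed).
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the first node selected by expr, using context as the context node.
    static Node first(Node context, String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Node) xpath.evaluate(expr, context, XPathConstants.NODE);
    }

    // Returns how many nodes expr selects, using context as the context node.
    static int count(Node context, String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expr, context, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(HTML);

        // Context node = document node: its only child element is html.
        System.out.println(count(doc, "div[@class='levelone']"));             // 0
        System.out.println(count(doc, "html/body/div[@class='levelone']"));   // 1
        System.out.println(count(doc, "descendant::div[@class='levelone']")); // 1

        // Context node = the levelone element: paths are now relative to it.
        Node levelOne = first(doc, "descendant::div[@class='levelone']");
        System.out.println(count(levelOne, "div[@class='leveltwo']"));               // 2
        System.out.println(count(levelOne, "descendant::div[@class='levelthree']")); // 3
    }
}
```

A path without a leading slash is resolved against whatever node the expression is evaluated on, which is why `div[@class='levelone']` finds nothing from the document node while `div[@class='leveltwo']` finds both children from the `levelone` element. In HtmlUnit the analogous step would presumably be to call `getByXPath` on the element returned by the first query instead of on the page, but note that an HTML parser may nest the self-closed `div` tags differently from an XML parser.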

You may have a problem if the actual XML document (not shown) declares a default namespace. In that case you need to register this namespace in your XPath-hosting language (I don't know Groovy) and use the associated prefix, like this:

   */*/x:div[@class='levelone']

   x:html/x:body/x:div[@class='levelone']

   descendant::x:div[@class='levelone']
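
As a rough sketch of what registering the namespace can look like on the JVM, the following uses the JDK's `javax.xml.xpath` API with a `NamespaceContext` that binds the prefix `x`. The XHTML namespace URI, the sample markup, and the names `NamespacedXPathDemo`, `parse` and `count` are assumptions chosen for illustration; HtmlUnit may handle namespaces differently.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the names here are not from the original thread.
final class NamespacedXPathDemo {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Assumed sample document: every element is in the default (XHTML) namespace.
    static final String XHTML =
        "<html xmlns='" + XHTML_NS + "'><body><div class='levelone'/></body></html>";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // without this, namespaces are ignored
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Counts the nodes selected by expr, with the prefix "x" bound to XHTML_NS.
    static int count(Node context, String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "x".equals(prefix) ? XHTML_NS : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String uri) {
                return XHTML_NS.equals(uri) ? "x" : null;
            }

            @Override
            public Iterator<String> getPrefixes(String uri) {
                String prefix = getPrefix(uri);
                if (prefix == null) {
                    return Collections.<String>emptyList().iterator();
                }
                return Collections.singletonList(prefix).iterator();
            }
        });
        NodeList nodes = (NodeList) xpath.evaluate(expr, context, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XHTML);
        // An unprefixed name test only matches elements in no namespace.
        System.out.println(count(doc, "descendant::div[@class='levelone']"));     // 0
        // The prefixed form matches the namespaced element.
        System.out.println(count(doc, "descendant::x:div[@class='levelone']"));   // 1
        System.out.println(count(doc, "x:html/x:body/x:div[@class='levelone']")); // 1
    }
}
```

The prefix `x` is arbitrary; what matters is that it is bound to the same namespace URI that the document declares as its default.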
Dimitre Novatchev
A: 

Thank you so much. Apparently my error was using a single colon after descendant rather than two (doh).

#!/usr/bin/env groovy

import com.gargoylesoftware.htmlunit.WebClient

def html="""
<html><head><title>Test</title></head>
<body>
<div class='levelone'>
  <div class='leveltwo'>
     <div class='levelthree' />
  </div>
  <div class='leveltwo'>
     <div class='levelthree' />
     <div class='levelthree' />
  </div>
</div>

</body>
</html>
"""

def f=new File('/tmp/test.html')
// Assigning to text overwrites any existing file and closes it afterwards
f.text=html

def webClient=new WebClient()
def page=webClient.getPage('file:///tmp/test.html')

def element=page.getByXPath("//div[@class='levelone']")
assert element.size()==1
element=page.getByXPath("div[@class='levelone']")
assert element.size()==0
element=page.getByXPath("/div[@class='levelone']")
assert element.size()==0
element=page.getByXPath("descendant::div[@class='levelone']")
assert element.size()==1

Doh!

Thank you!

Misha

Misha Koshelev