I'm writing a Java program that scrapes a web page for links and stores them in a database, but I've run into a problem. Using HtmlUnit, I wrote the following:
page.getByXPath("//a[starts-with(@href, \"showdetails.aspx\")]");
It returns the correct anchor elements, but I only want the value of each href attribute (the actual path), not the whole element. How can I do this? Also, how can I get the text contained between the opening and closing tags, as in:
<a href="">I need this data, too.</a>
Thanks in advance!
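To make the goal concrete, here is a minimal sketch of the two extractions. It uses the JDK's built-in javax.xml.xpath API rather than HtmlUnit, so it only works on well-formed markup. The class name LinkExtractor, the helper select, and the SAMPLE markup are all hypothetical and exist only for illustration. The idea is to end the XPath expression with /@href to select the attribute itself, and with /text() to select the text inside the element. Whether the same expressions behave identically when passed to HtmlUnit's getByXPath is an assumption I have not verified.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class LinkExtractor {

    // Hypothetical, well-formed sample page; a real page would come from the scraper.
    static final String SAMPLE =
        "<html><body>"
        + "<a href=\"showdetails.aspx?id=1\">First item</a>"
        + "<a href=\"other.aspx\">Skip me</a>"
        + "<a href=\"showdetails.aspx?id=2\">Second item</a>"
        + "</body></html>";

    // Hypothetical helper: evaluates an XPath expression against well-formed
    // markup and returns the string value of every matched node.
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // For attribute and text nodes, getTextContent() is the node's value.
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath or markup: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String anchors = "//a[starts-with(@href, 'showdetails.aspx')]";
        // Trailing /@href selects the attribute nodes instead of the elements.
        System.out.println(select(SAMPLE, anchors + "/@href"));
        // Trailing /text() selects the text nodes between the tags.
        System.out.println(select(SAMPLE, anchors + "/text()"));
    }
}
```

In HtmlUnit itself, the other route would be to keep the original expression, cast each result to HtmlAnchor, and call getHrefAttribute() for the path and getTextContent() for the text. I believe both methods exist, but I have not tested them against this page.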