views:

46

answers:

3

I have an HTML snippet like this:

<a href="XXXXXXXXXXXXXXX" target="_blank" class="view_job_link">View or apply to job</a>

I want to read the href value (XXXXXXXXXXXXXXX) using Java.

Point to note: I am reading the HTML file from a URL using `new InputStreamReader(url.openStream())`.

I get the complete HTML file, and the above snippet is one part of it.

How can I do this?

Thanks

Karunjay Anand

+1  A: 

Use an HTML parser like TagSoup or something similar.

Taylor Leese
+3  A: 

Use an HTML parser like Jsoup. The API is easy to learn, and for your case the following code snippet will do:

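import java.net.URL;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

URL url = new URL("http://example.com/");
Document doc = Jsoup.parse(url, 3 * 1000); // fetch and parse, 3-second timeout
Elements links = doc.select("a[href]"); // every <a> that has an href; "a.view_job_link[href]" narrows it to the link in the question
for (Element link : links) {
    System.out.println("Href = " + link.attr("abs:href")); // "abs:" resolves relative URLs against the page URL
}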
chedine
You can also use `link.absUrl("href")`.
BalusC
A: 

You can use Java's own HTMLEditorKit for parsing HTML. This way you won't need to depend on any third-party HTML parser. Here is an example of how to use it.
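A minimal sketch of that approach, assuming the callback-style parser that ships with Swing (`HTMLEditorKit.ParserCallback` driven by `ParserDelegator`). The class name `HrefExtractor`, the method `extractHrefs`, and the example URL in `main` are made up for illustration; only the `view_job_link` class name comes from the question. The `Reader` argument could just as well be the `new InputStreamReader(url.openStream())` you already have.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class HrefExtractor {

    // Hypothetical helper: returns the href of every <a> tag whose
    // class attribute is exactly cssClass.
    public static List<String> extractHrefs(Reader html, String cssClass) {
        List<String> hrefs = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag != HTML.Tag.A) {
                    return; // only anchors are of interest
                }
                Object cls = attrs.getAttribute(HTML.Attribute.CLASS);
                Object href = attrs.getAttribute(HTML.Attribute.HREF);
                if (href != null && cls != null && cssClass.equals(cls.toString())) {
                    hrefs.add(href.toString());
                }
            }
        };
        try {
            // Third argument true: ignore any charset declared inside the document.
            new ParserDelegator().parse(html, callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hrefs;
    }

    public static void main(String[] args) {
        // Stand-in for the page read from the URL; the href value is a placeholder.
        String html = "<html><body>"
                + "<a href=\"http://example.com/job/1\" target=\"_blank\" class=\"view_job_link\">View or apply to job</a>"
                + "</body></html>";
        for (String href : extractHrefs(new StringReader(html), "view_job_link")) {
            System.out.println("Href = " + href);
        }
    }
}
```

Unlike Jsoup's `abs:href`, this returns the href exactly as written in the page, so relative links would still need resolving against the page URL (for example with `new URL(baseUrl, href)`).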

Gopi