I am new to parsers. I would like to fetch specific data from a website, and I understand I need a parser for that. How do I get started with parsers? What do I need to download? What would the Java code look like to fetch data from a website using a parser?

A: 

My advice would be to use an open-source HTML parser such as HtmlCleaner: http://htmlcleaner.sourceforge.net/

You can use HtmlCleaner (or a similar library) to build a DOM representation of the web page, and then query that DOM to extract whatever information you want.

The process looks something like this:

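import java.net.URL;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;

URL url = new URL("website you want to load"); // placeholder: use the page's full URL
HtmlCleaner cleaner = new HtmlCleaner();
TagNode htmlNode = cleaner.clean(url.openStream()); // root node of the cleaned-up DOM
// perform queries on the DOM to extract information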
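
If you want to try the idea before downloading anything, here is a rough sketch that uses the HTML parser bundled with the JDK (javax.swing.text.html) instead of HtmlCleaner. Note that this is a different technique: it is callback-based rather than DOM-based, so the parser calls your code for each tag it meets instead of handing you a tree. The class name LinkExtractor, the method extractLinks, and the sample HTML are made up for illustration, and the sketch assumes you only want the href of every link on the page.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, not part of HtmlCleaner or the JDK.
final class LinkExtractor {

    // Returns the href value of every <a> tag found in the given HTML text.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();

        // The parser invokes this callback once per opening tag.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        links.add(href.toString());
                    }
                }
            }
        };

        try {
            // The third argument tells the parser to ignore any charset declared in the page.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return links;
    }

    public static void main(String[] args) {
        // Sample input made up for this sketch.
        String html = "<html><body><a href=\"/first\">one</a> <a href=\"/second\">two</a></body></html>";
        System.out.println(extractLinks(html));
    }
}
```

To run it against a real page you would first read the page into a String (for example from url.openStream()) and pass that to extractLinks. For anything beyond simple extraction, a DOM-based library such as HtmlCleaner is usually easier to work with.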
Finbarr