+1  Q: 

How to parse HTML

Hi all, I have downloaded the Java HTML Parser library, but I don't know how to use its API to extract data from HTML. Can you give an example so that I can work from it?

A: 

Take a look at HTMLParser.

taichimaro
Any code sample please? Then the OP can compare.
BalusC
+2  A: 

You're talking about HtmlParser? I'd rather pick a parser with a less verbose API, like Jsoup. All you then need to learn are CSS selectors, which should already be familiar to the average frontend developer.

Here's a kickoff example which displays your current question and the names of all answerers:

package com.stackoverflow.q3416036;

import java.net.URL;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Test {

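    public static void main(String[] args) throws Exception {
        // Fetch and parse the page; the second argument is the timeout in milliseconds.
        URL url = new URL("http://stackoverflow.com/questions/3416036");
        Document document = Jsoup.parse(url, 3000);

        // CSS selector: elements with class "post-text" inside the element with id "question".
        String question = document.select("#question .post-text").text();
        System.out.println("Question: " + question);

        // CSS selector: every link inside a "user-details" block within the answers section.
        Elements answerers = document.select("#answers .user-details a");
        for (Element answerer : answerers) {
            System.out.println("Answerer: " + answerer.text());
        }
    }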

}
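
If you want to experiment with the selectors without hitting the network, here is a minimal sketch that parses an HTML string instead of a URL. The class name SelectorSketch, the helper methods and the sample markup are made up for illustration (the markup only mimics the ids and class names used in the selectors above); the Jsoup calls themselves (Jsoup.parse(String), select, text) are the same ones as in the example above.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SelectorSketch {

    // Hypothetical markup, only loosely modelled on the page structure assumed above.
    static final String SAMPLE_HTML =
          "<div id='question'><div class='post-text'>How do I parse HTML?</div></div>"
        + "<div id='answers'>"
        + "<div class='user-details'><a href='/users/1'>alice</a></div>"
        + "<div class='user-details'><a href='/users/2'>bob</a></div>"
        + "</div>";

    // Returns the text of the question body.
    static String questionText(String html) {
        Document document = Jsoup.parse(html);
        return document.select("#question .post-text").text();
    }

    // Returns the link text of every answerer, in document order.
    static List<String> answererNames(String html) {
        Document document = Jsoup.parse(html);
        List<String> names = new ArrayList<String>();
        for (Element link : document.select("#answers .user-details a")) {
            names.add(link.text());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Question: " + questionText(SAMPLE_HTML));
        for (String name : answererNames(SAMPLE_HTML)) {
            System.out.println("Answerer: " + name);
        }
    }
}
```

Once the selectors return what you expect on a small string like this, you can swap the string for the real page.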

BalusC