Possible Duplicate:
HTML/XML Parser for Java

Can anyone suggest a good HTML parser? I need the following qualities in it:

  1. Accuracy: it should be able to parse effectively.
  2. Auto-correction: it should be able to parse even if the HTML page is not well-formed (i.e. even if some tags are not closed properly).
  3. Speed.

I am looking for a Java HTML parser.
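For anyone comparing candidates against the criteria above, a small harness that feeds in deliberately broken markup can help. The sketch below is only an illustration: it uses the JDK's built-in `javax.swing.text.html.parser.ParserDelegator` as a dependency-free stand-in (it is not one of the libraries recommended in the answers, and it targets an old HTML dialect), and the class name `LenientParseCheck` and method `extractLinks` are hypothetical names chosen for this example. The same check could be repeated with any other parser by swapping the body of `extractLinks`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical harness: checks whether a parser recovers links from malformed HTML.
public class LenientParseCheck {

    // Collects the href value of every <a> start tag the parser reports.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        links.add(href.toString());
                    }
                }
            }
        };
        try {
            // Third argument: ignore any charset declared inside the document.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return links;
    }

    public static void main(String[] args) {
        // Neither the <p> nor the <a> elements are closed.
        String malformed =
            "<html><body><p>Intro<a href=\"a.html\">one<p>Next <a href=\"b.html\">two</body>";

        long start = System.nanoTime();
        List<String> links = extractLinks(malformed);
        long micros = (System.nanoTime() - start) / 1_000;

        // A parser that recovers from the missing end tags should report both hrefs.
        System.out.println("links: " + links);
        // A single run is only a rough indication of speed, not a benchmark.
        System.out.println("elapsed: " + micros + " microseconds");
    }
}
```

A real comparison would run the same inputs through each candidate many times, on pages representative of what you actually need to parse.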

+1  A: 

Validator.nu's HTML parser, which is an implementation of the HTML5 parsing algorithm and is used by recent versions of Gecko.

Ms2ger
+1  A: 

I'm a happy user of Jericho HTML Parser on real HTML. It should fit your criteria.

Anthony
+1  A: 

Very happily using NekoHTML here. It is just a thin veneer over the Apache Xerces parser, enabling various error-correction features, so it has a very solid basis.

EJP