We have a Java widget that does some basic parsing on arbitrary XHTML documents, and we've been using jTidy to clean them up before processing.
For a couple of reasons (which are outside the scope of this particular question), we're looking to replace jTidy with a different library.
Can anyone recommend something? We're looking for a library that will take a URI, clean up the XML, and produce an object that implements org.w3c.dom.Document (or something that can be converted to a Document without too much effort or damage).
And, as you might imagine, something that's free is also a bonus.
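
To make the requirement concrete, here is a rough sketch of the contract we'd like a replacement to fit. The names MarkupCleaner and StrictParser are made up for illustration and do not come from any library. The placeholder implementation uses only the JDK's strict parser, so it does no cleanup at all and fails on malformed input; it only shows the URI-in, Document-out shape.

```java
import java.io.InputStream;
import java.net.URI;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class CleanerContract {

    /** Hypothetical interface: the shape a jTidy replacement would ideally fit. */
    public interface MarkupCleaner {
        Document clean(URI source) throws Exception;
    }

    /**
     * Placeholder implementation using the JDK's strict XML parser.
     * It performs no cleanup and throws on markup that is not well-formed,
     * which is exactly the gap a tidying library would need to fill.
     */
    public static class StrictParser implements MarkupCleaner {
        @Override
        public Document clean(URI source) throws Exception {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Open the resource behind the URI and parse it into a DOM tree.
            try (InputStream in = source.toURL().openStream()) {
                return builder.parse(in, source.toString());
            }
        }
    }
}
```

Ideally a suggested library could be wrapped behind something like MarkupCleaner, so the rest of our parsing code keeps working against org.w3c.dom.Document.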