I'm running into an issue with the HtmlUnit parser. I'm trying to fetch some XML from a website (using the website's API), do a quick parse of the resulting XML, and then save the XML to a file (all within the rights of the API). (sample content)

Unfortunately, the website returns the entity &iquest; (¿) in some of the requested pages. Although this is a valid HTML entity, HtmlUnit throws an exception during the parse with the message:

The entity "iquest" was referenced, but not declared.

How do I define iquest as a valid entity?

A: 
Mark
Fair enough. I'd love to be able to intercept the stream and use the HtmlUnit parser. Instead, I'm taking the content stream and parsing it outside of the HtmlUnit framework, with the undeclared entities stripped first.
Mark E
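
For anyone landing here later, the workaround described above could be sketched roughly as follows. This is not HtmlUnit code and not necessarily what was actually used: it assumes the response body is already available as a String, and the class name EntityCleaner, its method names, and the tiny entity table are hypothetical names made up for illustration. It needs Java 9 or later. Named entities that XML does not predefine are either rewritten as numeric character references (when the table knows them) or stripped, and the result is handed to the JDK's own XML parser.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: cleans HTML-only named entities out of XML text.
final class EntityCleaner {
    // The five entities every XML parser understands without a DTD.
    private static final Set<String> XML_PREDEFINED =
            Set.of("amp", "lt", "gt", "quot", "apos");

    // Deliberately small sample table (an assumption); extend it as needed.
    private static final Map<String, Integer> HTML_ENTITIES =
            Map.of("iquest", 0xBF, "nbsp", 0xA0);

    // Matches named references such as &iquest; but not numeric ones like &#191;.
    private static final Pattern NAMED_ENTITY =
            Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    // Note: this is a plain text substitution, so it also rewrites
    // matches that happen to sit inside CDATA sections or comments.
    static String cleanEntities(String xml) {
        Matcher m = NAMED_ENTITY.matcher(xml);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(1);
            String replacement;
            if (XML_PREDEFINED.contains(name)) {
                replacement = m.group();                          // keep as-is
            } else if (HTML_ENTITIES.containsKey(name)) {
                replacement = "&#" + HTML_ENTITIES.get(name) + ";"; // numeric form
            } else {
                replacement = "";                                 // unknown: strip
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Cleans the text and parses it with the JDK's default DOM parser.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(cleanEntities(xml))));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse cleaned XML", e);
        }
    }
}
```

Calling EntityCleaner.parse(body) on the fetched text should then get past the "referenced, but not declared" error for any entity the table covers. Rewriting to a numeric reference keeps the character in the saved file, whereas plain stripping loses it, so it is probably worth adding table entries for whatever entities the site actually emits.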