I am writing a little screen-scraping app that consumes some XHTML - it goes without saying that the XHTML is invalid: ampersands aren't escaped as &amp;.
I am using Android's XmlPullParser, and it spews out the following error when it hits the incorrectly encoded value:
org.xmlpull.v1.XmlPullParserException: unterminated entity ref
(position:START_TAG <a href='/Fahrinfo/bin/query.bin/dox?ld=0.1&n=3&i=9c.0323581.1266265347&rt=0&vcra'>
@55:134 in java.io.InputStreamReader@43b1ef70)
How do I get around this? I have thought about the following solutions:
- Wrapping the InputStream in another one that replaces the ampersands with entity refs
- Configuring the parser so it magically accepts the incorrect markup
Which one is likely to be more successful?
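
For the first option, a rough sketch of what I have in mind is below. It assumes the page is small enough to buffer in memory and that its encoding is known up front. The class name AmpersandFixer and its two methods are made up for illustration; only the java.io and java.util.regex calls are real APIs. Rather than wrapping the stream byte by byte, it reads the whole document, escapes every ampersand that does not already start an entity or character reference, and hands the result back as a Reader.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.Charset;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Android or XmlPull.
final class AmpersandFixer {

    // Matches '&' unless it is followed by something shaped like a reference:
    // a name (&amp;), a decimal code (&#38;) or a hex code (&#x26;), then ';'.
    private static final Pattern BARE_AMP = Pattern.compile(
            "&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)");

    // Replaces every bare ampersand with the escaped form.
    static String escapeBareAmpersands(String markup) {
        return BARE_AMP.matcher(markup).replaceAll("&amp;");
    }

    // Buffers the whole stream, repairs it, and returns a Reader over the result.
    static Reader sanitize(InputStream in, String charsetName) throws IOException {
        Reader raw = new InputStreamReader(in, Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        while ((n = raw.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return new StringReader(escapeBareAmpersands(sb.toString()));
    }
}
```

The result would then be passed to the parser with something like parser.setInput(AmpersandFixer.sanitize(stream, "UTF-8")), where stream and the encoding are placeholders. One limitation I can already see: anything that merely looks like a reference (for example &nbsp;, which plain XML does not define) is left untouched and may still trip the parser.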