I am writing a little screen-scraping app that consumes some XHTML. Unsurprisingly, the XHTML is invalid: ampersands aren't escaped as &amp;.

I am using Android's XmlPullParser, and it throws the following error when it reaches the unescaped value:

org.xmlpull.v1.XmlPullParserException: unterminated entity ref 
(position:START_TAG <a href='/Fahrinfo/bin/query.bin/dox?ld=0.1&n=3&i=9c.0323581.1266265347&rt=0&vcra'>
@55:134 in java.io.InputStreamReader@43b1ef70) 

How do I get around this? I have thought about the following solutions:

  1. Wrapping the InputStream in another one that replaces the ampersands with entity refs
  2. Configuring the Parser so it magically accepts the incorrect markup

Which one is likely to be more successful?

+1  A: 

I would go with your first option. Replacing the ampersands fixes the input itself, whereas the second option is more of a hack: it only works by making the parser accept incorrect markup.
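
A minimal sketch of the first option, assuming the page is small enough to read into a String before parsing (a streaming Reader wrapper would need more bookkeeping). The class name AmpersandFixer and the method escapeBareAmpersands are made up for illustration; they are not part of XmlPullParser or Android. The regex escapes only an & that does not already start an entity or character reference, so correctly written &amp; is left alone:

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any parser API.
final class AmpersandFixer {
    // An '&' that is NOT followed by a named reference (&name;),
    // a decimal reference (&#38;) or a hex reference (&#x26;).
    private static final Pattern BARE_AMPERSAND = Pattern.compile(
            "&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)");

    // Returns the markup with every bare ampersand rewritten as &amp;.
    static String escapeBareAmpersands(String markup) {
        return BARE_AMPERSAND.matcher(markup).replaceAll("&amp;");
    }
}
```

The cleaned text could then be handed to the parser with something like parser.setInput(new StringReader(fixed)). Note that this only repairs bare ampersands; named entities that XML does not predefine are passed through unchanged and may still need separate handling.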

Anthony Forloney
A: 

Can't kXML parse HTML?

fanteng