I am using the SIMPLE RSS reading example found at http://theappleblog.com/2008/08/04/tutorial-build-a-simple-rss-reader-for-iphone/

It uses parseXML to load the RSS feeds.

Here is the problem I am having: the app fails to load the RSS feed below and reports an error that it cannot connect. However, the same feed works fine in my Mac RSS reader, so I know the link is good.

Any ideas on why it cannot load this particular feed but it can load others fine?

http://www.okstate.com/rss.dbml?db_oem_id=200&media=news

Thanks.

+1  A: 

I've been experiencing a similar issue. I haven't yet pinned down the answer, but I've noticed that RSS 2 tends to parse more successfully than the rest.

Clifton Burt
+2  A: 

In my experience, HTML markup causes an RSS parser to fail in most cases. I've run into this problem with a lot of the parser classes I've come across (in search of the ultimate one, which I didn't find).

My guess is that entities such as

's

are responsible for your crash. That was usually the case with my crashes. This also led to my decision to create a 'proxy server' to pre-parse the XML before sending it to the iPhone (which gives me the advantage of caching, scaling, and some other stuff). I do believe there are solid solutions out there, but it is always difficult to write a parser for so many RSS implementations.

P.S.: W3C validates this feed as 'valid', so it really is 'our' problem.
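To make the pre-parsing idea concrete, here is a minimal sketch of one thing such a proxy might do: rewrite named HTML entities as numeric character references, which a strict XML parser accepts without a DTD. This is Python rather than iPhone code, the function names are made up for illustration, and it is not the actual proxy described above. It also assumes the feed has no CDATA sections, since the plain regex would rewrite entity-like text inside them too.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities every XML parser already understands; leave them alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def replace_html_entities(text):
    """Turn named HTML entities (e.g. &nbsp;) into numeric references."""
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # unknown or already valid: keep as-is
        return "&#%d;" % name2codepoint[name]
    return ENTITY_RE.sub(repl, text)


def parse_feed(text):
    """Hypothetical helper: clean the entities, then parse strictly."""
    return ET.fromstring(replace_html_entities(text))
```

For example, `parse_feed("<t>it&rsquo;s</t>")` parses, whereas passing the same string straight to `ET.fromstring` raises a `ParseError` for the undefined entity.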

MiRAGe
+2  A: 

Your problem could lie with:

  1. Unicode characters (e.g. I see some o's with two dots above them in the feed)
  2. Your code not handling CDATA sections correctly

To find out which is the case, save the feed file to your local disk and load it via your code to confirm that the error still happens.

Then do a binary search on the file to find out whether a particular RSS entry is causing the problem (e.g. remove all but the first RSS entry and see if the problem still occurs. If it does, the problem is in that entry; if it doesn't, put half the RSS entries back in the file and repeat).
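The bisection described above can also be scripted. The following is only a rough sketch in Python, run on a desktop against the saved feed file. `fails` is a hypothetical callback standing in for "feed this text to the code under test and report whether it errors", and the helper names are invented here. It assumes the failure is caused by a single `<item>` and that items can be found with a simple regex.

```python
import re

ITEM_RE = re.compile(r"<item\b.*?</item>", re.DOTALL)


def split_feed(feed):
    """Return (header, items, footer) around the <item> elements."""
    matches = list(ITEM_RE.finditer(feed))
    if not matches:
        return feed, [], ""
    header = feed[:matches[0].start()]
    footer = feed[matches[-1].end():]
    return header, [m.group(0) for m in matches], footer


def find_bad_item(feed, fails):
    """Return the index of the item that makes fails() true, or None."""
    header, items, footer = split_feed(feed)

    def build(subset):
        return header + "\n".join(subset) + footer

    if not fails(build(items)):
        return None  # the whole feed is accepted
    if fails(build([])):
        raise ValueError("feed fails even with every item removed")
    lo, hi = 0, len(items)  # assumed invariant: the bad item is in items[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(build(items[lo:mid])):
            hi = mid  # the first half alone still fails
        else:
            lo = mid  # so the culprit must be in the second half
    return lo
```

If more than one entry is broken, this only finds one of them, so it may need to be re-run after fixing or removing the entry it reports.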

Michael Pryor
+1  A: 

There are many RSS feeds that contain invalid XML, usually because they were hacked together on the server side using HTML templates by somebody who didn't understand XML. I've seen improperly escaped (or non-escaped) HTML post contents, missing close tags, badly nested tags, and so on.
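A quick way to check whether a given feed falls into this category is to run the saved file through any strict XML parser and see where it gives up. Here is a small sketch using Python's standard library on the desktop (not the iPhone API). The function name is made up for this example.

```python
import xml.etree.ElementTree as ET


def first_xml_error(data):
    """Return (line, column, message) for the first well-formedness
    error in the feed bytes, or None if a strict parser accepts them."""
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None
```

Calling it as `first_xml_error(open("feed.xml", "rb").read())`, where `feed.xml` is whatever name the feed was saved under, returns the position of the first problem. Unescaped ampersands, badly nested tags, and missing close tags all show up this way.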

If you want to be able to parse arbitrary feeds, you have to clean up bad XML. The usual way is to use the "htmlTidy" library, which is included in the OS. This can clean up XML as well as HTML.

This example you're following uses NSXMLParser -- I have no idea why. It's a lower-level API and it doesn't support tidying. I would suggest using NSXMLDocument instead. There's a flag in that API that will tell it to use tidy when parsing the XML. This API also returns you the XML as a handy tree of elements that's easy to work with.

Jens Alfke
NSXMLDocument is not available on iPhone, which is why I assume they are using NSXMLParser in this example.
Hunter
+1  A: 

I've just released an open source RSS/Atom Parser for iPhone and hopefully it might be of some use.

I'd love to hear your thoughts on it too!

Michael Waterfall
I just had a look at MWFeedParser... It seems that it would be cleaner if you removed all the if statements that choose between RSS1/RSS2/Atom and moved that logic into separate classes. Once the feed type has been determined, you would be able to delegate everything to the particular implementation instead of needing to check the feed type in an if statement over and over.
Cal
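For illustration, the structure suggested in the comment above might look roughly like this. It is a generic Python sketch with invented class names, not MWFeedParser's actual code. It only covers RSS 2 and Atom and only extracts titles.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


class Rss2Handler:
    """Knows where RSS 2 keeps its entries and titles."""
    def titles(self, root):
        return [item.findtext("title", default="")
                for item in root.iter("item")]


class AtomHandler:
    """Knows where Atom keeps its entries and titles."""
    def titles(self, root):
        return [entry.findtext(ATOM_NS + "title", default="")
                for entry in root.iter(ATOM_NS + "entry")]


def handler_for(root):
    """Decide the feed type once, based on the root element."""
    if root.tag == "rss":
        return Rss2Handler()
    if root.tag == ATOM_NS + "feed":
        return AtomHandler()
    raise ValueError("unsupported feed type: %s" % root.tag)


def item_titles(xml_text):
    root = ET.fromstring(xml_text)
    # After this point nothing checks the feed type again.
    return handler_for(root).titles(root)
```

The feed type is inspected in one place, and each handler holds only the rules for its own format.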
I do plan to do some more optimisation at a later point. However, I believe a function call will take more time than a simple in-memory comparison of an integer (the feed type), so I think the current approach is faster. I also believe the code is quite clean and readable as it stands. Thanks for the comment though!
Michael Waterfall