Hi all,

For those of you building iPhone apps on REST, SOAP, or XML-RPC, or simply pulling down a dynamic XML feed: what exactly does it mean to you when a user asks to 'refresh' the feed?

The straightforward way is to populate some collection, say an NSMutableArray, with whatever you bring down from the feed. If the UI has a refresh widget, I typically do something like:

[myMutableArray removeAllObjects];
// follow steps to repopulate myMutableArray

This seems like the least efficient way to refresh an XML feed. For instance, many people building Twitter clients append changes to their existing feed rather than downloading the entire feed again.

What kind of algorithms are you using to "refresh" your models when speaking to a server-side data source?

Thanks all.

A: 

One approach is to use the built-in NSXML pull parser on a background thread, compare entries from the stream to what you have in memory, and update only what has changed.
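
A minimal sketch of that idea, written in Python for brevity rather than Objective-C. `ET.iterparse` stands in for the streaming parser, and the `<item>`/`<guid>`/`<title>` layout, the `entries` dictionary, and the function name are all assumptions made for illustration:

```python
import io
import xml.etree.ElementTree as ET


def apply_feed_changes(xml_bytes, entries):
    """Stream <item> elements and update `entries` (guid -> title) in place.

    Returns the guids that were added or changed, in feed order.
    """
    touched = []
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag != "item":
            continue  # only act once a whole <item> has been read
        guid = elem.findtext("guid")
        title = elem.findtext("title", default="")
        # Touch the in-memory model only when the entry is new or different.
        if guid is not None and entries.get(guid) != title:
            entries[guid] = title
            touched.append(guid)
        elem.clear()  # drop the subtree we no longer need
    return touched
```

The returned list tells the UI which rows to reload, so unchanged entries are never rebuilt.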

Kendall Helmstetter Gelner
You don't even need a background thread, provided you use the async mode of NSURLConnection to fetch the feed first. The XML parsing is quick enough that you can do it on the main thread.
Jens Alfke
+4  A: 

You should look into using the PubSub framework if you can require OS X 10.5. It's explicitly designed to fetch and update RSS/Atom feeds.

(Disclaimer: I wrote a lot of that framework while I was at Apple :)

The answer to your question is that feeds are inherently inefficient. You can minimize this with the following steps:

  1. Use HTTP "conditional GETs", so if the feed hasn't changed on the server you'll just get back a tiny 304 response. This saves time for the server and for you. (Some feed servers, like Slashdot, will ban you if you don't use conditional GETs!)

  2. Check the "Last-Modified:" date on the response. Yes, even if you use a conditional GET. Some servers don't handle them properly. If the date is unchanged, ignore the feed.

  3. Compare the raw data of the response against the last raw response you got. If identical, ignore the feed. (Some servers don't support conditional GETs or send Last-Modified dates...)

  4. Now you have to parse the XML.

  5. Check the top-level mod date on the feed itself (this varies between Atom and the different flavors of RSS.) Again, if it's the same as it was last time, ignore the feed.

  6. If you got here, the feed's been updated, most likely. The easiest thing to do is to throw away all of your old saved entries and replace them with the new ones. But this means you can't keep 'historic' entries that have fallen off the end of the feed. If you want to do that, you have to go through each entry in the just-parsed feed, match it with the corresponding entry in your persistent storage, and update the persistent one based on the new one. If you couldn't find a persistent one, add it as a new entry. (Matching entries can be difficult in lame RSS feeds that don't include unique GUIDs for each entry. You have to try comparing permalinks and titles. Yuck.)
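
Steps 1 to 3 above can be sketched roughly as follows, in Python for brevity. This is not how PubSub does it; the `cached` dictionary, its keys, and both function names are hypothetical. The `ETag`/`If-None-Match` pair is an addition of mine (the other standard conditional-GET validator), and a SHA-256 digest is swapped in for storing the whole previous response. Header names are assumed to arrive in their canonical spelling:

```python
import hashlib


def conditional_headers(cached):
    """Build request headers for a conditional GET from the last response."""
    headers = {}
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    return headers


def needs_parsing(status, headers, body, cached):
    """Return True only if steps 1-3 all fail to show the feed is unchanged."""
    # Step 1: the server honoured the conditional GET.
    if status == 304:
        return False
    # Step 2: same Last-Modified date as last time, even on a 200 response.
    last_modified = headers.get("Last-Modified")
    if last_modified and last_modified == cached.get("last_modified"):
        return False
    # Step 3: byte-for-byte identical body (compared by digest).
    digest = hashlib.sha256(body).hexdigest()
    if digest == cached.get("body_digest"):
        return False
    # Something changed: remember the validators for the next request.
    cached["last_modified"] = last_modified
    cached["etag"] = headers.get("ETag")
    cached["body_digest"] = digest
    return True
```

Only when `needs_parsing` returns True is it worth moving on to the XML parsing in step 4.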

This whole thing really is a big mess. It took a lot of work to make everything behave correctly and work with all the broken feeds and servers out there; take advantage of my pain and use PubSub, if you can :)
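
For anyone who can't use PubSub, the merge in step 6 might look something like this. It is a sketch in Python, not PubSub's actual logic: entries are assumed to be plain dictionaries with optional `guid`, `link`, and `title` keys, and both function names are made up for the example.

```python
def entry_key(entry):
    """Prefer the feed's GUID; fall back to permalink plus title."""
    if entry.get("guid"):
        return ("guid", entry["guid"])
    return ("link+title", entry.get("link", ""), entry.get("title", ""))


def merge_entries(stored, parsed):
    """Merge freshly parsed entries into the stored ones.

    Matching stored entries are updated in place, unmatched parsed entries
    are appended, and stored entries that have fallen off the feed are kept.
    """
    index = {entry_key(e): e for e in stored}
    merged = list(stored)
    for new in parsed:
        key = entry_key(new)
        old = index.get(key)
        if old is None:
            merged.append(new)
            index[key] = new
        else:
            old.update(new)
    return merged
```

The fallback key is the weak point: if a feed without GUIDs edits an entry's title, that entry will show up as a new one.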

Jens Alfke
A: 

I've just released an open source RSS/Atom Parser for iPhone and hopefully it might be of some use.

I'd love to hear your thoughts on it too!

Michael Waterfall