I've been experimenting with writing my own RSS reader. I can handle the "parse XML" bit. The thing I'm getting stuck on is "How do I fetch older posts?"

Most RSS feeds only list the 10-25 most recent items in their XML file. How do I get ALL the items in a feed, and not just the most recent ones?

The only solution I could find was using the "unofficial" Google Reader API, which would be something like

http://www.google.com/reader/atom/feed/http://fskrealityguide.blogspot.com/feeds/posts/default?n=1000

I don't want to make my application dependent on Google Reader.

Is there any better way? I noticed that on Blogger, I can do "?start-index=1&max-results=1000", and on WordPress I can do "?paged=5". Is there any general way to fetch an RSS feed so that it gives me everything, and not just the most recent items?

+1  A: 

In my experience, an RSS feed is built from the last X items, where X is set by the publisher. Some feeds may include the full list, but to save bandwidth most sites limit the feed to the last few items.

The likely reason Google Reader has the old items is that it stores them on its side so users can read them later.

Rob Haupt
That was what I figured. Google has an older archive. I'll just import from the Google Reader API, and then make it "current" for newer items. That is annoying. If I put an RSS reader on my site and cache old items, I'll use a *TON* of disk space.
+5  A: 

RSS/Atom feeds do not provide a way to retrieve historic information. It is up to the publisher of the feed to provide it if they want to, as in the Blogger and WordPress examples you gave above.
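
For the publisher-specific case, here is a minimal Python sketch of how a reader might build paged feed URLs from the two query-string conventions mentioned in the question ("start-index"/"max-results" on Blogger, "paged" on WordPress). The function names and the default page size are my own assumptions, and whether a given site honors these parameters is up to that site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_query(url, params):
    """Return url with params merged into its query string."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query.update({key: str(value) for key, value in params.items()})
    return urlunsplit(parts._replace(query=urlencode(query)))


def blogger_page_url(feed_url, page, page_size=25):
    """Hypothetical helper: Blogger-style paging, where start-index is 1-based."""
    start = (page - 1) * page_size + 1
    return with_query(feed_url, {"start-index": start, "max-results": page_size})


def wordpress_page_url(feed_url, page):
    """Hypothetical helper: WordPress-style paging via the 'paged' parameter."""
    return with_query(feed_url, {"paged": page})
```

A reader would request page 1, 2, 3, ... and stop when a page comes back empty or repeats items it has already seen.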

The only reason Google Reader has more information is that it saved the items when it first fetched them.

There has been some discussion of something like this as an extension to the Atom protocol, but I don't know if it is actually implemented anywhere.
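
One such extension is RFC 5005 (Feed Paging and Archiving), which lets a feed point to further pages with link relations such as rel="next". As a rough sketch, assuming the publisher actually emits those links, a reader could follow them like this. The names `next_page_url` and `iter_pages` and the `fetch` callable are hypothetical; `fetch` stands in for whatever HTTP client you use.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def next_page_url(feed_xml):
    """Return the href of the feed-level rel="next" link, or None if absent."""
    root = ET.fromstring(feed_xml)
    # Atom feeds put the link directly under <feed>; RSS 2.0 feeds that
    # borrow atom:link put it under <channel>.
    links = root.findall(ATOM_NS + "link") + root.findall("channel/" + ATOM_NS + "link")
    for link in links:
        if link.get("rel") == "next":
            return link.get("href")
    return None


def iter_pages(fetch, url, max_pages=50):
    """Yield each page's XML, following rel="next" until it runs out.

    fetch is any callable that takes a URL and returns the feed XML.
    Already-visited URLs and max_pages guard against loops.
    """
    seen = set()
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        xml_text = fetch(url)
        yield xml_text
        url = next_page_url(xml_text)
```

If the feed has no rel="next" link, `iter_pages` yields only the first page, which is the same result you get today from most feeds.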

David Dean