views: 266

answers: 2

I'm using the feedparser library in Python to retrieve news from a local newspaper (my intent is to do natural language processing over this corpus), and I would like to be able to retrieve many past entries from the RSS feed.

I'm not very acquainted with the technical details of RSS, but I think this should be possible (I can see that, e.g., Google Reader and Feedly can do this "on demand" as I move the scrollbar).

When I do the following:

import feedparser

url = 'http://feeds.folha.uol.com.br/folha/emcimadahora/rss091.xml'
feed = feedparser.parse(url)
for post in feed.entries:
    title = post.title

I get only a dozen entries or so, but I was expecting hundreds, maybe all the entries from the last month, if possible. Is it possible to do this with feedparser alone?

I intend to get only the link to each news item from the RSS feed and then parse the full page with BeautifulSoup to obtain the text I want. An alternative solution would be a crawler that follows all the local links on the page to gather many news items, but I want to avoid that for now.

--

One solution that came up is to use Google Reader's RSS cache:

http://www.google.com/reader/atom/feed/http%3A//feeds.folha.uol.com.br/folha/emcimadahora/rss091.xml?n=1000

But to access this I must be logged in to Google Reader. Does anyone know how I can do that from Python? (I really don't know a thing about the web; I usually only mess with numerical calculus.)

+4  A: 

You're only getting a dozen entries or so because that's all the feed contains. If you want historical data, you will have to find a feed or database of that data.

Check out this ReadWriteWeb article for some resources on finding open data on the web.

Note that feedparser has nothing to do with this, despite what your title suggests: feedparser parses whatever you give it. It can't find historical data unless you find that data and pass it in; it is simply a parser. Hope that clears things up! :)

Bartek
Thanks again, Bartek. I think I understand it better now. So an RSS feed is simply an XML file stored on the server? I had the wrong picture of it... I thought it was a kind of "protocol" for getting a text feed. Thanks again.
Rafael S. Calsaverini
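Exactly that, and it can be seen with nothing but the standard library. The snippet below (an illustrative inline feed, not the real one) parses a feed as the ordinary XML document it is:

```python
import xml.etree.ElementTree as ET

# An RSS feed is just an XML document served as a file; the number of
# <item> elements is fixed by the publisher, which is why a parser can
# never return more entries than the document contains.
rss = """<?xml version="1.0"?>
<rss version="2.0"><channel><title>demo</title>
<item><title>Old story</title></item>
<item><title>New story</title></item>
</channel></rss>"""

root = ET.fromstring(rss)
titles = [item.findtext("title") for item in root.iter("item")]
print(titles)  # ['Old story', 'New story']
```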
+3  A: 

To expand on Bartek's answer: You could also start storing all of the entries in the feed that you've already seen, and build up your own historical archive of the feed's content. This would delay your ability to start using it as a corpus (you'd have to do this for a month to build up a month's worth of entries), but you wouldn't be dependent on anyone else for the data.
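A minimal sketch of that archiving idea, assuming each entry has been reduced to a dict (e.g. built from feedparser's `post.link` and `post.title`) and that the archive lives in a JSON file keyed by link; the filename and helper name are illustrative:

```python
import json
import os

ARCHIVE = "feed_archive.json"  # illustrative filename
if os.path.exists(ARCHIVE):
    os.remove(ARCHIVE)  # start fresh for this demo

def update_archive(entries, path=ARCHIVE):
    """Merge newly seen feed entries into the local archive, keyed by
    link, so repeated polls only add items not stored yet."""
    archive = {}
    if os.path.exists(path):
        with open(path) as f:
            archive = json.load(f)
    added = 0
    for e in entries:
        if e["link"] not in archive:
            archive[e["link"]] = e
            added += 1
    with open(path, "w") as f:
        json.dump(archive, f)
    return added

# Two polls of the feed: the second overlaps the first, so only the
# genuinely new item gets archived.
poll1 = [{"link": "http://example.com/1", "title": "a"},
         {"link": "http://example.com/2", "title": "b"}]
poll2 = [{"link": "http://example.com/2", "title": "b"},
         {"link": "http://example.com/3", "title": "c"}]
print(update_archive(poll1))  # 2
print(update_archive(poll2))  # 1
```

Run on a schedule (e.g. cron), this accumulates the month of entries the feed itself never exposes at once.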

I may be mistaken, but I'm pretty sure that's how Google Reader can go back in time: They have each feed's past entries stored somewhere.

Will McCutchen
Hummm... I guess the way to go then is to get the feed from Google Reader itself, maybe?
Rafael S. Calsaverini
It seems that Google Reader itself can be used to retrieve a historical list of items! :D http://googlesystem.blogspot.com/2007/06/reconstruct-feeds-history-using-google.html
Rafael S. Calsaverini
I just discovered this, too. Here are the last 1000 items in the feed you're interested in: http://www.google.com/reader/atom/feed/http://feeds.folha.uol.com.br/folha/emcimadahora/rss091.xml?n=1000
Will McCutchen
The problem is that I must be logged in to Google Reader to access this... I tried to use feedparser and only managed to get an empty entry list. Does anyone know how I could log in to Google Reader from Python to download this entry list? I really know nothing about the web.
Rafael S. Calsaverini
Hummm... this answer from Stack Overflow helped: http://stackoverflow.com/questions/52880/google-reader-api-unread-count
Rafael S. Calsaverini
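For the record, the flow from that linked answer looks roughly like this (written in modern Python 3 syntax). It is historical only: ClientLogin and Google Reader itself have since been shut down, so this cannot be run against a live service, and the endpoint and parameter names should be treated as assumptions from that answer.

```python
import urllib.parse
import urllib.request

# Historical ClientLogin endpoint; long defunct.
LOGIN_URL = "https://www.google.com/accounts/ClientLogin"

def login_request(email, password):
    # POST body asking for a token scoped to the "reader" service.
    body = urllib.parse.urlencode({
        "Email": email,
        "Passwd": password,
        "service": "reader",
    }).encode()
    return urllib.request.Request(LOGIN_URL, data=body)

def authed_feed_request(feed_url, auth_token, n=1000):
    # Reader's Atom endpoint took the percent-encoded feed URL in the
    # path and the token in an Authorization header.
    url = ("http://www.google.com/reader/atom/feed/"
           + urllib.parse.quote(feed_url, safe="") + "?n=%d" % n)
    req = urllib.request.Request(url)
    req.add_header("Authorization", "GoogleLogin auth=" + auth_token)
    return req

req = authed_feed_request(
    "http://feeds.folha.uol.com.br/folha/emcimadahora/rss091.xml", "TOKEN")
print(req.get_full_url())
```

In the real flow, the token printed in the ClientLogin response body would be extracted and passed as `auth_token`, and the second request would be opened with `urllib.request.urlopen`.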