My company is starting work on a web-based RSS reader that users can sign up for to track feeds, a lot like Google Reader.

My first thought was that once I have a feed URL for a certain blog or website, I'd only have to poll it once in order to grab the content and then insert entries into the database for anyone who subscribes to it.

However, if someone is using a service like FeedBurner to track reader statistics, 100 readers could be subscribed to a particular blog or website through our service, yet they would show up as only 1 reader to the actual author.

Polling once per subscriber would add a huge amount of unnecessary overhead, especially if a thousand users are subscribed to the same feed. Do you have any suggestions, or is the only solution to redundantly poll the same feed hundreds of times in quick succession?

Thanks for your input!

+10  A: 

As far as I know, Google Reader addresses the problem this way: the User-Agent string of its feed fetcher includes the number of subscribers reading the feed through Google Reader.

I don't know if FeedBurner or other tools interpret this, but it is at least theoretically possible to get accurate statistics from the HTTP log files this way.

Edit:

According to the official Google Reader documentation, the User-Agent header of their feed fetcher looks like this:

User-Agent: Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 4 subscribers; feed-id=1794595805790851116)
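For your own reader, a minimal sketch of the same idea might look like the following: poll each feed once and report the subscriber count in the User-Agent header, in a format modeled on the Feedfetcher string above. The bot name `ExampleReader`, the info URL, and all function names here are hypothetical placeholders, and whether FeedBurner or any other statistics tool actually parses this format is an assumption you would need to verify.

```python
import re
import urllib.request
from typing import Optional

# Hypothetical values: replace with your own service's name and info page.
BOT_NAME = "ExampleReader"
INFO_URL = "http://example.com/fetcher.html"


def build_user_agent(subscribers: int, feed_id: str) -> str:
    """Build a User-Agent string modeled on the Feedfetcher-Google format."""
    return f"{BOT_NAME}; (+{INFO_URL}; {subscribers} subscribers; feed-id={feed_id})"


def make_feed_request(feed_url: str, subscribers: int, feed_id: str) -> urllib.request.Request:
    """Prepare one request per feed that reports how many users it represents."""
    user_agent = build_user_agent(subscribers, feed_id)
    return urllib.request.Request(feed_url, headers={"User-Agent": user_agent})


# Matches "4 subscribers" (or "1 subscriber") anywhere in a User-Agent string.
_SUBSCRIBERS = re.compile(r"(\d+)\s+subscribers?\b")


def parse_subscribers(user_agent: str) -> Optional[int]:
    """Extract the reported subscriber count from a log entry's User-Agent, if any."""
    match = _SUBSCRIBERS.search(user_agent)
    return int(match.group(1)) if match else None
```

The request built by `make_feed_request` can be passed to `urllib.request.urlopen` once per polling cycle, no matter how many users subscribe to the feed. `parse_subscribers` shows the other side: how a feed author could recover the count from HTTP log files.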
Christian Berg
Is that documented anywhere?
rmh
Yes, see my edit above.
Christian Berg
+1 for referencing an already implemented solution
cbrulak
Brilliant, thanks! I'll report subscriber counts in the same way in our app then.
rmh
Other web-based feed readers use a similar solution.
Dave Hinton
A: 

Actually, FeedBurner tries to make a smart guess. Here's a nice post on how involved good readership estimation can be.

mdm