I want to check for updates to an RSS feed. Is there any way to do it without downloading the complete XML? (I want to minimize data transfer.)

Thanks

+1  A: 

Depending on the source you're looking at, it should be possible to do a HEAD request and check the Last-Modified date. You would have to keep track of the last time you updated on your end, but if your main concern is total bandwidth usage, this is probably your best bet. You would still have to make a normal request to get the actual file if the new Last-Modified date is later than the one you saved.

How you perform a HEAD request for a URL will differ depending on the language that you're using.

A quick example in .NET

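// requires: using System.Net;
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
// instruct the server to return headers only
request.Method = "HEAD";
// make the connection; the using block disposes the response afterwards
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
    // get the headers
    WebHeaderCollection headers = response.Headers;
    // null if the server did not send a Last-Modified header
    string lastModified = headers["Last-Modified"];
}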
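
For the comparison step described above, here is a minimal sketch in Python, using only the standard library. The function names (`parse_http_date`, `head_last_modified`, `feed_changed`) are made up for illustration, and the sketch assumes you store the time of your last download as a timezone-aware UTC `datetime`. If the server sends no usable Last-Modified header, it falls back to assuming the feed changed.

```python
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_http_date(value: Optional[str]) -> Optional[datetime]:
    """Parse an HTTP date such as 'Wed, 21 Oct 2015 07:28:00 GMT'."""
    if not value:
        return None
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # malformed header: treat it as missing
    # Dates without zone information are assumed to be UTC.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def head_last_modified(url: str, timeout: float = 10.0) -> Optional[datetime]:
    """Send a HEAD request and return the Last-Modified time, if any."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_http_date(response.headers.get("Last-Modified"))


def feed_changed(last_modified: Optional[datetime], last_checked: datetime) -> bool:
    """Return True when the full feed should be downloaded again.

    last_checked must be timezone-aware (for example, in UTC).
    """
    # Without a Last-Modified value we cannot tell, so assume it changed.
    if last_modified is None:
        return True
    return last_modified > last_checked
```

Typical use would be `feed_changed(head_last_modified(feed_url), saved_time)`, where `feed_url` and `saved_time` are your own values, followed by a normal GET only when it returns `True`.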
Gregg
Interesting, Gregg. Can you provide an example in any language you prefer, or a link to a good example?
tekBlues
I'd take a good hard look at the HTTP specs, with special emphasis on the ETag and the time-related HTTP headers.
ndim
+1. Good suggestion. @tekBlues, I see you've asked questions about ASP.NET in the past so I'll assume that's what you're using. Here's an example of a HEAD request in .NET: http://www.eggheadcafe.com/tutorials/aspnet/2c13cafc-be1c-4dd8-9129-f82f59991517/the-lowly-http-head-reque.aspx
Steve Wortham
A: 

If you have control of the feed you're requesting, you can ensure that it includes ETags, which you can compare against the value from your last request. That is your best option. Read this: http://www.kbcafe.com/rss/rssfeedstate.html

Ben
Ben, if I understand you correctly, I have no control over the feed, because I want to be able to check any RSS feed. Interesting link anyway, thanks.
tekBlues
Gotcha. Well, the sort of optimization you're looking for depends on the feed provider sending helpful HTTP headers. Try `curl -I http://www.scripting.com/rss.xml`. The ETag and Last-Modified headers are your friends here. Other sites aren't as helpful - try `curl -I http://stackoverflow.com/feeds/tag/rss`.
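
One common way to put those two headers to work is a conditional GET: send back the saved ETag in `If-None-Match` and the saved Last-Modified value in `If-Modified-Since`, and a cooperating server answers `304 Not Modified` with no body when nothing changed. A rough sketch in Python follows. The helper names `conditional_headers` and `fetch_if_changed` are hypothetical, and it assumes you stored the header values from the previous fetch yourself.

```python
import urllib.error
import urllib.request
from typing import Dict, Optional, Tuple


def conditional_headers(etag: Optional[str],
                        last_modified: Optional[str]) -> Dict[str, str]:
    """Build request headers from the validators saved on the last fetch."""
    headers: Dict[str, str] = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(
    url: str,
    etag: Optional[str] = None,
    last_modified: Optional[str] = None,
    timeout: float = 10.0,
) -> Tuple[Optional[bytes], Optional[str], Optional[str]]:
    """Return (body, etag, last_modified); body is None on a 304 response."""
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # Feed changed (or the server ignores validators): keep new values.
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as error:
        # urllib reports 304 Not Modified as an HTTPError.
        if error.code == 304:
            return None, etag, last_modified
        raise
```

If the server sends neither header, this simply downloads the whole feed every time, which matches the point that the saving depends on the feed provider.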
Ben