I would like to know the best way to fetch RSS feeds in near real time without downloading the entire feed when it hasn't changed. I don't really mind which language is used; I'm just looking for the best way to do that.

+2  A: 

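You can use HTTP conditional requests: send the ETag value from the previous response in an If-None-Match header, and the Last-Modified value in an If-Modified-Since header. If the feed hasn't changed, the server replies with 304 Not Modified and no body.

Here is some sample Python code: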

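import urllib2

class NotModifiedHandler(urllib2.BaseHandler):
    # Return a 304 response as a normal handle instead of raising HTTPError.
    def http_error_304(self, req, fp, code, msg, headers):
        response = urllib2.addinfourl(fp, headers, req.get_full_url())
        response.code = code
        return response

etag = None           # ETag header value saved from the previous response, if any
last_modified = None  # Last-Modified header value saved from the previous response, if any

req = urllib2.Request(url)  # url is the address of the feed
if etag:
    req.add_header("If-None-Match", etag)

if last_modified:
    req.add_header("If-Modified-Since", last_modified)

opener = urllib2.build_opener(NotModifiedHandler())
url_handle = opener.open(req)
headers = url_handle.info()

if hasattr(url_handle, 'code') and url_handle.code == 304:
    pass  # no change happened; keep using the cached copy
else:
    # RSS feed has changed: read it and remember the new validators
    data = url_handle.read()
    etag = headers.getheader("ETag")
    last_modified = headers.getheader("Last-Modified")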

The same approach works in any language: add the conditional request headers and check the status code of the response.
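For anyone on Python 3, where urllib2 no longer exists, the same idea might look roughly like the sketch below. The function names conditional_headers and fetch_if_changed are made up for illustration, and it assumes the server actually sends ETag or Last-Modified headers; urllib.request raises HTTPError for a 304, so the "not modified" case is handled in the except branch.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build the request headers for a conditional GET (hypothetical helper)."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None when the server answers 304."""
    req = urllib.request.Request(url, headers=conditional_headers(etag, last_modified))
    try:
        with urllib.request.urlopen(req) as resp:
            # Feed changed (or first fetch): return the body and the new validators.
            return resp.read(), resp.headers.get("ETag"), resp.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not modified: keep the validators we already have.
            return None, etag, last_modified
        raise
```

You would store the returned etag and last_modified between polls and pass them back in on the next call; a body of None means the cached copy is still current.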

UPDATE: Check out this blog entry: HTTP Conditional GET for RSS Hackers

notnoop