I'm trying to build an RSS reader, but I have no idea how to identify unread items. For example, what should I do when I refresh my list to avoid duplicates?

+1  A: 

Although it is optional, most RSS feeds provide a 'guid' element for each item, which is a string that uniquely identifies it.

If the feed you're parsing provides such an element, you can keep track of the items you have already processed by storing each GUID somewhere; then, whenever you fetch the feed, check each item against the stored GUIDs. Remember to save the publication date as well, since an item may have been updated in the meantime.
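
As a rough illustration of the GUID check, here is a minimal Python sketch using only the standard library. The function name `find_new_or_updated` and the in-memory `seen` dictionary are my own assumptions for the example; a real reader would probably persist this mapping in a file or database.

```python
import xml.etree.ElementTree as ET


def find_new_or_updated(rss_xml, seen):
    """Return GUIDs of items that are new or whose pubDate changed.

    `seen` is a hypothetical in-memory store mapping GUID -> pubDate string
    from previous fetches; it is updated in place.
    """
    fresh = []
    for item in ET.fromstring(rss_xml).iter("item"):
        guid = item.findtext("guid")
        if guid is None:
            continue  # no GUID on this item: it needs some other identity
        pub_date = item.findtext("pubDate", default="")
        # Unseen GUIDs give None here, which never equals a string.
        if seen.get(guid) != pub_date:
            seen[guid] = pub_date
            fresh.append(guid)
    return fresh
```

Calling it twice with the same feed text and the same `seen` dictionary should return an empty list the second time, and an item whose pubDate changes between fetches should show up again as updated.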

Unfortunately, the GUID element is not mandatory, so if the feed doesn't provide it, you may have to resort to a combination of the title and description to identify items. My suggestion would be to hash the description using SHA-1 or MD5 and then check each new item's hash against the saved hashes.
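
The fallback could look something like the sketch below. The helper name `item_key` and the "sha1:" prefix are assumptions made for this example, and hashing the title together with the description (rather than the description alone) is just one possible choice.

```python
import hashlib


def item_key(guid, title, description):
    """Return a stable identity for a feed item (hypothetical helper).

    Uses the GUID when the feed provides one; otherwise falls back to a
    SHA-1 hash of the title and description.
    """
    if guid:
        return guid
    payload = (title or "") + "\n" + (description or "")
    # The prefix keeps hash-based keys from colliding with real GUIDs.
    return "sha1:" + hashlib.sha1(payload.encode("utf-8")).hexdigest()
```

Storing these keys in a set lets you skip any item whose key is already present. Keep in mind that with the hash fallback, an edited description produces a new key, so the item will look like a new one rather than an updated one.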

André Paramés