I'm writing my own RSS app in C#, and I'm wondering what would be a good way to recognize previously fetched RSS items, so that only the new ones get processed. Currently, my app just refetches all the RSS items and sends them out by email one by one (an individual email for each item). Should I take an MD5 hash of the title and compare it to all the other MD5 hashes (stored in a file)? I'm short on good ideas.

Edit: The solution can't rely on the date field of the RSS item, as a lot of RSS feeds don't have a pubDate field.

+1  A: 

I haven't explored RSS that much (and I know that different sites do various strange things), but I'd try something like this: use the "guid" if it's there, and if not, use the "link". You could hash it, either to save space or to add some privacy, but I'd probably start out without hashing to make debugging easier, and add hashing later, once I saw that the algorithm was working and if there were a need for it.
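A minimal sketch of the "guid, else link" rule in C#, assuming an RSS 2.0 feed whose `item` elements carry no XML namespace (Atom feeds would need different element names). The class name `RssItemKeys` and its methods are hypothetical names chosen for illustration; the parsing uses `System.Xml.Linq` from the base class library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: picks one identifying key per RSS item.
public static class RssItemKeys
{
    // Returns the <guid> text if present and non-empty, otherwise the <link> text.
    // Returns null when the item has neither, so the caller can decide what to do.
    public static string GetKey(XElement item)
    {
        string guid = (string)item.Element("guid");
        if (!string.IsNullOrWhiteSpace(guid)) return guid.Trim();

        string link = (string)item.Element("link");
        if (!string.IsNullOrWhiteSpace(link)) return link.Trim();

        return null;
    }

    // Extracts one key per <item>, skipping items that have neither guid nor link.
    public static List<string> GetKeys(string rssXml)
    {
        return XDocument.Parse(rssXml)
            .Descendants("item")
            .Select(item => GetKey(item))
            .Where(key => key != null)
            .ToList();
    }
}
```

The keys are kept as plain strings here, in line with the suggestion to leave hashing out until the basic comparison works.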

P.S. The "link" alone may be good enough for most sites (and is probably easier to implement): if a link were repeated, it would effectively mean different articles pointing to the same link, which I suspect is not done too much. But again, I don't speak here from experience.
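One way the "already seen" check could be kept between runs is a plain text file with one key (for example, the item's link) per line. The sketch below assumes that layout and that keys never contain line breaks; `SeenItemStore` and `TryMarkSeen` are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical store: remembers item keys in memory and in a text file.
public sealed class SeenItemStore
{
    private readonly string _path;
    private readonly HashSet<string> _seen;

    public SeenItemStore(string path)
    {
        _path = path;
        // Load keys recorded by earlier runs, if the file exists.
        _seen = File.Exists(path)
            ? new HashSet<string>(File.ReadAllLines(path), StringComparer.Ordinal)
            : new HashSet<string>(StringComparer.Ordinal);
    }

    // Returns true the first time a key is seen (and records it); false afterwards.
    public bool TryMarkSeen(string key)
    {
        if (!_seen.Add(key)) return false;
        File.AppendAllLines(_path, new[] { key });
        return true;
    }
}
```

In the mailing loop, an item would be sent only when `TryMarkSeen(key)` returns true. The keys are stored unhashed so the file stays readable while debugging.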

Andy Jacobs