Hi, I would like to monitor many web pages/RSS feeds at the same time and poll them at regular intervals (they may all have different update frequencies). I am thinking of creating a thread for each source I want to mirror, which would loop infinitely and, after dealing with the fetched data, sleep until the next update.
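The thread-per-source idea described above can be sketched roughly like this (a minimal illustration, not a full implementation; the URLs and intervals are placeholders):

```python
import threading
import urllib.request

# Hypothetical source list: (url, poll interval in seconds).
SOURCES = [
    ("https://example.com/feed1.xml", 300),   # every 5 minutes
    ("https://example.com/feed2.xml", 900),   # every 15 minutes
]

def poll_source(url, interval, stop_event):
    """Loop forever: fetch the source, deal with the data, sleep until next update."""
    while not stop_event.is_set():
        try:
            with urllib.request.urlopen(url, timeout=30) as resp:
                data = resp.read()
                # ... deal with the fetched data here ...
        except OSError:
            pass  # network error: skip this round, retry on the next cycle
        stop_event.wait(interval)  # sleep, but wake early on shutdown

def start_all():
    """Start one daemon thread per source; return the shared stop event."""
    stop_event = threading.Event()
    threads = [
        threading.Thread(target=poll_source, args=(url, interval, stop_event),
                         daemon=True)
        for url, interval in SOURCES
    ]
    for t in threads:
        t.start()
    return stop_event, threads
```

Note the shared stop event: a plain `time.sleep(interval)` would make shutdown take up to one full interval, while `stop_event.wait(interval)` returns as soon as the event is set.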

Does anyone have a better idea, or an example of how to do it?

Thanks, Dave

A: 

Use a timer that fires every 1 (or 5) minutes. In the timer callback, loop through the URLs you need to check and verify whether each one is due to be checked (as you put in the comment, they will have different sync intervals). You can prepare a suitable structure to hold the URLs, their timeouts, and the last time each one was synced.

If a URL is due to sync (its interval has elapsed), start an async HttpWebRequest to fetch it. That way you offload all the receiving work to a threadpool thread, so it does not tie up your main timer callback thread.

Be careful: if you do heavy processing on the responses, you might want to start a regular thread in the HttpWebRequest callback to do the additional work, or implement some kind of queue, so that you free the threadpool thread as soon as possible.
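The timer-callback scheme above can be sketched in a language-agnostic way (the feed table and intervals here are placeholders; in .NET the worker-thread fetch would be an async HttpWebRequest on a threadpool thread):

```python
import threading
import time
import urllib.request

# Hypothetical feed table: URL -> poll interval in seconds.
FEEDS = {
    "https://example.com/a.rss": 300,
    "https://example.com/b.rss": 600,
}
last_checked = {}

def due_urls(feeds, last, now):
    """Return the URLs whose interval has elapsed since their last check."""
    return [url for url, interval in feeds.items()
            if now - last.get(url, 0.0) >= interval]

def fetch_async(url):
    """Offload the fetch to a worker thread so the timer callback returns fast."""
    def work():
        try:
            with urllib.request.urlopen(url, timeout=30) as resp:
                resp.read()  # hand heavy processing off to a queue, as noted above
        except OSError:
            pass  # skip this round on network errors
    threading.Thread(target=work, daemon=True).start()

def timer_tick():
    """Runs once a minute: start fetches for everything due, then re-arm."""
    now = time.monotonic()
    for url in due_urls(FEEDS, last_checked, now):
        last_checked[url] = now
        fetch_async(url)
    t = threading.Timer(60, timer_tick)
    t.daemon = True  # don't keep the process alive just for the timer
    t.start()
```

Keeping the "is it due?" check as a separate function (`due_urls`) makes the scheduling logic trivial to test independently of any networking.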

Here is a good explanation of how to make async requests: http://www.devnewsgroups.net/group/microsoft.public.dotnet.framework/topic23172.aspx

You can google for more examples as well, but this is a good start.

Sunny
I didn't think of timers; they are actually quite good (and advocated by all the answers so far), but the sources won't always update at the same frequency or at the same time.
Dave
You did not state that each resource would have its own timeout. Still, you can have many timers: try to group resources with similar timeouts under one timer. Also, if you are going to do heavy processing of the responses, don't do it in the callback of the async request; instead, start a thread to process the results, so the threadpool threads are freed as soon as possible.
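The grouping idea could look something like this (a small sketch; the `(url, interval)` pair structure is an assumption, not from the original post):

```python
from collections import defaultdict

def group_by_interval(feeds):
    """Group resources that share a sync interval so that one timer
    can serve each group. `feeds` is a list of (url, interval_seconds) pairs."""
    groups = defaultdict(list)
    for url, interval in feeds:
        groups[interval].append(url)
    return dict(groups)
```

Each resulting group gets a single timer firing at that group's interval, instead of one timer per resource.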
Sunny
Response changed to reflect the new requirements. Please also edit your question.
Sunny
A: 

Why not just synchronize it all on one clock, e.g. have everything update on the tens of each hour (:10, :20, :30, etc.), instead of having all your threads updating at random times within a 10-minute period? Why do you need to create one thread per page/feed?

yx
A: 

Use a Timer object to fire off a process via a BackgroundWorker object so that you can do the work in the background. Depending on the number of feeds you have, you may consider doing a "staggered" update over a shorter interval: say, every 5 minutes the worker thread starts up, goes to the next feed in the list of feeds to monitor, and checks it for updates.
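The staggered rotation can be sketched like this (an illustration only; the feed names and the `check_for_updates` call are placeholders):

```python
import itertools

def staggered(feeds):
    """Rotate through the feed list so each timer tick checks only
    the next feed, instead of updating all of them at once."""
    return itertools.cycle(feeds)

# Hypothetical usage: every 5 minutes, the worker thread does one step.
rotation = staggered(["feed-a", "feed-b", "feed-c"])
# on each tick:  check_for_updates(next(rotation))
```

With N feeds and a 5-minute tick, each feed ends up being checked roughly every N*5 minutes, and the load is spread evenly over time.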

As I'm sure you've seen in some of the feed readers out there, updating ALL your feeds at once isn't always the nicest solution, since it tends to freeze up the user interface for a while.

Dillie-O
A: 

I created a Windows service to accomplish what you are describing. Every n minutes, the daemon wakes up, reads an XML file with the URLs it needs to fetch, processes all the data, and goes back to sleep for n minutes. I had one thread for fetching the data and another that monitored the XML file for changes. The XML file could be updated through a web interface.

As yx pointed out, it is not necessary to create one thread per page. However, if you have a lot of URLs to fetch, you could distribute them into packages of 100 (for example) and create one thread per package. You would then wait for the last thread to finish before sending the daemon back to sleep.
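The package-per-thread cycle can be sketched like this (a minimal illustration; the fetch body is left as a stub):

```python
import threading

def chunk(urls, size=100):
    """Split the URL list into packages of `size` (100, per the example above)."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]

def fetch_package(package):
    for url in package:
        pass  # fetch and process each URL here

def run_cycle(urls):
    """One daemon wake-up: start one thread per package, then wait for
    the last thread to finish before going back to sleep."""
    threads = [threading.Thread(target=fetch_package, args=(p,))
               for p in chunk(urls)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # blocks until every package is done
```

The `join()` loop is what implements "wait for the last thread to finish": the daemon only goes back to sleep once every package thread has completed.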

jdecuyper