I am about to implement the archetypal FileSystemWatcher solution. I have a directory to monitor for file creations, and the task of picking up created files and inserting them into a DB. Roughly, this will involve reading and processing 6 or 7 text files of 80 characters each that appear about 150 ms apart, in bursts that occur every couple of seconds; rarely, a 2 MB binary file will also have to be processed. This will most likely be a 24/7 process.

From what I have read about the FileSystemWatcher object, it is better to enqueue its events in one thread and then dequeue/process them in another thread. The quandary I have right now is how best to create the thread that does the processing. The choices I can see are:

  1. Each time I get an FSW event, I manually create a new thread (yeah I know .. stupid architecture, but I had to say it).

  2. Throw the processing at the CLR thread pool whenever I get an FSW event.

  3. On startup, create a dedicated second thread for the processing and use a producer/consumer model to handle the work. The main thread enqueues the request and the second thread dequeues it and performs the work.

I am tending towards the third method as the preferred one as I know the work thread will always be required - and also probably more so because I have no feel for the thread pool.

Can anyone offer any advice on this? Thanks

+2  A: 

If you know that the second thread will always be required, and you also know that you'll never need more than one worker thread, then option three is good enough.
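A minimal sketch of option three, assuming .NET 4 or later so that `BlockingCollection<T>` is available. The class name `FileQueueProcessor` and the `processFile` callback are hypothetical names chosen for illustration; the actual parsing and DB insert are left to that callback.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

// Sketch only: the FSW callback enqueues paths, one dedicated thread processes them.
public sealed class FileQueueProcessor : IDisposable
{
    private readonly BlockingCollection<string> _queue = new BlockingCollection<string>();
    private readonly FileSystemWatcher _watcher;
    private readonly Thread _worker;
    private readonly Action<string> _processFile; // hypothetical: parse the file, insert into the DB

    public FileQueueProcessor(string directory, Action<string> processFile)
    {
        _processFile = processFile;

        _worker = new Thread(ConsumeLoop) { IsBackground = true, Name = "file-worker" };
        _worker.Start();

        _watcher = new FileSystemWatcher(directory);
        // Producer side: the handler does nothing but enqueue the path.
        _watcher.Created += (sender, e) => Enqueue(e.FullPath);
        _watcher.EnableRaisingEvents = true;
    }

    // Public so that another producer (for example a periodic sweep) can feed the same queue.
    public void Enqueue(string path)
    {
        try
        {
            _queue.Add(path);
        }
        catch (InvalidOperationException)
        {
            // The queue was already closed during shutdown; the path is dropped.
        }
    }

    private void ConsumeLoop()
    {
        // Blocks while the queue is empty; ends once CompleteAdding() was called and the queue is drained.
        foreach (string path in _queue.GetConsumingEnumerable())
        {
            try
            {
                _processFile(path);
            }
            catch (Exception ex)
            {
                // One bad file should not kill a 24/7 worker; real code would log and possibly retry.
                Console.Error.WriteLine("Failed to process " + path + ": " + ex.Message);
            }
        }
    }

    public void Dispose()
    {
        _watcher.Dispose();       // stop producing
        _queue.CompleteAdding();  // let the worker drain what is left
        _worker.Join();
        _queue.Dispose();
    }
}
```

Note that the Created event can fire before the writer has finished with the file, so the callback may need to retry opening it. With a single consumer, files are processed one at a time in the order they were enqueued.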

Anon.
+1, I would add that the thread pool will try to handle your requests simultaneously on multiple threads, which doesn't sound like a good thing for your application.
John Knoeller
Anon .. From what testing I have done, my processing should be done well and truly within the 150 ms, except in the case of the binary file processing - that will run at about 150 ms but should be such a rare occurrence that there will be plenty of time to catch up if things get queued.
Peter M
A: 

Just be aware that FileSystemWatcher may miss events; there's no guarantee it will deliver every event that has occurred. Your design of keeping the work done by the thread receiving events to a minimum should reduce the chances of that happening, but it is still a possibility, given the finite event buffer size (tops out at 64KB).
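As a rough sketch of the knobs involved: the buffer can be raised to its 64 KB maximum via `InternalBufferSize`, and a buffer overflow surfaces through the `Error` event as an `InternalBufferOverflowException`. The `WatcherSetup` class and the `onOverflow` callback are hypothetical names; what to do on overflow (most likely a rescan of the directory) is an assumption left to the caller.

```csharp
using System;
using System.IO;

public static class WatcherSetup
{
    // Sketch only: builds a watcher with the largest buffer and an overflow hook.
    public static FileSystemWatcher Create(string directory, Action onOverflow)
    {
        var watcher = new FileSystemWatcher(directory)
        {
            // Default is 8 KB; 64 KB is the documented maximum.
            InternalBufferSize = 64 * 1024,
            // Only request file name changes (create/rename/delete) to reduce buffer pressure.
            NotifyFilter = NotifyFilters.FileName
        };

        watcher.Error += (sender, e) =>
        {
            if (e.GetException() is InternalBufferOverflowException)
            {
                // Events were dropped, so the caller has to recover some other way.
                onOverflow();
            }
        };

        return watcher;
    }
}
```

The caller still has to attach a `Created` handler and set `EnableRaisingEvents = true`. The `Error` event is not a complete safety net, so a periodic rescan is still worth having.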

I would highly recommend developing a battery of torture tests if you decide to use FileSystemWatcher.

In our testing, we encountered missed events on network locations that changing the InternalBufferSize did not fix, and when this happened we did not receive Error event notifications either.

Thus, we developed our own polling mechanism for doing so, using Directory.GetFiles, followed by comparing the state of the returned files with the previously polled state, ensuring we always had an accurate delta.

Of course, this comes at a substantial cost in performance, which may not be good enough for you.
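The polling approach described above might look roughly like the following. This is not our actual implementation, just a minimal sketch: the `DirectoryPoller` name is hypothetical, file state is assumed to be length plus last write time, and only new or changed files are reported (deletions are not).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public sealed class DirectoryPoller
{
    private readonly string _directory;

    // State seen on the previous poll: full path -> (length, last write time in UTC).
    private Dictionary<string, Tuple<long, DateTime>> _previous =
        new Dictionary<string, Tuple<long, DateTime>>();

    public DirectoryPoller(string directory)
    {
        _directory = directory;
    }

    // Returns the files that are new or changed since the previous call.
    public List<string> Poll()
    {
        var current = new Dictionary<string, Tuple<long, DateTime>>();
        var changed = new List<string>();

        foreach (string path in Directory.GetFiles(_directory))
        {
            Tuple<long, DateTime> state;
            try
            {
                var info = new FileInfo(path);
                state = Tuple.Create(info.Length, info.LastWriteTimeUtc);
            }
            catch (IOException)
            {
                continue; // the file vanished between listing and inspection
            }

            current[path] = state;

            Tuple<long, DateTime> old;
            if (!_previous.TryGetValue(path, out old) || !old.Equals(state))
            {
                changed.Add(path);
            }
        }

        _previous = current; // the new snapshot becomes the baseline for the next poll
        return changed;
    }
}
```

Calling `Poll()` from a timer yields the delta on each tick; the cost is one directory listing plus a metadata read per file, every time.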

Leon Breedt
Leon, I'm well aware of the FSW limitations and issues. It doesn't seem to be robust on network shares. I'm only going to be using it on a local directory, and I don't expect the FSW event buffer size to cause me problems. I'm sort of planning on a sweeper process just in case I miss some things.
Peter M
Leon .. BTW I will be planning a lot of tests .. FSW seems to have a huge number of hidden gotchas.
Peter M
If I were doing this, I would go for FSW, and run a full sweep over the directory every now and then (maybe daily, at a time the system is normally quiet?) to make sure everything gets caught.
Anon.