views:

149

answers:

7

Hey guys / gals,

I am starting development of a Windows service and would like to consult the braintrust at Stack Overflow before getting too far into it about the best way to handle it.

This will be my first Windows service, and I am not really familiar with threading, which I am assuming will be the recommendation here (but I am eager to learn!).

The service will have the following functionality on a high level:

  • Monitor a folder for new files (planning to use FileSystemWatcher).
  • Upon detection of a file, it is queued for upload to an external host.
  • If there are files in the queue, serially HTTP POST those files to the external host.
    • The files have to be POST'ed one at a time (serially) and must be transferred using HTTP POST.
  • Upon successful HTTP POST, it will delete the local file and, if necessary, move to the next file in the upload queue and repeat the process.

The issue I can foresee even at this high level, is that the HTTP POST of the file to the external host could take a really long time.

What design options are available to best handle this long-running aspect of the Windows service? Should I even be looking at a Windows service as the implementation for this solution, or should I be looking into a standalone app instead?

Thanks in advance overflow'ers!

A: 

Well, if I understand you correctly, what you want to achieve is not that complicated.

I would go for a Windows service and the FileSystemWatcher.

There is not much threading you need to do. The only thing I would thread is the file upload, which is easily done with a BackgroundWorker. By threading this, you can upload multiple files asynchronously.
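A minimal sketch of that approach, assuming a hard-coded target URL (the `http://example.com/upload` endpoint is a placeholder, not the OP's actual host):

```csharp
using System.ComponentModel;
using System.IO;
using System.Net;

static void StartUpload(string path)
{
    var worker = new BackgroundWorker();
    worker.DoWork += (s, e) =>
    {
        using (var client = new WebClient())
        {
            // HTTP POST the file's bytes to the external host.
            // The URL here is an assumption for illustration only.
            client.UploadFile("http://example.com/upload", "POST", path);
        }
    };
    worker.RunWorkerCompleted += (s, e) =>
    {
        if (e.Error == null)
            File.Delete(path); // delete the local file only after a successful POST
    };
    worker.RunWorkerAsync(); // returns immediately; upload runs on a background thread
}
```

The `RunWorkerCompleted` handler runs after `DoWork` finishes, which is a convenient place to delete the file or requeue it on failure.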

Let me know if you need any more help.

alexn
+3  A: 

The Windows service isn't a bad idea IMO, especially since you want it to run constantly, attempting to detect file entry into a folder. The HTTP POST limitation is significant, but you're aware of the time and resources it will take up. I think your biggest concern is going to be queueing and resource management. You'll want to spin each of these transfers off onto a BackgroundWorker so that multiple files can be completed independently, but you'll also want a management class that can limit the number of BackgroundWorker objects that can be spun up. Otherwise you'll run into memory management problems, network clogs, and who knows what else.
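One way to sketch that management class is with a semaphore capping concurrent workers (the limit of 4 and the `Upload` method are arbitrary placeholders, not anything from the question):

```csharp
using System.ComponentModel;
using System.Threading;

// Caps how many BackgroundWorker uploads run at once.
class UploadManager
{
    private const int MaxWorkers = 4; // arbitrary; tune to your bandwidth/memory budget
    private readonly SemaphoreSlim _slots = new SemaphoreSlim(MaxWorkers);

    public void QueueUpload(string path)
    {
        var worker = new BackgroundWorker();
        worker.DoWork += (s, e) =>
        {
            _slots.Wait();            // block this worker until a slot frees up
            try { Upload(path); }
            finally { _slots.Release(); }
        };
        worker.RunWorkerAsync();
    }

    private void Upload(string path)
    {
        // placeholder for the actual HTTP POST logic
    }
}
```

This keeps at most `MaxWorkers` transfers in flight no matter how many files land in the folder at once.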

You should consider the worst/best case scenarios for files appearing in the folder. What's the largest number of files that could appear simultaneously? What's the largest size a file could be? What happens when the folder starts to "back up" because the HTTP POST isn't delivering files to the destination fast enough? What happens when the destination host is unreachable? What happens when the system reboots "mid delivery"? Is there a source for determining the priority of file delivery? Are there situations where a file delivery must be interrupted or transactionally reversed?

I think the Windows Service is the right choice, combined with the FileSystemWatcher. Just watch out for your resource usage.

Joel Etherton
The question states that the uploads have to be done one at a time. If so, only a single BackgroundWorker would be needed, and you don't have to worry as much about resources.
grossvogel
@grossvogel - I didn't make that assumption. By "one at a time", I took it to mean that the files could not be zipped and transferred together. A BackgroundWorker would still perform a "one at a time" transfer, assuming "one at a time" means 1 file per 1 HTTP POST. That's a question for the OP's BAs, though.
Joel Etherton
My mistake, I should have been more clear. I did mean that only one file can be transferred to the remote host at any given time. They only allow one concurrent connection. Sorry for the confusion, guys...
nokturnal
A: 

A Windows service is the way to go.

I don't foresee any issues, as I've done nearly the exact same thing before (except I was connecting to a DB instead of an HTTP service) and didn't have any problems.

You don't need multithreading, especially since you're POSTing the files one at a time.

It would be useful if you have an application to monitor your services and email/sms you when they go down.

Btw, your accept rate is pretty low. You should always mark the correct answer for the benefit of other readers.

HappyCoder4U
A: 

A Windows service is definitely the way to go. In your method for starting the service you will have to create the necessary FileSystemWatcher instances.

When new files are created, events will be fired, and you will have to process these events in a timely manner. Each event is executed on a thread from the thread pool, and future events may be lost if your event handler doesn't return immediately. This means that you will have to queue up some form of task. You can use the Task Parallel Library (new in .NET 4), the BackgroundWorker class, the ThreadPool.QueueUserWorkItem method, or something similar. In general, these techniques all use the .NET thread pool, which has a limited size to cap the amount of system resources your service will use.

Queueing a new task every time a new file is created will allow the tasks to execute in parallel. If you only want a single task to execute at a time you will have to place the tasks in a queue. You can use a volatile in-memory queue, but another approach would be to use a durable and transactional MSMQ queue. If the files are small enough to store in the queue you can read, enqueue and delete the file in a transactional manner. Another task will then have to dequeue files from the queue and process them. Any failure would roll back the transaction and keep the file in the queue. This would get around the problems of trying to use a file system as a transactional database.
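The single-task-at-a-time variant the question calls for can be sketched with an in-memory producer/consumer queue (this is the volatile-queue option, not the MSMQ approach; `ProcessFile` is a placeholder for the POST-then-delete logic):

```csharp
using System.Collections.Concurrent;
using System.Threading;

// FileSystemWatcher event handlers enqueue paths and return immediately;
// a single dedicated consumer thread drains the queue, guaranteeing
// that only one upload runs at any given time.
class SerialUploader
{
    private readonly BlockingCollection<string> _queue =
        new BlockingCollection<string>();

    public SerialUploader()
    {
        var consumer = new Thread(() =>
        {
            // Blocks when the queue is empty; processes items one by one.
            foreach (var path in _queue.GetConsumingEnumerable())
                ProcessFile(path);
        });
        consumer.IsBackground = true;
        consumer.Start();
    }

    // Call this from the FileSystemWatcher.Created handler.
    public void Enqueue(string path)
    {
        _queue.Add(path);
    }

    private void ProcessFile(string path)
    {
        // placeholder: HTTP POST the file, then delete it on success
    }
}
```

Note this in-memory queue is lost if the service dies; the durable MSMQ approach above is what gives you transactional recovery.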

If your files arrive at a fast pace, you will have to handle the situation where events from the FileSystemWatcher are missed. An approach where the service scans the file system at regular intervals (say, once every minute) may work out better for you. This can be done using a timer class (either the System.Timers.Timer class or the System.Threading.Timer class).

During startup your service will have to enumerate existing but unprocessed files and queue them up for processing.

If your service has to be very reliable, you have to consider all possible failure scenarios, like the service being terminated unexpectedly or the disk being full.

Martin Liversage
I wish I could vote this a solution as well, since it really helped me out. Thanks guys!
nokturnal
A: 

I've recently used the SmartThreadPool to manage concurrent FTP uploads in a Windows Service application.

The Windows service uses Quartz.NET scheduling to fire off FTP upload jobs, but I needed 200 uploads completed in a short amount of time. Each individual upload took 15 minutes, but all 200 needed to be completed in under 2 hours.

When Quartz fired its scheduled events, I filled the SmartThreadPool with 200 class instances representing each FTP endpoint, letting the SmartThreadPool manage the resource usage. (We had trouble with Quartz missing triggers when the tasks it ran took a long time, "long" being a minute or more.)

I found that I was easily able to scale up to 60 threads (the highest I went) with nearly linear speedup. That is, where 200 consecutive uploads would take almost exactly 50 hours to complete, letting the SmartThreadPool use up to 50 threads cut the whole process down to almost exactly 1 hour.

This method worked extremely well for us and I would recommend it.

qstarin
A: 

I agree with the others. Using a Windows service for this task makes a lot of sense.

The only guidance I would have is to avoid using the BackgroundWorker for your threading since you're doing this inside a Windows service. The BackgroundWorker class is designed to provide feedback on the progress of your threaded operation. Unless you plan to have a front-end application that receives feedback from your Windows service and then presents that info to the user (e.g., using a progress bar), a BackgroundWorker object is overkill for what you need. I would suggest using either the ThreadPool or Thread classes depending on the particulars of your situation. For some guidance as to which to choose, refer to the guidance here.
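For reference, the ThreadPool route is nearly a one-liner (a sketch; `UploadFile` and the file path are hypothetical placeholders):

```csharp
using System.Threading;

static void QueueUpload(string path)
{
    // Hand the upload to a pool thread instead of a BackgroundWorker.
    // Suitable when no progress reporting back to a UI is needed.
    ThreadPool.QueueUserWorkItem(state =>
    {
        UploadFile((string)state); // placeholder for the HTTP POST logic
    }, path);
}

static void UploadFile(string path)
{
    // placeholder
}
```

The trade-off versus a dedicated `Thread` is that pool threads are shared and shouldn't be blocked for very long periods.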

Matt Davis
A: 

FileSystemWatcher

The big issue with this system is that events will generally start to fire when the file system initially creates an entry in the directory, and events will continue to fire as the file is written. However, the general expectation is that the FileSystemWatcher will fire once the file has been completely written to the directory. This causes issues with large files that have not finished being transferred into the directory even though the FileSystemWatcher has already started to fire off events.

A robust solution has to wrap the FileSystemWatcher events with some sort of test that the file has finished being written. I don't have any references handy at the moment, but there are lots of solutions out there that show how to take care of this issue.
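One common form of that test (a sketch, not the only way) is to retry opening the file for exclusive access until the writer has released its handle:

```csharp
using System;
using System.IO;
using System.Threading;

// Returns true once the file can be opened exclusively, i.e. the writer
// has finished and released it. Retries every 500 ms up to maxAttempts,
// then gives up. The 500 ms / 60-attempt values are arbitrary defaults.
static bool WaitForFileReady(string path, int maxAttempts = 60)
{
    for (int i = 0; i < maxAttempts; i++)
    {
        try
        {
            using (File.Open(path, FileMode.Open, FileAccess.Read, FileShare.None))
                return true; // exclusive open succeeded: file is complete
        }
        catch (IOException)
        {
            Thread.Sleep(500); // still being written; wait and retry
        }
    }
    return false;
}
```

You would call this from the `Created` event handler (or the queued task) before starting the upload.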

Peter M