I have a design question. I want some feedback to know if a ThreadPool is appropriate for the client program I am writing.

I have a client running as a service that processes database records. Each record contains connection information for an external FTP site (basically, the table is a queue of files to transfer). Many of the records point to the same host and just move different files, so I am grouping them by host. I want to create one thread per host. I don't really care when the transfers finish; each thread just needs to do (or at least attempt) all the work it was assigned, then terminate and clean up every resource it used.

I anticipate no more than 10-25 connections to be established. Once the transfer queue is empty, the program will simply wait until there are records in the queue again.

Is the ThreadPool a good candidate for this or should I use a different approach?

Edit: For the most part, this is the only significant custom application running on the server.

+2  A: 

From what you have described, it sounds like the threadpool would be a good fit.

Issues:

  1. Threadpool threads are background threads, so a threadpool thread will not keep your process alive on shutdown. Make sure that's the behavior you want.

  2. Older guidance warns that tying up the threadpool with long-running tasks can be bad when the app might be waiting on incoming connections (like a web app). However, it sounds like you have a dedicated Windows service running, so I don't think this is an issue.

  3. Just because you throw 10 jobs at the thread pool does not mean that it will immediately dispatch 10 threads to do the work; you are delegating the decision of how many threads to use to .NET and the OS.
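
For illustration, here is a minimal sketch of what queuing one work item per host might look like. The `TransferRecord` type, the `PooledTransfers` class, and the `transfer` callback are hypothetical names (the question does not show the real table or FTP code), and `CountdownEvent` assumes .NET 4 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

// Hypothetical record type; the real queue table's columns are not shown in the question.
public sealed class TransferRecord
{
    public string Host { get; set; }
    public string FilePath { get; set; }
}

public static class PooledTransfers
{
    // Queues one thread pool work item per distinct host and blocks until
    // every item has finished. Returns the number of transfers attempted.
    public static int Run(IEnumerable<TransferRecord> records, Action<TransferRecord> transfer)
    {
        var hostGroups = records.GroupBy(r => r.Host).ToList();
        if (hostGroups.Count == 0) return 0;

        int attempted = 0;
        using (var done = new CountdownEvent(hostGroups.Count))
        {
            foreach (var hostGroup in hostGroups)
            {
                // Copy the group so the work item owns a stable list.
                var batch = hostGroup.ToList();
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try
                    {
                        foreach (var record in batch)
                        {
                            // One failed file should not abandon the rest of the host's queue.
                            try { transfer(record); }
                            catch (Exception ex) { Console.Error.WriteLine(record.Host + ": " + ex.Message); }
                            Interlocked.Increment(ref attempted);
                        }
                    }
                    finally
                    {
                        done.Signal(); // always signal, even if the batch throws
                    }
                });
            }
            done.Wait(); // the pool decides how many items actually run at once
        }
        return attempted;
    }
}
```

Note that the pool, not this code, decides how many of the queued items run concurrently, which is exactly point 3 above.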

JMarsch
+1 Agree with all of your reasoning, threadpool is a great fit.
Walter
+3  A: 

No, the thread pool isn't appropriate. The thread pool is really designed for "short tasks that require background processing," since the framework depends on the availability of thread pool threads, and long-running processes can exhaust the thread pool.

FTP transfers take a relatively long time (even with a reasonable timeout), so they aren't really a good fit. You may be able to get by using the thread pool, but you may also find yourself running into inexplicable bugs if you use it. It depends on how much your application uses thread-pool-dependent framework features (asynchronous delegates, etc.).

The MSDN topic "The Managed Thread Pool" offers good guidelines for when not to use thread pool threads:

There are several scenarios in which it is appropriate to create and manage your own threads instead of using thread pool threads:

  • You require a foreground thread.
  • You require a thread to have a particular priority.
  • You have tasks that cause the thread to block for long periods of time. The thread pool has a maximum number of threads, so a large number of blocked thread pool threads might prevent tasks from starting.
  • You need to place threads into a single-threaded apartment. All ThreadPool threads are in the multithreaded apartment.
  • You need to have a stable identity associated with the thread, or to dedicate a thread to a task.
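
Applied to the question's scenario, dedicating a thread to each host could be sketched roughly as follows. This is only an illustration: `DedicatedTransfers`, the `filesByHost` dictionary, and the `transfer` callback are hypothetical names standing in for the asker's real queue and FTP code.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class DedicatedTransfers
{
    // Starts one dedicated foreground thread per host. Each thread works through
    // its own list of files and exits when the list is exhausted.
    // filesByHost maps a host name to the files queued for it (hypothetical shape).
    public static List<Thread> Start(
        IDictionary<string, List<string>> filesByHost,
        Action<string, string> transfer)
    {
        var threads = new List<Thread>();
        foreach (var entry in filesByHost)
        {
            string host = entry.Key;
            var files = new List<string>(entry.Value); // private copy for this thread

            var thread = new Thread(() =>
            {
                foreach (var file in files)
                {
                    // Log and continue so one bad file does not stop the host's queue.
                    try { transfer(host, file); }
                    catch (Exception ex) { Console.Error.WriteLine(host + ": " + ex.Message); }
                }
            });
            thread.Name = "ftp-" + host;   // stable identity, handy when debugging
            thread.IsBackground = false;   // foreground: keeps the process alive until done
            thread.Start();
            threads.Add(thread);
        }
        return threads;
    }
}
```

A caller that wants to wait for the whole batch before polling the queue again can call `Join()` on each returned thread.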
Jeff Sternal
Thanks for your input. Do you suggest I just create Threads and manage them myself?
ZeroVector
Indeed, creating the threads and managing them yourself is the way to go in the long term. The cost of spinning up threads will be insignificant compared to the FTP transfer time, which moots one of the major benefits of thread pooling (the other being the simplified interface `QueueUserWorkItem` offers).
Jeff Sternal
+1  A: 

A ThreadPool would be nice as it allows you to concentrate on setting up jobs queued to threads, instead of worrying about initializing and cleaning up individual threads.

But how do you want this to work? Are you going to queue multiple jobs to the pool for a host, or are you going to have a thread for each host that reads jobs from its own queue?

Justin Ethier
There is one single database table for the queue, and there is a column for the machine name. The client service will query by its own machine name. Then that subset of records (let's say 100 records returned) will be grouped by host (let's say there are 4 unique hosts in the collection of 100 records). So, I would like to start 4 threads in the pool, one for each of those hosts.
ZeroVector