I'm trying to come up with a design for a thread pool with a lot of design requirements for my job. This is a real problem for working software, and it's a difficult task. I have a working implementation but I'd like to throw this out to SO and see what interesting ideas people can come up with, so that I can compare to my implementation and see how it stacks up. I've tried to be as specific to the requirements as I can.
The thread pool needs to execute a series of tasks. The tasks can be short running (<1sec) or long running (hours or days). Each task has an associated priority (from 1 = very low to 5 = very high). Tasks can arrive at any time while the other tasks are running, so as they arrive the thread pool needs to pick these up and schedule them as threads become available.
The task priority is completely independant of the task length. In fact it is impossible to tell how long a task could take to run without just running it.
Some tasks are CPU bound while some are greatly IO bound. It is impossible to tell beforehand what a given task would be (although I guess it might be possible to detect while the tasks are running).
The primary goal of the thread pool is to maximise throughput. The thread pool should effectively use the resources of the computer. Ideally, for CPU bound tasks, the number of active threads would be equal to the number of CPUs. For IO bound tasks, more threads should be allocated than there are CPUs so that blocking does not overly affect throughput. Minimising the use of locks and using thread safe/fast containers is important.
In general, you should run higher priority tasks with a higher CPU priority (ref: SetThreadPriority). Lower priority tasks should not "block" higher priority tasks from running, so if a higher priority task comes along while all low priority tasks are running, the higher priority task will get to run.
The tasks have a "max running tasks" parameter associated with them. Each type of task is only allowed to run at most this many concurrent instances of the task at a time. For example, we might have the following tasks in the queue:
- A - 1000 instances - low priority - max tasks 1
- B - 1000 instances - low priority - max tasks 1
- C - 1000 instances - low priority - max tasks 1
A working implementation could only run (at most) 1 A, 1 B and 1 C at the same time.
It needs to run on Windows XP, Server 2003, Vista and Server 2008 (latest service packs).
For reference, we might use the following interface:
namespace ThreadPool
{
class Task
{
public:
Task();
void run();
};
class ThreadPool
{
public:
ThreadPool();
~ThreadPool();
void run(Task *inst);
void stop();
};
}