I have what I assume is a pretty common threading scenario:

  • I have 100 identical jobs to complete
  • All jobs are independent of each other
  • I want to process a maximum of 15 jobs at a time
  • As each job completes, a new job will be started until all jobs have been completed

If you assume that each job will fire an event when it completes (I'm using the BackgroundWorker class), I can think of a couple of ways to pull this off, but I'm not sure what the "right" solution is. I was hoping some of you gurus out there could point me in the right direction.

SOLUTION 1: Have a while (keepRunning) { Thread.Sleep(1000); } loop in my Main() function. The code in the Job_Completed event handler would set keepRunning = false when A) no jobs remain to be queued and B) all queued jobs have completed. I have used this solution before and, while it seems to work fine, it feels a little "odd" to me.

SOLUTION 2: Use Application.Run() in my Main() function. Similarly, the code in the Job_Completed event handler would call Application.Exit() when A) no jobs remain to be queued and B) all queued jobs have completed.

SOLUTION 3: Use a ThreadPool, queue up all 100 jobs, let them run 15 at a time (SetMaxThreads), and somehow wait for them all to complete.

In all of these solutions, the basic idea is that a new job would be started every time another job is completed, until there are no jobs left. So, the problem is not only waiting for existing jobs to complete, but also waiting until there are no longer any pending jobs to start. If ThreadPool is the right solution, what is the correct way to wait on the ThreadPool to complete all queued items?

I think my overriding confusion here is that I don't understand exactly HOW events are able to fire from within my Main() function. Apparently they do, I just don't understand the mechanics of it from a Windows message loop point-of-view. What is the correct way to solve this problem, and why?

A: 

Here's the pseudocode of how I would approach it (this doesn't leverage ThreadPool, so someone might have a better answer:)

main
{
    create queue of 100 jobs
    create new array of 15 threads
    start threads, passing each the job queue
    do whatever until threads are done
}

thread(queue)
{
    while(true)
    {
        lock(queue) { if queue is empty, break; otherwise dequeue a thing }
        process the thing
    }

    queue is empty so exit thread
}
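
For concreteness, a minimal C# sketch of that pattern; the Job type, the allJobs collection, and ProcessJob are hypothetical stand-ins for your real work items:

// requires: using System.Threading; using System.Collections.Generic;

Queue<Job> jobs = new Queue<Job>(allJobs);  // allJobs: the 100 work items
object gate = new object();                 // guards access to the queue

List<Thread> workers = new List<Thread>();
for (int i = 0; i < 15; i++)
{
    Thread t = new Thread(() =>
    {
        while (true)
        {
            Job job;
            lock (gate)
            {
                if (jobs.Count == 0) return;  // nothing left: this worker exits
                job = jobs.Dequeue();
            }
            ProcessJob(job);                  // hypothetical per-job work
        }
    });
    t.Start();
    workers.Add(t);
}

// Wait for every worker to drain the queue and exit.
foreach (Thread t in workers) t.Join();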

EDIT: If your issue is how to tell when the threads are finished, and you're using normal C# threads (not ThreadPool threads), you can call Thread.Join() on each thread and it will return once that thread is done (the overload that takes a timeout returns false if the timeout expires first). If you want to keep track of how many threads are done without getting hung up on any one of them, you can cycle through them like this:

for (int i = 0; allThreads.Count > 0; i++)
{
    var thisThread = allThreads[i % allThreads.Count];
    if (thisThread.Join(timeout)) // something low, maybe 100 ms or so
        allThreads.Remove(thisThread); // Join returned true: that thread has finished
}
mquander
+2  A: 

Re: "somehow wait for them all to complete"

ManualResetEvent is your friend. Before you start your big batch, create one of these puppies; wait on it in your main thread, and set it at the end of the background operation when the batch is done.
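
For what it's worth, a minimal sketch of that approach, where RunAllJobs is a hypothetical stand-in for whatever processes the batch:

using System.Threading;

class BatchExample
{
    static readonly ManualResetEvent batchDone = new ManualResetEvent(false);

    static void Main()
    {
        // Run the whole batch on a background thread and signal when it finishes.
        ThreadPool.QueueUserWorkItem(_ =>
        {
            RunAllJobs();     // hypothetical: processes the 100 jobs
            batchDone.Set();  // wake up the waiting main thread
        });

        batchDone.WaitOne();  // main thread blocks here until Set() is called
    }

    static void RunAllJobs() { /* process the jobs, at most 15 at a time */ }
}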

Another option is to manually create the threads and call thread.Join() on each one in a foreach loop.

You could use this (I use this during testing)

private void Repeat(int times, int asyncThreads, Action action, Action done) {
    if (asyncThreads > 0) {

        var threads = new List<Thread>();

        for (int i = 0; i < asyncThreads; i++) {

            // Split the iterations evenly across the threads;
            // the first thread picks up any remainder.
            int iterations = times / asyncThreads;
            if (i == 0) {
                iterations += times % asyncThreads;
            }

            // Each worker recurses with asyncThreads == 0, which runs its
            // share of the iterations synchronously (the else branch below).
            Thread thread = new Thread(new ThreadStart(() => Repeat(iterations, 0, action, null)));
            thread.Start();
            threads.Add(thread);
        }

        // Block until every worker thread has finished its share.
        foreach (var thread in threads) {
            thread.Join();
        }

    } else {
        for (int i = 0; i < times; i++) {
            action();
        }
    }
    if (done != null) {
        done();
    }
}

Usage:

// Do something 100 times in 15 background threads, wait for them all to finish.
Repeat(100, 15, DoSomething, null);
Sam Saffron
+1  A: 

I would just use the Task Parallel Library.

You can do this as a single, simple Parallel.For loop over your jobs, capping the concurrency with ParallelOptions.MaxDegreeOfParallelism, and it will manage the scheduling fairly cleanly. If you can't wait for C# 4 and Microsoft's implementation, a temporary workaround is to just compile and use the Mono Implementation of TPL. (I personally prefer the MS implementation, especially the newer beta releases, but the Mono one is functional and redistributable today.)
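
A minimal sketch of that, assuming a hypothetical DoJob method that runs one of the 100 jobs:

// requires: using System.Threading.Tasks;

// Run the 100 jobs with at most 15 executing at any one time.
// Parallel.For blocks until every iteration has completed.
var options = new ParallelOptions { MaxDegreeOfParallelism = 15 };
Parallel.For(0, 100, options, i =>
{
    DoJob(i); // hypothetical per-job work
});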

Reed Copsey
A: 

When you queue a work item in the thread pool, you should get a WaitHandle back. Put them all in an array and you can pass it as an argument to the WaitHandle.WaitAll() function.

Joel Coehoorn
Good idea, but how would you do this? QueueUserWorkItem returns a bool.
Groky
+1  A: 

I would use ThreadPool.

Before you start running your jobs, create a ManualResetEvent and an int counter. Add each job to the ThreadPool, incrementing the counter each time.

At the end of each job, decrement the counter and when it hits zero, call Set() on the event.

In your main thread, call WaitOne() to wait for all of the jobs to be completed.

Groky
+2  A: 

Even though the other answers are nice, if you want another option (you can never have enough options), how about this as an idea:

Just put the data for each job into a structure and put those structures in a FIFO queue.

Create 15 threads.

Each thread will take the next job from the queue, removing it as it goes.

When a thread finishes processing a job, it takes the next one; if the queue is empty, the thread dies or just sleeps, waiting.

The only complexity, which is pretty simple to resolve, is making the dequeue a critical section (synchronize the check and the removal).

James Black
A: 

ThreadPool might be the way to go. The SetMaxThreads method can restrict the number of threads that run at once. However, it restricts the maximum for the whole process/AppDomain, so I wouldn't suggest using SetMaxThreads if the process is running as a service.

private static ManualResetEvent manual = new ManualResetEvent(false);
private static int count = 0;

public void RunJobs( List<JobState> states )
{
    // Note: this caps pool threads for the whole process/AppDomain, not just this batch.
    ThreadPool.SetMaxThreads( 15, 15 );

    foreach( var state in states )
    {
        Interlocked.Increment( ref count );
        ThreadPool.QueueUserWorkItem( Job, state );
    }

    manual.WaitOne();
}

private static void Job( object state )
{
    // run job

    // Interlocked.Decrement takes the counter by ref and returns the new value,
    // so the job that brings the count to zero is the one that sets the event.
    // (This assumes the queueing loop above finishes before every queued job does;
    // otherwise the count could hit zero too early.)
    if( Interlocked.Decrement( ref count ) == 0 ) manual.Set();
}
smaglio81
A: 

After a bit more research, I think developing the application as a Windows service will end up being the best solution. Thanks for all your help and suggestions.

Casey Gay
You still have to solve the problem.
C. Ross