I want to write my first real MultiThreaded C# Application. While I used a BackgroundWorker before and know a thing or two about lock(object), I never used the Thread object, Monitor.Enter etc. and I'm completely lost where to start designing the Architecture.

Essentially my program runs in the background. Every 5 Minutes, it checks a web service. If the web service returns data, it creates Jobs out of this data and passes it into a JobQueue. The JobQueue then sequentially works on those jobs - if a new job is added while it still is working on one, it will queue the job. Additionally, there is a Web Server to allow remote access to the program.

The way I see it, I need 4 Threads:

  1. The Main Thread
  2. The "5-Minute-Timer" and WebService Thread
  3. The JobQueue
  4. The Web Server

Threads 2-4 should be created when the program launches and ended when the program ends, so they only run once.

As said, I don't really know how the architecture would work on that. What would Thread 1 do? When the MyProgram class is instantiated, should it have a Queue<Job> as a Property? How would I start my Thread? As far as I see, I need to pass in a Function into the Thread - where should that function sit? If I have a class "MyJobQueueThreadClass" that has all the functions for Thread 3, how would that access an Object on the MyProgram class? And if a Thread is just a function, how do I prevent it from ending early? As said, Thread 2 waits 5 Minutes, then executes a series of functions, and restarts the 5 minute timer (Thread.Sleep(300)?) over and over again, until my Program is ended (Call Thread.Abort(Thread2) in the Close/Exit/Destructor of MyProgram?)

+2  A: 

You don't have to worry about the main program thread in terms of lifecycle - it's out of your hands.

You can set up a timer object (System.Threading.Timer) on the main thread that fires every 5 minutes - a thread from the .NET thread pool will be used to invoke your timer callback.

I would use that thread to connect to the web service, download job data, and push jobs into the job queue, as it is a unit of work. Once the callback returns, .NET will automatically put the thread back into the pool. The timer keeps firing, which repeats this process. So far you haven't actually needed to do any explicit threading, which is usually a good thing!
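As a rough sketch of the timer part (all names here are made up, and the "job" is just a string standing in for whatever the web service returns), it could look something like this:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch: PollingSketch, Poll and PendingCount are illustrative names.
static class PollingSketch
{
    private static readonly Queue<string> jobQueue = new Queue<string>();

    // Number of jobs currently waiting in the queue.
    internal static int PendingCount
    {
        get { lock (jobQueue) { return jobQueue.Count; } }
    }

    // Runs on a thread-pool thread each time the timer fires.
    internal static void Poll(object state)
    {
        // Stand-in for the real web service call (assumption).
        string job = "job created at " + DateTime.Now.ToString("T");
        lock (jobQueue)
        {
            jobQueue.Enqueue(job);
        }
    }

    static void Main()
    {
        // First callback immediately, then one every 5 minutes.
        using (var timer = new Timer(Poll, null, TimeSpan.Zero, TimeSpan.FromMinutes(5)))
        {
            Console.ReadLine(); // keep the process (and the timer) alive
        }
    }
}
```

Note that there is no `new Thread(...)` anywhere for the polling side; the pool thread only exists for the duration of each callback.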

Then you want a thread that pops jobs out of the queue and processes them - you could implement this as a class that encapsulates a Thread instance using the worker thread pattern. Its function is to pop jobs off the job queue and process them, and to go to sleep for an interval when the work is done (the queue is empty). When the thread wakes up it resumes the loop, and it keeps going until the main thread signals it to stop.
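A minimal sketch of such a worker class might look like the following. JobWorker and its members are hypothetical names, jobs are plain Action delegates for brevity, and the one-second idle interval is an arbitrary choice:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical worker class wrapping a single Thread.
class JobWorker
{
    private readonly Queue<Action> queue = new Queue<Action>();
    private readonly ManualResetEvent stopSignal = new ManualResetEvent(false);
    private readonly Thread thread;

    public JobWorker()
    {
        thread = new Thread(Run);
    }

    public void Start()
    {
        thread.Start();
    }

    public void Enqueue(Action job)
    {
        lock (queue) { queue.Enqueue(job); }
    }

    // Ask the loop to exit and wait until the thread has ended.
    // Must only be called after Start().
    public void Stop()
    {
        stopSignal.Set();
        thread.Join();
    }

    private void Run()
    {
        while (true)
        {
            // Read the stop flag before looking at the queue, so that jobs
            // enqueued before Stop() was called are not skipped.
            bool stopping = stopSignal.WaitOne(0);

            Action job = null;
            lock (queue)
            {
                if (queue.Count > 0) job = queue.Dequeue();
            }

            if (job != null)
            {
                job(); // run outside the lock so producers are not blocked
                continue;
            }

            if (stopping) break;

            // Queue is empty: sleep up to a second, or wake early on stop.
            stopSignal.WaitOne(TimeSpan.FromSeconds(1));
        }
    }
}
```

Usage would be along the lines of `var worker = new JobWorker(); worker.Start(); worker.Enqueue(() => Console.WriteLine("working")); worker.Stop();`. Waiting on the stop signal instead of calling Thread.Sleep means the worker reacts to a stop request right away rather than after the full interval.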

You could also do this using BackgroundWorker - but if you want to learn multi-threading then the first option will give you more insight into Thread.

This kind of pattern is quite common and is usually known as producer-consumer, and you can definitely google that for examples. The main complexity here is in synchronizing access to the queue as it is shared between the producer and consumer threads and you don't want them to step on each other's toes.
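If you are on .NET 4 or later, the queue synchronization can also be delegated to BlockingCollection<T> rather than written by hand. That is a different route from the hand-rolled queue described above, so treat this only as a sketch of the producer-consumer shape; the class and method names are invented for the example:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical demo: one producer (the calling thread), one consumer thread.
static class ProducerConsumerSketch
{
    internal static List<string> RunDemo()
    {
        var processed = new List<string>();

        using (var jobs = new BlockingCollection<string>())
        {
            var consumer = new Thread(() =>
            {
                // Blocks while the collection is empty and
                // ends once CompleteAdding() has been called.
                foreach (string job in jobs.GetConsumingEnumerable())
                {
                    processed.Add(job); // only the consumer touches this list
                }
            });
            consumer.Start();

            // Producer side: add three jobs, then signal "no more work".
            for (int i = 1; i <= 3; i++)
            {
                jobs.Add("job " + i);
            }
            jobs.CompleteAdding();

            consumer.Join(); // wait for the consumer to drain the queue
        }

        return processed;
    }

    static void Main()
    {
        foreach (string job in RunDemo())
        {
            Console.WriteLine("processed " + job);
        }
    }
}
```

The collection takes care of both the locking and the blocking when empty, and CompleteAdding() plays the role of the stop signal.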

Sam
+12  A: 

Let's go through it, step by step:

1.

class Program {

The job queue is a data structure:

    private static Queue<Job> jobQueue = new Queue<Job>();

If this data structure is accessed by multiple threads, you need to lock it:

    private static void EnqueueJob(Job job) {
        lock (jobQueue) {
            jobQueue.Enqueue(job);
        }
    }

    private static Job DequeueJob() {
        lock (jobQueue) {
            return jobQueue.Dequeue();
        }
    }

Let's add a method that retrieves a job from the web service and adds it to the queue:

    private static void RetrieveJob(object unused) {
        Job job = ... // retrieve job from webservice
        EnqueueJob(job);
    }

And a method that processes jobs in the queue in a loop:

    private static void ProcessJobs() {
        while (true) {
            Job job = DequeueJob();
            // process job
        }
    }

Let's run this program:

    private static void Main() {
        // run RetrieveJob every 5 minutes using a timer
        Timer timer = new Timer(RetrieveJob);
        timer.Change(TimeSpan.FromMinutes(0), TimeSpan.FromMinutes(5));

        // run ProcessJobs in thread
        Thread thread = new Thread(ProcessJobs);
        thread.Start();

        // block main thread
        Console.ReadLine();
    }
}

2.

If you run the program, you'll notice that a job is added every 5 minutes. But jobQueue.Dequeue() will throw an InvalidOperationException whenever the job queue is empty, which is the case until the first job is retrieved and again as soon as all queued jobs have been processed.

To fix that, we turn the job queue into a blocking queue by using a Semaphore:

    private static Semaphore semaphore = new Semaphore(0, int.MaxValue);

    private static void EnqueueJob(Job job) {
        lock (jobQueue) {
            jobQueue.Enqueue(job);
        }
        // signal availability of job
        semaphore.Release(1);
    }

    private static Job DequeueJob() {
        // wait until job is available
        semaphore.WaitOne();
        lock (jobQueue) {
            return jobQueue.Dequeue();
        }
    }

3.

If you run the program again, it won't throw the exception and everything should work fine. But you'll notice that you have to kill the process because the ProcessJobs thread never ends. So, how do you end your program?

I recommend you define a special job that indicates the end of job processing:

    private static void ProcessJobs() {
        while (true) {
            Job job = DequeueJob();
            if (job == null) {
                break;
            }
            // process job
        }
        // when ProcessJobs returns, the thread ends
    }

Then stop the timer and add the special job to the job queue:

    private static void Main() {
        // run RetrieveJob every 5 minutes using a timer
        Timer timer = new Timer(RetrieveJob);
        timer.Change(TimeSpan.FromMinutes(0), TimeSpan.FromMinutes(5));

        // run ProcessJobs in thread
        Thread thread = new Thread(ProcessJobs);
        thread.Start();

        // block main thread
        Console.ReadLine();

        // stop the timer
        timer.Change(Timeout.Infinite, Timeout.Infinite);

        // add 'null' job and wait until ProcessJobs has finished
        EnqueueJob(null);
        thread.Join();
    }

I hope this implicitly answers all your questions :-)

Rules of thumb

  • Start a thread by specifying a method that has access to all necessary data structures

  • When accessing data structures from multiple threads, you need to lock the data structures

    • In most cases the lock statement will do
    • Use a ReaderWriterLockSlim if there are many threads reading from a data structure that is infrequently changed.
    • You don't need a lock if the data structure is immutable.
  • When multiple threads depend on each other (e.g., a thread waiting for another thread to complete a task) use signals

  • Do not use Thread.Abort, Thread.Interrupt, Thread.Resume, Thread.Sleep, Thread.Suspend, Monitor.Pulse, Monitor.Wait
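To illustrate the ReaderWriterLockSlim rule above, here is a small sketch of a read-mostly data structure. SettingsStore is a hypothetical class, not something from the program in the question:

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical settings store: read by many threads, changed rarely.
class SettingsStore
{
    private readonly Dictionary<string, string> values = new Dictionary<string, string>();
    private readonly ReaderWriterLockSlim rwLock = new ReaderWriterLockSlim();

    // Any number of threads may hold the read lock at the same time.
    public bool TryGet(string key, out string value)
    {
        rwLock.EnterReadLock();
        try
        {
            return values.TryGetValue(key, out value);
        }
        finally
        {
            rwLock.ExitReadLock();
        }
    }

    // The write lock is exclusive: it waits for readers to leave
    // and keeps new readers out until the update is done.
    public void Set(string key, string value)
    {
        rwLock.EnterWriteLock();
        try
        {
            values[key] = value;
        }
        finally
        {
            rwLock.ExitWriteLock();
        }
    }
}
```

For a structure with roughly as many writes as reads, the plain lock statement is usually the simpler choice.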

dtb
+1 for the **thread.Join()** I have seen lots of people forgetting that. Well explained.
KMan
I'd make start and stop explicit methods instead of using a null signal. Otherwise, very good answer.
Isaac Cambron
Wow, many Thanks! Reading through this now, but that looks very very helpful!
Michael Stum
Thread.Abort is kinda harsh, I guess? (The equivalent of killing a Task from the Task Manager?) How would I end my WebServer thread gracefully, which normally doesn't listen to the JobQueue? (It does interact with some other parts of the application, but it mostly just runs along and does its own thing)
Michael Stum
Literate programming in C# … +1.
Konrad Rudolph