views:

712

answers:

2

I am noticing some strange behaviour with BackgroundWorkers and the events they're firing where the events seem to be queuing up in one thread while the CPU isn't actually being used.

Basically the design of the system is that, based on user interaction, a thread is created to send off a web request to fetch some data. Based on the results, it may fire off many other asynchronous requests, using BackgroundWorkers for each of these. I am doing this because the code that manages the requests uses a lock to ensure only one request is being sent at a time (to avoid spamming the server with multiple simultaneous requests, possibly leading to the server ignoring / blocking them). There may be a better design to this, which I'd love to hear (I'm relatively new to C# / Windows Forms programming and could use the advice). However, regardless of the design changes I'm interested to learn what's causing the behaviour I'm seeing.

I've written a relatively simple test app to demonstrate the issue. It's basically just a form with a button and a text box to display the results (you could probably do it without the form and just display results on the console, but I did it this way to replicate what my actual app does). Here's the code:

delegate void AddToLogCallback(string str);

private void AddToLog(string str)
{
    if(textBox1.InvokeRequired)
    {
        AddToLogCallback callback = new AddToLogCallback(AddToLog);
        Invoke(callback, new object[] { str });
    }
    else
    {
        textBox1.Text += DateTime.Now.ToString() + "   " + str + System.Environment.NewLine;
        textBox1.Select(textBox1.Text.Length, 0);
        textBox1.ScrollToCaret();
    }
}

private void Progress(object sender, ProgressChangedEventArgs args)
{
    AddToLog(args.UserState.ToString());
}

private void Completed(object sender, RunWorkerCompletedEventArgs args)
{
    AddToLog(args.Result.ToString());
}

private void DoWork(object sender, DoWorkEventArgs args)
{
    BackgroundWorker worker = sender as BackgroundWorker;
    lock (typeof(Form1)) // Ensure only a single request at a time
    {
        worker.ReportProgress(0, "Start");
        Thread.Sleep(2000); // Simulate waiting on the request
        worker.ReportProgress(50, "Middle");
        Thread.Sleep(2000); // Simulate handling the response from the request
        worker.ReportProgress(100, "End");
        args.Result = args.Argument;
    }
}

private void button1_Click(object sender, EventArgs e)
{
    Thread thread = new Thread(RunMe);
    thread.Start();
}

private void RunMe()
{
    for(int i=0; i < 20; i++)
    {
        AddToLog("Starting " + i.ToString());
        BackgroundWorker worker = new BackgroundWorker();
        worker.WorkerReportsProgress = true;
        worker.DoWork += DoWork;
        worker.RunWorkerCompleted += Completed;
        worker.ProgressChanged += Progress;
        worker.RunWorkerAsync(i);
    }
}

Here are the results I'm getting back:

30/07/2009 2:43:22 PM   Starting 0
30/07/2009 2:43:22 PM   Starting 1
<snip>
30/07/2009 2:43:22 PM   Starting 18
30/07/2009 2:43:22 PM   Starting 19
30/07/2009 2:43:23 PM   Start
30/07/2009 2:43:36 PM   Middle
30/07/2009 2:43:36 PM   End
30/07/2009 2:43:36 PM   0
30/07/2009 2:43:36 PM   Start
30/07/2009 2:43:36 PM   Middle
30/07/2009 2:43:36 PM   End
30/07/2009 2:43:36 PM   1
30/07/2009 2:43:36 PM   Start
30/07/2009 2:43:36 PM   Middle
30/07/2009 2:43:36 PM   End
30/07/2009 2:43:36 PM   8
30/07/2009 2:43:36 PM   Start
30/07/2009 2:43:36 PM   Middle
30/07/2009 2:43:38 PM   13
30/07/2009 2:43:38 PM   End
30/07/2009 2:43:38 PM   Start
30/07/2009 2:43:40 PM   Middle
30/07/2009 2:43:42 PM   18
30/07/2009 2:43:42 PM   Start
30/07/2009 2:43:42 PM   End
30/07/2009 2:43:44 PM   Middle
30/07/2009 2:43:46 PM   End
30/07/2009 2:43:46 PM   2
30/07/2009 2:43:46 PM   Start
30/07/2009 2:43:48 PM   Middle

As you can see, there is a 13 second delay after the first 'Start' message is displayed, after which it then processes ~15 messages (despite there being a 2s delay between most of them being fired).

Anyone know what's going on?

+1  A: 

EDIT: Okay, I'm starting from scratch. Here's a short but complete console app which shows the issue. It logs the time of the message and the thread it's on:

using System;
using System.Threading;
using System.ComponentModel;

class Test
{
    static void Main()
    {
        for(int i=0; i < 20; i++)
        {
            Log("Starting " + i);
            BackgroundWorker worker = new BackgroundWorker();
            worker.WorkerReportsProgress = true;
            worker.DoWork += DoWork;
            worker.RunWorkerCompleted += Completed;
            worker.ProgressChanged += Progress;
            worker.RunWorkerAsync(i);
        }
        Console.ReadLine();
    }

    static void Log(object o)
    {
        Console.WriteLine("{0:HH:mm:ss.fff} : {1} : {2}",
            DateTime.Now, Thread.CurrentThread.ManagedThreadId, o);
    }

    private static void Progress(object sender,
                                 ProgressChangedEventArgs args)
    {
        Log(args.UserState);
    }

    private static void Completed(object sender,
                                  RunWorkerCompletedEventArgs args)
    {
        Log(args.Result);
    }

    private static void DoWork(object sender, DoWorkEventArgs args)
    {
        BackgroundWorker worker = (BackgroundWorker) sender;
        Log("Worker " + args.Argument + " started");
        lock (typeof(Test)) // Ensure only a single request at a time
        {
            worker.ReportProgress(0, "Start");
            Thread.Sleep(2000); // Simulate waiting on the request
            worker.ReportProgress(50, "Middle");
            Thread.Sleep(2000); // Simulate handling the response
            worker.ReportProgress(100, "End");
            args.Result = args.Argument;
        }
    }
}

Sample output:

14:51:35.323 : 1 : Starting 0
14:51:35.328 : 1 : Starting 1
14:51:35.330 : 1 : Starting 2
14:51:35.330 : 3 : Worker 0 started
14:51:35.334 : 4 : Worker 1 started
14:51:35.332 : 1 : Starting 3
14:51:35.337 : 1 : Starting 4
14:51:35.339 : 1 : Starting 5
14:51:35.340 : 1 : Starting 6
14:51:35.342 : 1 : Starting 7
14:51:35.343 : 1 : Starting 8
14:51:35.345 : 1 : Starting 9
14:51:35.346 : 1 : Starting 10
14:51:35.350 : 1 : Starting 11
14:51:35.351 : 1 : Starting 12
14:51:35.353 : 1 : Starting 13
14:51:35.355 : 1 : Starting 14
14:51:35.356 : 1 : Starting 15
14:51:35.358 : 1 : Starting 16
14:51:35.359 : 1 : Starting 17
14:51:35.361 : 1 : Starting 18
14:51:35.363 : 1 : Starting 19
14:51:36.334 : 5 : Worker 2 started
14:51:36.834 : 6 : Start
14:51:36.835 : 6 : Worker 3 started
14:51:37.334 : 7 : Worker 4 started
14:51:37.834 : 8 : Worker 5 started
14:51:38.334 : 9 : Worker 6 started
14:51:38.836 : 10 : Worker 7 started
14:51:39.334 : 3 : Worker 8 started
14:51:39.335 : 11 : Worker 9 started
14:51:40.335 : 12 : Worker 10 started
14:51:41.335 : 13 : Worker 11 started
14:51:42.335 : 14 : Worker 12 started
14:51:43.334 : 4 : Worker 13 started
14:51:44.335 : 15 : Worker 14 started
14:51:45.336 : 16 : Worker 15 started
14:51:46.335 : 17 : Worker 16 started
14:51:47.334 : 5 : Worker 17 started
14:51:48.335 : 18 : Worker 18 started
14:51:49.335 : 19 : Worker 19 started
14:51:50.335 : 20 : Middle
14:51:50.336 : 20 : End
14:51:50.337 : 20 : Start
14:51:50.339 : 20 : 0
14:51:50.341 : 20 : Middle
14:51:50.343 : 20 : End
14:51:50.344 : 20 : 1
14:51:50.346 : 20 : Start
14:51:50.348 : 20 : Middle
14:51:50.349 : 20 : End
14:51:50.351 : 20 : 2
14:51:50.352 : 20 : Start
14:51:50.354 : 20 : Middle
14:51:51.334 : 6 : End
14:51:51.335 : 6 : Start
14:51:51.334 : 20 : 3
14:51:53.334 : 20 : Middle

(etc)

Now trying to work out what's going on... but it's important to note that the worker threads are starting 1 second apart.

EDIT: Further investigation: If I call ThreadPool.SetMinThreads(500, 500) then even on my Vista box, it shows the workers all starting basically together.

What happens on your box if you try the above program, with and without the call to SetMinThreads? If it helps in this case but not your real program, could you produce a similarly short but complete program which shows it still being a problem even with a SetMinThreads call?


I believe I understand it. I think ReportProgress is adding a new ThreadPool task to process the message... and at the same time, you're busy adding 20 tasks to the thread pool. Now the thing about the thread pool is that if there aren't enough threads available to service a request as soon as it arrives, the pool waits for half a second before creating a new thread. This is to avoid creating a huge bunch of threads for a set of requests which could easily be processed on one thread if you'd just wait for an existing task to finish.

So for 10 seconds, you're just adding tasks to a long queue and creating a new thread every half second. The 20 "main" tasks are all relatively long ones, whereas the ReportProgress tasks are very short - so as soon as you've got enough threads to be processing all the long-running requests and a single short one, you're away and all the messages come through quickly.

If you add a call to

ThreadPool.SetMaxThreads(50, 50);

before all this starts, you'll see it behave as you expect it to. I'm not suggesting you should necessarily do this for your actual application, but just to show the difference. This creates a bunch of threads in the pool to start with, just waiting for requests.

One comment on your design: you've got 20 different tasks on different threads, but only one of them can actually occur at a time (due to the lock). You're effectively serializing the requests anyway, so why use multiple threads? I'm hoping your real application doesn't have this problem.

Jon Skeet
I tried setting the thread pool as suggested however it still has the same issue (ie even tried up to 200 max threads of each and it still showed the same symptoms). As I understood it though the ReportProgress events (along with the others) are executed in the thread that created the BackgroundWorker - is this not the case?As for the main application, the idea is that these requests can come in from multiple threads. Ie, threads A, B and C are generating asynchronous web requests through a WebRequestManager class which then serialises them.
Andrew
@Andrew: Interesting. Which version of .NET are you using, and which operating system? It makes a complete difference for me in .NET 3.5 on XP. Although theProgressChanged handler is indeed executed on the WinForms thread, the ReportProgress method itself is asynchronous, and I believe it uses a ThreadPool task to avoid blocking while it waits for the WinForms thread to handle the progress changed event.
Jon Skeet
I'm not sure why creating the extra threads upfront isn't working for you, but I'm reasonably sure that's what's going on.
Jon Skeet
I've tried the same application in .NET 3.5 on Windows Server 2003 (work machine) and on Vista Ultimate 64 bit (home). They both exhibit exactly the same delays. I also just tried .NET 2.0 on Windows Server 2003 and it still has delays. Your comments regarding the ReportProgress method sound correct, but as before, I've tried increasing the ThreadPool size and still no success, so I'm not quite sure what else to do with it...
Andrew
Hmm. I'll try to reproduce it on my Vista box tomorrow, and see what I can do.
Jon Skeet
I just tried retrieving the number of available threads in during DoWork, and there are still 180+ worker threads available the entire time (after I've set the max to 200), so it definitely doesn't seem to be running out of threads to process messages.
Andrew
Finally got a chance to test your latest suggestion - using SetMinThreads works for me as well. So I guess the ThreadPool only has a certain number of threads available to start with, and there appears to be a throttle of some kind for creating new threads, whereas if you force their creation from the start with SetMinThreads there's no delay. Thanks very much for working this one out for me. I've since switched the code over to use a queue for processing the requests, but I always like knowing the cause of things like this, even if they're no longer an issue for me.
Andrew
I'm cross with myself now... I'm pretty sure SetMaxThreads was a typo from the start... I suspect I meant SetMinThreads right from the first post, and that that's what had worked for me...
Jon Skeet
+1  A: 

The BackgroundWorker class will issue it's callbacks on the creating thread, this is extremely handy for UI tasks as you don't need to do the extra check on InvokeRequired followed by Invoke() or BeginInvoke().

The downside is that if your creating code is blocking or in a tight loop your callbacks are queued up.

The solution is to manage your threads yourself. You've already shown that you know how to create a thread manually, although you might want to look at the ThreadPool topic for more information on doing it.

UPDATE: here's a working sample based on feedback that uses a Queue and a custom SingletonWorker thread.

public partial class Form1 : Form
{
    public Form1()
    {
        InitializeComponent();
        SingletonWorker.ProgressHandler = Progress;
        SingletonWorker.CompleteHandler = Completed;
    }
    private void button1_Click( object sender, EventArgs e )
    {
        // this is based on an app requirement, seems odd but I'm sure there's a reason :)
        Thread thread = new Thread( AddTasks );
        thread.Start();
    }
    private void AddTasks()
    {
        for ( int i = 0; i < 5; i++ )
        {
            AddToLog( "Creating Task " + i );
            SingletonWorker.AddTask( new Task { NumberToWorkOn = i } );
        }
    }
    private void AddToLog( string message )
    {
        if( textBox1.InvokeRequired )
        {
            textBox1.Invoke( new Action<string>( AddToLog ), message );
            return;
        }
        textBox1.Text += DateTime.Now + "   " + message + System.Environment.NewLine;
        textBox1.Select( textBox1.Text.Length, 0 );
        textBox1.ScrollToCaret();
    }
    private void Progress( string message, int percentComplete )
    {
        AddToLog( String.Format( "{0}%, {1}", percentComplete, message ) );
    }
    private void Completed( string message )
    {
        AddToLog( message );
    }
}
public class Task
{
    public int NumberToWorkOn { get; set; }
}
public static class SingletonWorker
{
    private static readonly Thread Worker;
    private static readonly Queue<Task> Tasks;
    // assume params are 'message' and 'percent complete'
    // also assume only one listener, otherwise use events
    public static Action<string, int> ProgressHandler;
    public static Action<string> CompleteHandler;
    static SingletonWorker()
    {
        Worker = new Thread( Start );
        Tasks = new Queue<Task>();
        Worker.Start();
    }
    private static Task GetNextTask()
    {
        lock( Tasks )
        {
            if ( Tasks.Count > 0 )
                return Tasks.Dequeue();

            return null;
        }
    }
    public static void AddTask( Task task )
    {
        lock( Tasks )
        {
            Tasks.Enqueue( task );
        }
    }
    private static void Start()
    {
        while( true )
        {
            Task task = GetNextTask();
            if( task == null )
            {
                // sleep for 500ms waiting for another item to be enqueued
                Thread.Sleep( 500 );
            }
            else
            {
                // work on it
                ProgressHandler( "Starting on " + task.NumberToWorkOn, 0 );
                Thread.Sleep( 1000 );
                ProgressHandler( "Almost done with " + task.NumberToWorkOn, 50 );
                Thread.Sleep( 1000 );
                CompleteHandler( "Finished with " + task.NumberToWorkOn );
            }
        }
    }
}
Timothy Walters
Thanks. "The downside is that if your creating code is blocking or in a tight loop your callbacks are queued up" this is the part that's confusing me though. The thread that created the BackgroundWorker objects (RunMe) has already finished executing, yet it still has a large delay before it processes the events being returned. Maybe as you say I should just give up on the BackgroundWorker class and write my own asynchronous code...
Andrew
Andrew, I should probably clarify that everything except the "DoWork()" call will be on the creating thread (your UI thread in this case). Have you tried putting that loop directly into the OnClick handler? Then you can get rid of all the InvokeRequired checks since the Progress and Complete methods will be invoked onto the UI thread for you. You're causing some double-hops by adding a thread around the creation of the workers.
Timothy Walters
I've coded up a solution that still uses BackgroundWorker, but the internet filter here at work is being a pain so I'll have to send it from home, the short version is don't create a thread, don't use invoke, as BackgroundWorker handles all that for you, also think carefully about if you really need that lock() statement in DoWork().
Timothy Walters
Unfortunately the thread that generates the BackgroundWorkers is necessary - this is replicating what my main application does, and it must generate these workers from the separate thread. As for the lock, again, it is definitely required - the code in DoWork must be serialised. I tried writing my own BackgroundWorker class which does almost everything the same, with the exception that the events are processed on the BackgroundWorker thread (not ideal, but not critical). This works without any delays so I can use it, but I'm still curious about the issue with BackgroundWorker.
Andrew
BackgroundWorker is no good for what you need, as it's designed for tasks that need to interact with the UI, and relies on being created by the UI thread (unless you could Invoke back to the UI to create each BackgroundWorker?).ThreadPool isn't recommended for code that blocks, so avoid that too.Try using a Queue to store the task information and your own 'worker' thread, it simply does it's task and looks for another when done, sleeping for 500ms or whatever till another task is ready if needed.
Timothy Walters
Yeah, I'm beginning to think that. I was going to start looking into having a thread that would just use a queue and process the requests on its own, rather than having so many threads being created when really they're not being used as independent threads. One of the things I'm looking into now is getting that thread to notify the caller threads when its finished processing (or during to replicate ReportProgress). I'm trying to decide whether it's better to work on inter-thread communication or just use something like ThreadPool.QueueUserWorkItem for handling the results. Thoughts?
Andrew