In C#, how would one go about spawning multiple threads and then sequentially adding results to a list before returning the entire result set?

What are some best practices?

So far I'm using a ManualResetEvent to signal when the last element has been processed by a thread.

But when it returns, I need to have them consolidate the result sets in sequential order so that we don't get into contention issues with the return value list (total results).

+1  A: 

If you know the final order before you spawn the threads (which your "sequentially" implies), you could pass an index into each thread, and have it write its results into that "slot" in an array. Thus when all threads have completed processing (in any order), the results will already be ordered correctly, avoiding the need for a post-processing sort entirely.
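
A minimal sketch of the slot idea, restricted to APIs available in .NET 3.5. The names SlotArrayExample, RunAll and Process are hypothetical, and Process simply squares its input as a stand-in for the real per-thread work.

```csharp
using System;
using System.Threading;

static class SlotArrayExample
{
    // Hypothetical work item: stands in for whatever each thread really computes.
    static int Process(int input)
    {
        return input * input;
    }

    public static int[] RunAll(int[] inputs)
    {
        // The array is allocated once, before any thread starts, and never resized.
        int[] results = new int[inputs.Length];
        Thread[] threads = new Thread[inputs.Length];

        for (int i = 0; i < inputs.Length; i++)
        {
            int slot = i; // copy the loop variable so each delegate captures its own index
            threads[i] = new Thread(delegate() { results[slot] = Process(inputs[slot]); });
            threads[i].Start();
        }

        // Join blocks until each thread has finished, so every slot is filled on return.
        foreach (Thread t in threads)
        {
            t.Join();
        }
        return results;
    }

    static void Main()
    {
        int[] results = RunAll(new int[] { 1, 2, 3, 4 });
        // Results come back in input order regardless of which thread finished first.
        Console.WriteLine(string.Join(", ", Array.ConvertAll(results, r => r.ToString())));
    }
}
```

Because each thread writes only to its own slot and the main thread reads only after Join, no lock is needed around the array in this sketch. One thread per item is used here for simplicity; for many items a thread pool would likely be a better fit.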

Jason Williams
You are still going to run into contention issues when getting a reference to the array for the slot.
LB
@LB - there is no contention if each thread has a dedicated slot. The array itself (as opposed to the items it references) is created before the threads run, and destroyed afterwards, and is not resized or replaced during the processing. The items are created by the threads, added to the array, and then used (after processing is complete) by the 'main' thread, so again, there is no contention. Obviously there are better ways in .NET 4... but the OP specifies .NET 3.5
Jason Williams
+1  A: 

The Task Parallel Library, which is now part of the Reactive Extensions for .NET Framework, makes stuff like this trivial. There's a set of Parallel constructs for parallelizing your code, and a set of thread-safe Concurrent{Container}s which you can use with them.

Here's an example of squaring a bunch of numbers, using a Parallel.For and a ConcurrentBag to store the results.

using System.Threading.Tasks;
using System.Collections.Concurrent;

namespace ParallelTest
{
    class Program
    {
        static void Main(string[] args)
        {
            var results = new ConcurrentBag<int>();
            Parallel.For(0, 10, i =>
            {
                results.Add(i * i);
            });
            foreach (int i in results)
                System.Console.WriteLine(i);
        }
    }
}

The ConcurrentBag is a regular IEnumerable; as you can see, I'm using a regular, non-parallel foreach to print out the results at the end.

Note: All this stuff is actually standard in .NET 4.0; you just need Rx if you want it for .NET 3.5.
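
One caveat worth spelling out: a ConcurrentBag is an unordered collection, so the squares above are not guaranteed to print in index order. If the ordering asked about in the question matters, one possible variant (a sketch, not the only way) is to have each iteration write into its own index of a pre-sized array. The names OrderedParallelExample and Squares are hypothetical.

```csharp
using System;
using System.Threading.Tasks;

static class OrderedParallelExample
{
    public static int[] Squares(int count)
    {
        var results = new int[count];

        // Each iteration writes only to its own index, so no locking is needed
        // and the output order does not depend on scheduling.
        Parallel.For(0, count, i =>
        {
            results[i] = i * i;
        });

        // Parallel.For returns only after all iterations have completed.
        return results;
    }

    static void Main()
    {
        foreach (int value in Squares(10))
            Console.WriteLine(value);
    }
}
```

This trades the convenience of a growable concurrent collection for a fixed-size array, which only works when the number of results is known up front.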

tzaman
Just about to suggest the TPL too, but I would argue that it's now part of the .NET framework NOT the **Rx** extensions, it was however part of the Parallel **FX** extensions ;)
ntziolis
@ntziolis: If you want it for 3.5, you need the Rx extensions. It includes a backport of TPL and PLINQ.
Reed Copsey
Too bad I cannot use RX.
LB
@Reed - Good to know, until now (well and then again the last project in 3.5 is already quite a while ago) I have used the CTP of the TPL, I didn't know that the Rx TPL is actually a .NET 4 backport, thx for the tip!
ntziolis
+1  A: 

If you're using .NET 4 you can, for example, use the Task class. Here's an example that merges the lists returned by two tasks:

Task<List<string>> task1 = new Task<List<string>>(SomeFunction);
Task<List<string>> task2 = new Task<List<string>>(SomeFunction);
task1.Start();
task2.Start();

var taskList = new List<Task<List<string>>> {task1, task2};

Task.WaitAll(taskList.ToArray());

List<string> res = new List<string>();
foreach (Task<List<string>> t in taskList)
{
    res.AddRange(t.Result);
}

and your function

List<string> SomeFunction()
{
    return new List<string>{"1","2"};
}
Mikael Svenson
A: 

As you launch each thread, pass it your sequence identifier and a callback method, and increment a counter that indicates the total number of running threads. When each thread finishes, it invokes the callback method, which decrements the running thread count and inserts the results into a SortedDictionary, keyed by the sequence id. When the last thread is finished, the callback can signal your main routine.
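
A rough sketch of this callback approach using only .NET 3.5 APIs. The class and member names (OrderedCollector, Process, OnCompleted, Run) are hypothetical. It also deviates from the description in one respect: the running count is set once before any thread starts, rather than incremented per launch, so that an early finisher cannot bring the count to zero while later threads are still being started.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class OrderedCollector
{
    private readonly object _gate = new object();
    private readonly SortedDictionary<int, string> _results = new SortedDictionary<int, string>();
    private readonly ManualResetEvent _allDone = new ManualResetEvent(false);
    private int _running;

    // Hypothetical work: stands in for whatever each thread computes.
    private static string Process(int sequenceId)
    {
        return "result " + sequenceId;
    }

    // Callback invoked by each worker when it finishes.
    private void OnCompleted(int sequenceId, string result)
    {
        lock (_gate)
        {
            _results[sequenceId] = result; // SortedDictionary keeps entries ordered by key
            _running--;
            if (_running == 0) _allDone.Set(); // the last thread signals the main routine
        }
    }

    // Intended for a single call per instance; the event is never reset.
    public List<string> Run(int threadCount)
    {
        if (threadCount == 0) return new List<string>();

        _running = threadCount; // set before launching, see the note above
        for (int i = 0; i < threadCount; i++)
        {
            int sequenceId = i; // copy so each delegate captures its own id
            new Thread(delegate() { OnCompleted(sequenceId, Process(sequenceId)); }).Start();
        }

        _allDone.WaitOne();
        lock (_gate)
        {
            return new List<string>(_results.Values); // values enumerate in key order
        }
    }

    static void Main()
    {
        foreach (string s in new OrderedCollector().Run(4))
            Console.WriteLine(s);
    }
}
```

The lock serializes inserts into the dictionary, so threads do contend briefly when they finish; the per-thread work itself runs without any shared state.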

ebpower