views:

67

answers:

2

I just started playing with the Task Parallel Library, and ran into interesting issues; I have a general idea of what is going on, but would like to hear comments from people more competent than me to help understand what is happening. My apologies for the somewhat lengthy code.

I started with a non-parallel simulation of a random walk:

 var random = new Random();
 Stopwatch stopwatch = new Stopwatch();

 stopwatch.Start();

 var simulations = new List<int>();
 for (var run = 0; run < 20; run++)
 {
    var position = 0;
    for (var step = 0; step < 10000000; step++)
    {
       if (random.Next(0, 2) == 0)
       {
          position--;
       }
       else
       {
          position++;
       }
    }

    Console.WriteLine(string.Format("Terminated run {0} at position {1}.", run, position));
    simulations.Add(position);
 }

 Console.WriteLine(string.Format("Average position: {0} .", simulations.Average()));
 stopwatch.Stop();

 Console.WriteLine(string.Format("Time elapsed: {0}", stopwatch.ElapsedMilliseconds));
 Console.ReadLine();

I then wrote my first attempt at a parallel loop:

 var localRandom = new Random();

 stopwatch.Reset();
 stopwatch.Start();

 var parallelSimulations = new List<int>();
 Parallel.For(0, 20, run =>
 {
    var position = 0;
    for (var step = 0; step < 10000000; step++)
    {
       if (localRandom.Next(0, 2) == 0)
       {
          position--;
       }
       else
       {
          position++;
       }
    }

    Console.WriteLine(string.Format("Terminated run {0} at position {1}.", run, position));
    parallelSimulations.Add(position);
 });


 Console.WriteLine(string.Format("Average position: {0} .", parallelSimulations.Average()));
 stopwatch.Stop();

 Console.WriteLine(string.Format("Time elapsed: {0}", stopwatch.ElapsedMilliseconds));

 Console.ReadLine();

When I ran it on a virtual machine set to use 1 core only, I observed a similar duration, but the runs are no longer processed in order - no surprise.

When I ran it on a dual-core machine, things went odd. I saw no improvement in time, and observed some very weird results for each run. Most runs end up with results of -1,000,000, (or very close), which indicates that Random.Next is returning 0 quasi all the time.

When I make the random local to each loop, everything works just fine, and I get the expected duration improvement:

Parallel.For(0, 20, run =>
         {
            var localRandom = new Random();
            var position = 0; 

My guess is that the problem has to do with the fact that the Random object is shared between the loops, and has some state. The lack of improvement in duration in the "failing parallel" version is I assume due to that fact that the calls to Random are not processed in parallel (even though I see that the parallel version uses both cores, whereas the original doesn't). The piece I really don't get is why the simulation results are what they are.

One separate worry I have is that if I use Random instances local to each loop, I may run into the problem of having multiple loops starting with the same seed (the issue you get when you generate multiple Randoms too close in time, resulting in identical sequences).

Any insight in what is going on would be very valuable to me!

+2  A: 

The Random class is not thread-safe; if you use it on multiple threads, it can get messed up.

You should make a separate Random instance on each thread, and make sure that they don't end up using the same seed. (eg, Environment.TickCount * Thread.CurrentThread.ManagedThreadId)

SLaks
How would you go about the seed issue?
Mathias
Event if you go with this approach, which has problems, I would not use TickCount * ManageThreadId as this will produce seeds which are very close together. See my answer below for a better way of generating seeds.
Ade Miller
+2  A: 

Neither of these approaches will give you really good random numbers.

This blog post covers a lot of approaches for getting better random numbers with Random

http://blogs.msdn.com/b/pfxteam/archive/2009/02/19/9434171.aspx

These may be fine for many day to day applications.

However if you use the same random number generator on multiple threads even with different seeds you will still impact the quality of your random numbers. This is because you are generating sequences of pseudo-random numbers which may overlap.

This video explains why in a bit more detail:

http://software.intel.com/en-us/videos/tim-mattson-use-and-abuse-of-random-numbers/

If you want really random numbers then you really need to use the crypto random number generator System.Security.Cryptography.RNGCryptoServiceProvider. This is threadsafe.

Ade Miller
Ade, thank you for the pointer to the S. Toub article, it is excellent.
Mathias

related questions