I have an IronPython script that uses the TPL and Parallel.ForEach to process files on multiple threads. In C# I can use Interlocked.Add and Interlocked.Increment to change global variables in an atomic, thread-safe operation, but this does not work in IronPython because integers are immutable. I currently have a simple Results class that stores a few variables as static members, which keep track of the results from a multi-threaded operation. When changing multiple values I can lock the class using the .NET Monitor class to ensure that the update is thread-safe, but this seems like a lot of overhead if I only want to update a single variable (for example, just incrementing Results.Files).

Is there a better way to increment a single static member variable such as Results.Files in IronPython in a thread-safe or atomic way, similar to how Interlocked.Increment works? Alternatively, are there any thread-safe counters built into Python or the .NET Framework that could be used instead of a basic integer?

from System.Threading import Monitor

class Results:
    Files = 0
    Lines = 0
    Tolkens = 0

    @staticmethod
    def Add(intFiles, intLines, intTolkens):
        # Use the System.Threading.Monitor class to ensure the addition is thread-safe
        Monitor.Enter(Results)
        try:
            Results.Files += intFiles
            Results.Lines += intLines
            Results.Tolkens += intTolkens
        finally:
            Monitor.Exit(Results)  # Always release the lock, even if an update raises
A: 

In the past, I have divided the work up for parallel processing, stored the results with each unit of work, and then collated them at the end. Think Map/Reduce and you have it.

Change the Add method to put each result into a queue as a tuple.

Then create a new thread that gobbles up the tuples as they come in (or wait until everything is done and sum them in one pass). Either way, this summation step should be the only thing that reads from the queue and increments the counters.
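A minimal sketch of that idea, assuming a standard-library queue and plain Python threads rather than Parallel.ForEach. The names add, collate, run_demo and the per-unit numbers are hypothetical and only there for illustration. Workers never touch the totals; a single collator thread is the only code that reads the queue and updates the counters, so no lock is needed around the additions.

```python
import threading

try:
    import queue              # Python 3
except ImportError:
    import Queue as queue     # Python 2 / IronPython 2.7

results_queue = queue.Queue()
_DONE = object()  # sentinel telling the collator that no more results will arrive


def add(files, lines, tokens):
    """Called by workers: hand the result over instead of updating shared state."""
    results_queue.put((files, lines, tokens))


def collate(totals):
    """Runs on one thread only, so the += updates below never race."""
    while True:
        item = results_queue.get()
        if item is _DONE:
            break
        files, lines, tokens = item
        totals["files"] += files
        totals["lines"] += lines
        totals["tokens"] += tokens


def run_demo(worker_count=4, units_per_worker=100):
    """Hypothetical driver: each unit of work reports 1 file, 10 lines, 50 tokens."""
    totals = {"files": 0, "lines": 0, "tokens": 0}
    collator = threading.Thread(target=collate, args=(totals,))
    collator.start()

    def worker():
        for _ in range(units_per_worker):
            add(1, 10, 50)

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

    results_queue.put(_DONE)  # all workers finished, let the collator exit
    collator.join()
    return totals
```

The trade-off is that the totals are only guaranteed to be complete after the collator has drained the queue, so this fits end-of-run reporting better than a live counter.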

Hopefully this helps.

Jacob

TheJacobTaylor
I actually did end up using a list to store intermediate results by calling the thread-safe .append() method and then calling sum() or len() to get the aggregate value (see CallCount in the LogWriter class of the linked script), but that seemed like a huge hack and doesn't scale well to large amounts of data or a large number of variables. There has to be a more Pythonic way of implementing a simple counter in a thread-safe manner.
Greg Bray
A: 

If you are willing to use a bit of C#, you can create a simple reusable C# class that encapsulates a (non-static) int member variable and exposes the Interlocked functions.

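using System.Threading;

// Must be public so that it is visible from IronPython
public class InterlockedWrapper
{
     private int _value;

     // Atomically increments the wrapped value and returns the new value
     public int Increment()
     {
          return Interlocked.Increment(ref _value);
     }
....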

and so on. You can then use this class from Python.

oefe
I thought of that, but when I saw that Interlocked took a ref argument I looked for a way to create a reference in IronPython and found a solution. Thanks for the post though!
Greg Bray
A: 

Well, it looks like the Python way to do this would be to use a multiprocessing.Value object, which by default locks the object whenever it is accessed. Sadly, the multiprocessing module is not built into IronPython since it is based on ctypes. I did, however, find a way to do it using the Interlocked class and a reference to a CLR object:

import clr
from System.Threading import Interlocked
refInt = clr.Reference[int](5) #Create a reference to an integer (IronPython uses [] for generic type arguments)
#refInt = <System.Int32 object at 0x0000000000000049 [5]>
#refInt.Value = 5
Interlocked.Increment(refInt) #Returns 6 and refInt now points to a new integer
#refInt = <System.Int32 object at 0x000000000000004A [6]>
#refInt.Value = 6

With this approach you can use all of the Interlocked methods to add, compare, exchange, increment, and read the refInt object. You can also get or set refInt.Value directly, but only the Interlocked methods are thread-safe. Also, the Interlocked methods will NOT throw an overflow exception (the value just wraps silently), so make sure you choose a data type that is large enough to never overflow.
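For comparison, here is a rough pure-Python sketch of a counter that does not depend on the CLR at all. It guards a plain integer with threading.Lock, so it is lock-based rather than lock-free like Interlocked, and Python integers do not wrap on overflow. The names LockedCounter and count_in_parallel are hypothetical, not part of any library.

```python
import threading


class LockedCounter(object):
    """Hypothetical helper: an integer whose updates are serialized by a lock."""

    def __init__(self, initial=0):
        self._value = initial
        self._lock = threading.Lock()

    def increment(self, amount=1):
        # The read-modify-write happens entirely inside the lock.
        with self._lock:
            self._value += amount
            return self._value

    @property
    def value(self):
        with self._lock:
            return self._value


def count_in_parallel(thread_count=8, increments_per_thread=1000):
    """Increment one shared counter from several threads and return the total."""
    counter = LockedCounter()

    def worker():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Like Interlocked.Increment, increment() returns the new value, so a caller can use the result without a second, separate read.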

Greg Bray
Just to clarify: in the original example you would replace the Files, Lines, and Tolkens variables with references such as refFiles = clr.Reference[int](0) and refLines = clr.Reference[int](0), then use the Interlocked class whenever you want to change a value and read refLines.Value (for example) whenever you want to access a counter.
Greg Bray