views:

112

answers:

5

I have a scenario where, at certain points in my program, a thread needs to update several shared data structures. Each data structure can be safely updated in parallel with any other data structure, but each data structure can only be updated by one thread at a time. The simple, naive way I've expressed this in my code is:

synchronized updateStructure1();
synchronized updateStructure2();
// ...

This seems inefficient because if multiple threads are trying to update structure 1, but no thread is trying to update structure 2, they'll all block waiting for the lock that protects structure 1, while the lock for structure 2 sits untaken.

Is there a "standard" way of remedying this? In other words, is there a standard threading primitive that tries to update all structures in a round-robin fashion, blocks only if all locks are taken, and returns when all structures are updated?

This is a somewhat language agnostic question, but in case it helps, the language I'm using is D.

+1  A: 

I don't know if there's a standard way to do this. However, I would implement this something like the following:

do
{
    if (!updatedA && mutexA.tryLock())
    {
        scope(exit) mutexA.unlock();
        updateA();
        updatedA = true;
    }

    if (!updatedB && mutexB.tryLock())
    {
        scope(exit) mutexB.unlock();
        updateB();
        updatedB = true;
    }
}
while (!(updatedA && updatedB));

Some clever metaprogramming could probably cut down the repetition, but I leave that as an exercise for you.

JSBangs
Interesting idea. One problem though is that this cause threads to spin while waiting for a lock. It's too bad more platforms didn't support something like Windows `WaitForMultipleObjects`
R Samuel Klatchko
@RSamuel, you could always add a brief sleep to the bottom of the loop.
JSBangs
A: 

To my knowledge, there is not a standard way to accomplish this, and you'll have to get your hands dirty.

To paraphrase your requirements, you have a set of data structures, and you need to do work on them, but not in any particular order. You only want to block waiting on a data structure if all other objects are blocked. Here's the pseudocode I would base my solution on:

work = unshared list of objects that need updating
while work is not empty:
    found = false
    for each obj in work:
        try locking obj
        if successful:
            remove obj from work
            found = true
            obj.update()
            unlock obj
    if !found:
        // Everything is locked, so we have to wait
        obj = randomly pick an object from work
        remove obj from work
        lock obj
        obj.update()
        unlock obj

An updating thread will only block if it finds that all objects it needs to use are locked. Then it must wait on something, so it just picks one and locks it. Ideally, it would pick the object that will be unlocked earliest, but there's no simple way of telling that.

Also, it's conceivable that an object might become free while the updater is in the try loop and so the updater would skip it. But if the amount of work you're doing is large enough, relative to the cost of iterating through that loop, the false conflict should be rare, and it would only matter in cases of extremely high contention.

Karmastan
+2  A: 

If your language supported lightweight threads or Actors, you could always have the updating thread spawn a new a new thread to change each object, where each thread just locks, modifies, and unlocks each object. Then have your updating thread join on all its child threads before returning. This punts the problem to the runtime's schedule, and it's free to schedule those child threads any way it can for best performance.

You could do this in langauges with heavier threads, but the spawn and join might have too much overhead (though thread pooling might mitigate some of this).

Karmastan
+1, excellent idea for cases where the updates take a long time. In this case, the updates are very quick, so even the overhead of submitting the job to a task pool would outweigh the benefit, but it's still an excellent idea for future reference.
dsimcha
A: 

I don't know any "standard" way of doing this, sorry. So this below is just a ThreadGroup, abstracted by a Swarm-class, that »hacks» at a job list until all are done, round-robin style, and makes sure that as many threads as possible are used. I don't know how to do this without a job list.

Disclaimer: I'm very new to D, and concurrency programming, so the code is rather amateurish. I saw this more as a fun exercise. (I'm too dealing with some concurrency stuff.) I also understand that this isn't quite what you're looking for. If anyone has any pointers I'd love to hear them!

import  core.thread,
        core.sync.mutex,
        std.c.stdio,
        std.stdio;

class Swarm{
    ThreadGroup group;
    Mutex mutex;
    auto numThreads = 1;
    void delegate ()[int] jobs;
    this(void delegate()[int] aJobs, int aNumThreads){
        jobs = aJobs;
        numThreads = aNumThreads;
        group = new ThreadGroup;
        mutex = new Mutex();
    }
    void runBlocking(){
        run();
        group.joinAll();
    }
    void run(){
        foreach(c;0..numThreads)
            group.create( &swarmJobs );
    }
    void swarmJobs(){
        void delegate () myJob;
        do{
            myJob = null;
            synchronized(mutex){
                if(jobs.length > 0)
                    foreach(i,job;jobs){
                        myJob = job;
                        jobs.remove(i);
                        break;
                    }
            }
            if(myJob)
                myJob();
        }while(myJob)
    }
}
class Jobs{
    void job1(){
        foreach(c;0..1000){
            foreach(j;0..2_000_000){}
            writef("1");
            fflush(core.stdc.stdio.stdout);
        }
    }
    void job2(){
        foreach(c;0..1000){
            foreach(j;0..1_000_000){}
            writef("2");
            fflush(core.stdc.stdio.stdout);
        }
    }
}
void main(){
    auto jobs = new Jobs();
    void delegate ()[int] jobsList = 
         [1:&jobs.job1,2:&jobs.job2,3:&jobs.job1,4:&jobs.job2];
    int numThreads = 2;
    auto swarm = new Swarm(jobsList,numThreads);
    swarm.runBlocking();
    writefln("end");
}
0scar
A: 

There's no standard solution but rather a class of standard solutions depending on your needs.

http://en.wikipedia.org/wiki/Scheduling_algorithm

Conrad Frix