views:

216

answers:

5

I have an application that will be spawning multiple threads. However, I feel there might be an issue with threads accessing data that they shouldn't be. I'm relatively new to threading so bare with me.

Here is the structure of the threaded application (sorry for the crudeness):

                   MainThread
                  /          \
                 /            \
                /              \
        Thread A               Thread B
       /        \              /       \
      /          \            /         \
     /            \          /           \
Thread A_1     Thread A_2   Thread B_1    Thread B_2

Under each lettered thread (which could be many), there will only be two threads and they are fired of sequentially. The issue i'm having is I'm not entirely sure how to pass in a datastructure into these threads.

So, the datastructure is created in MainThread, will be modified in the lettered thread (Thread A, etc) specific to that thread and then a member variable from that datastructure is sent to Letter_Numbered threads.

Currently, the lettered thread class has a member variable and when the class is constructed, the datastructure from mainthread is passed in by reference, invoking the copy constructor so the lettered thread has it's own copy to play with.

The lettered_numbered thread simply takes in a string variable from the data structure within the lettered thread. My question is, is this accceptable? Is there a much better way to ensure each lettered thread gets its own data structure to play with?

Sorry for the somewhat poor explanation, please leave comments and i'll try to clarify.

EDIT: So my lettered thread constructor should take the VALUE of the data structure, not the reference?

+4  A: 

I would have each thread create it's own copy of the datastructure, e.g. you pass the structure in the constructor and then explicitly create a local copy. Then you are guaranteed that the threads have distinct copies. (You say that it's passsed by reference, and that this invokes the copy constructor. I think you mean pass by value? I feel it's better to explicitly make a copy, to leave no doubt and to make your intent clear. Otherwise someone might later come along and change your pass by value to pass by reference as a "smart optimization".)

EDIT: Removed comment about strings. For some reason, I was assuming .NET.

To ensure strings are privately owned, follow the same procedure, create a copy of the string, which you can then freely modify.

mdma
Hmm, well we do pass by reference but it does invoke the copy constructor and seems to assign the member variable it's own copy. I haven't completely verified it though, might do that today.
Robb
Strings in C++ are not immutable.
stonemetal
Also, make sure it's a deep copy to ensure that all the data is thread local. If you can get away with not sharing anything between threads, then that's the safest and easiest way to do it.
Tom Dalling
Brainbrup - I was thinking of System.String. I've edited my response accordingly.
mdma
@Tom: Yes it is a deep copy, that is for sure.
Robb
Downvoters - please read my edited comment. I'm not talking about C++ strings being immutable, but System.String, which is immutable.
mdma
I'd just remove the System.String altogether, it's not part of C++.
GMan
+1  A: 

Have you looked at boost threads?

You would basically create a callable class that has a constructor that takes the parameters the thread is to work on and then launch the thread by passing objects of your callable class, initialized and ready to go.

This is very similar to how Java implements threads and it makes a good amount of sense most of the time from a design point of view.

sechastain
As much as I'd like to go the boost route, the project is currently in a state where that would require enough of a re-write where i'd be shot down by superiors.
Robb
well, even if boost isn't available - assuming you're comfortable launching threads via a member method of an object, the pattern still applies:create a thread class that has: - a constructor that requires all the data that the thread will need to work on - a start/run method that acts as the thread entry point and processing loopSo you create your instances of the thread objects and then start them by passing the objects' start/run/whatever method as the thread entry point.
sechastain
Yes, this is actually what I do, except using the pass by reference of the data structure.
Robb
+3  A: 

There is a pattern called Active Object Pattern wherein each object executes in its own thread. Frameworks like ACE support this. If you have access to such frameworks, you should use those. In any case, i would believe creating a new instance of an object and allowing it to exetute in its own thread is much cleaner that invoking the copy-constructor to make a copy of the object. Else see if you can fit a solution that uses Thread Local Storage.

Abhay
+1  A: 

You aparently are making a copy of the data for each trhead and everything works? then no problem.

Here are some additional thoughts:

  • If data is read only, you can share a single struct and everything will be ok, as long as each read is small and fast (basic types)
  • If data needs to be written, but "private" (or contained) to each thread, then send a copy to each thread (what you are doing). Caveat: I assume the data is not too big and a copy does not eat to much resources.
  • If the data needs to be written and the new values shared between threads, then you need to think about it (read on it) and create a proper design. I like a transactional object to centralize each threads read/write operation. Like a tiny database in memory. Check on thread mutex, semaphores and critical sections). Dealing with huge data set I have used a database to centralize requests (See ODBM). You can also check existing messaging queuing libraries (like MSMQ) to have data change ordered and synchronized.

Hope this helps.

The GG
Luckily the data struct won't really be shared between the threads, other than described above. So if the data structure is contained within each thread (and each thread has its OWN data struct) there shouldn't be much use for mutexes.
Robb
A: 

It seems unlikely that you would want each thread to operate on the data and then not at least occasionally have another thread react to what another thread has done to another thread's work on the data. If you are truly independent meaning that no other thread truly will ever care about work that another thread has done, then I suggest making a copy of the data, otherwise in the case where you will want to do work in one thread and make that result of that work available to another thread I would suggest that you, pass a reference/pointer to the object around and then protect access to it via locks so that the threads can work with it, properly, I suggest a multi-read, single writer lock implementation.

NSA