Code:

#include "stdafx.h"  // must come first when precompiled headers are enabled
#include <iostream>
#include <boost/thread.hpp>
#include <boost/thread/mutex.hpp>

using namespace std;
boost::mutex mut;
double results[10];

void doubler(int x) {
//boost::mutex::scoped_lock lck(mut);
 results[x] = x*2;
}

int _tmain(int argc, _TCHAR* argv[])
{
 boost::thread_group thds;
 for (int x = 9; x >= 0; x--) {  // indices 0..9; x = 10 would write past the end of results
  boost::thread *Thread = new boost::thread(&doubler, x);
  thds.add_thread(Thread);
 }

 thds.join_all();

 for (int x = 0; x<10; x++) {
  cout << results[x] << endl;
 }

 return 0;
}

Output:

0
2
4
6
8
10
12
14
16
18
Press any key to continue . . .

So, my question is: why does this work (as far as I can tell; I ran it about 20 times) and produce the above output even with the locking commented out? I thought the general idea was:

in each thread:
calculate 2*x
copy results to CPU register(s)
store calculation in correct part of array
copy results back to main(shared) memory

I would think that under all but perfect conditions this would leave some part of the results array with 0 values. Is it only copying the required double of the array to a CPU register? Or is the calculation just too short to get preempted before it writes the result back to RAM? Thanks.

+1  A: 

It works because of the thds.join_all(); line. The main thread blocks here until all the other threads have completed, then continues and prints the array. Therefore you know that all the array values have been stored before you print them. If you comment out this line you will get unpredictable results.

Jason Coco
So is that causing all the threads to execute sequentially? I thought join_all just required all threads to have terminated before it proceeds, regardless of what they did with the results array while they were all executing. I still don't see why there isn't a race condition.
Flamewires
@Flamewires: there is no race condition because the main thread explicitly waits for the worker threads to finish. Also, since each worker thread only updates its own specifically assigned location in memory, there is no need to worry about data races between the workers. So all worker threads do their work (assign a value to their pre-defined memory location) while the main thread waits. When all the workers are done, the main thread continues. The workers run in parallel, but the join_all() call acts as a synchronization point, ensuring that all the data is in a good state before the main thread reads it.
Jason Coco
Yeah, I understand that, but what if each one was assigned, say, a bit in an integer instead of a double in an array? Do you see what I'm getting at? Surely a race condition would exist, because to access that bit, the whole int would be copied, updated, then written back by X threads at the same time. What's the lower limit on the size of a memory location I can designate without causing a race condition? A word?
Flamewires
@Flamewires: With the bits you would need some kind of locking, and it would force all the threads to act serially. Your lower limit would be whatever atomic operations you can perform. Here, you would need to do an atomic read-and-set (read-modify-write) operation or (more horribly) use some sort of mutex.
Jason Coco
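A minimal sketch of the atomic option mentioned in the comment above. It uses C++11 std::atomic and std::thread rather than the Boost classes from the question, purely as an assumption to keep the example self-contained; the helper name set_bits_atomically is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: n threads each set one distinct bit of a single
// shared 32-bit word. fetch_or is an atomic read-modify-write, so no
// thread can overwrite a bit that another thread has just set.
std::uint32_t set_bits_atomically(unsigned n) {
    assert(n <= 32 && "one bit per thread must fit in the word");
    std::atomic<std::uint32_t> flags(0);
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < n; ++i) {
        workers.emplace_back([&flags, i] {
            flags.fetch_or(std::uint32_t(1) << i);
        });
    }
    // Joining makes every worker's update visible to the caller.
    for (std::thread& t : workers) {
        t.join();
    }
    return flags.load();
}
```

Calling set_bits_atomically(8) should return 0xFF regardless of how the threads are scheduled. With a plain (non-atomic) integer and `flags |= ...`, the same loop would be a data race.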
Okay. Thanks. I think I got the answer I was looking for.
Flamewires
+2  A: 

The assignment has an lvalue of type double on the left and that lvalue is the only object being accessed by a thread. Since each thread accesses a different object, there is no data race.

Note that subscripting an array does not constitute an access.

avakar
@Flamewires: Complicated data structures are fine as long as each thread only acts on its own structure (as with the array elements here). If you flip bits in a shared integer you will need to either lock or ensure that you have an atomic read and set. In this simple case, that would force all the threads to operate serially with high contention for the lock, making it a very bad candidate for parallelization.
Jason Coco
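For illustration, the locking option could look roughly like the sketch below. It assumes C++11 std::mutex and std::thread in place of the Boost classes from the question, and the helper name set_bits_with_lock is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: n threads each set one distinct bit of a plain
// (non-atomic) integer. The mutex serializes the read-modify-write, so
// the threads effectively take turns, as the comment above warns.
std::uint32_t set_bits_with_lock(unsigned n) {
    assert(n <= 32 && "one bit per thread must fit in the word");
    std::uint32_t flags = 0;
    std::mutex guard;
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < n; ++i) {
        workers.emplace_back([&flags, &guard, i] {
            std::lock_guard<std::mutex> lock(guard);  // released at scope exit
            flags |= std::uint32_t(1) << i;
        });
    }
    for (std::thread& t : workers) {
        t.join();
    }
    return flags;  // safe to read: all workers have been joined
}
```

Every thread contends for the same lock just to flip one bit, which is why this pattern gains nothing from running in parallel.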
Yes, if you replaced double with some other type, it would still be fine. There is no minimum size as far as C++0x is concerned (it is up to the implementation to make this work on all platforms). If you start fiddling with bits of the same object, you get a data race (and therefore undefined behavior), because you would be accessing the same object from multiple threads without synchronization.
avakar
Ah okay, that helps a bit.
Flamewires
Well... you say there's no minimum size, but, correct me if I'm wrong, I couldn't just allocate 100 bits in an array and have them be separate objects, so would my size limit be the size of the smallest C++ primitive?
Flamewires
or would it be whatever my computer defines as a WORD?
Flamewires
Well, a bit is not a C++ object and you can't really "allocate 100 bits". The smallest object is of type `char`, and if you change the type from double to char and let each thread access a different char, you'll be race-free.
avakar
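As a rough sketch of that last point, the example below lets each thread write a different char with no lock and no atomics. It assumes C++11 std::thread instead of the Boost classes from the question, and the helper name fill_letters is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: thread i writes only element i. Each char is a
// separate object (its own memory location), so the unsynchronized
// writes do not race with each other.
std::string fill_letters(std::size_t n) {
    assert(n <= 26 && "only the letters a..z are used");
    std::vector<char> buf(n, '?');
    char* data = buf.data();
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < n; ++i) {
        workers.emplace_back([data, i] {
            data[i] = static_cast<char>('a' + i);
        });
    }
    // As with join_all() in the question, joining guarantees that all
    // writes are complete and visible before the buffer is read.
    for (std::thread& t : workers) {
        t.join();
    }
    return std::string(buf.begin(), buf.end());
}
```

For example, fill_letters(5) should yield "abcde". Adjacent chars may share a cache line, which can cost performance, but that does not affect correctness.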