It is safe to read an STL container from multiple threads in parallel. However, the performance is terrible. Why?
I create a small object that stores some data in a multiset. This makes the constructors fairly expensive (about 5 usecs on my machine). I store hundreds of thousands of these small objects in a large multiset. The objects can be processed independently, so I split the work between threads running on a multi-core machine. Each thread reads the objects it needs from the large multiset and processes them.
The problem is that reads from the big multiset do not proceed in parallel. It looks like the reads in one thread block the reads in the other.
The code below is the simplest I can make it and still show the problem. First it creates a large multiset containing 100,000 small objects each containing its own empty multiset. Then it calls the multiset copy constructor twice in series, then twice again in parallel.
A profiling tool shows that the serial copy constructors take about 0.23 secs each, whereas the parallel ones take about 0.47 secs each, twice as long. Somehow the parallel copies are interfering with each other.
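#include <set>
#include <tchar.h>
#include <boost/thread.hpp>
#include <boost/bind.hpp>
using namespace std;
// cRavenProfile is the profiling tool mentioned above; its header is not shown here
// a trivial class with a significant ctor and ability to populate an associative container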
class cTest
{
    multiset<int> mine;
    int id;
public:
    cTest( int i ) : id( i ) {}
    bool operator<(const cTest& o) const { return id < o.id; }
};
// add 100,000 objects to multiset
void Populate( multiset<cTest>& m )
{
    for( int k = 0; k < 100000; k++ )
    {
        m.insert(cTest(k));
    }
}
// copy construct multiset, called from mainline
void Copy( const multiset<cTest>& m )
{
    cRavenProfile profile("copy_main");
    multiset<cTest> copy( m );
}
// copy construct multiset, called from thread
void Copy2( const multiset<cTest>& m )
{
    cRavenProfile profile("copy_thread");
    multiset<cTest> copy( m );
}
int _tmain(int argc, _TCHAR* argv[])
{
    cRavenProfile profile("test");
    profile.Start();
    multiset<cTest> master;
    Populate( master );
    // two calls to copy ctor from mainline
    Copy( master );
    Copy( master );
    // call copy ctor in parallel
    // note: boost::bind stores its own copy of master in each functor;
    // boost::cref( master ) would pass the original by reference instead
    boost::thread* pt1 = new boost::thread( boost::bind( Copy2, master ));
    boost::thread* pt2 = new boost::thread( boost::bind( Copy2, master ));
    pt1->join();
    pt2->join();
    // release the thread objects once both threads have finished
    delete pt1;
    delete pt2;
    // display profiler results
    cRavenProfile print_profile;
    return 0;
}
Here is the output:
Scope         Calls   Mean (secs)   Total (secs)
copy_thread   2       0.472498      0.944997
copy_main     2       0.233529      0.467058
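For anyone who wants to try this without cRavenProfile or Boost, here is a minimal sketch of the same measurement using only the C++11 standard library. It is not the code that produced the numbers above. The names Item, Master, TimedCopy, MeanSerial and MeanParallel are made up for illustration, and it assumes that cRavenProfile times a scope until its destructor runs, so the destruction of the copy is included in the timed region.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <set>
#include <thread>

// Hypothetical stand-in for cTest: each element owns its own (empty) multiset.
struct Item {
    std::multiset<int> mine;
    int id;
    explicit Item(int i) : id(i) {}
    bool operator<(const Item& o) const { return id < o.id; }
};

using Master = std::multiset<Item>;

// Fill the container with n elements.
void Populate(Master& m, int n) {
    for (int k = 0; k < n; ++k) m.insert(Item(k));
}

// Copy-construct the container once, destroy the copy, and return elapsed seconds.
double TimedCopy(const Master& m) {
    const auto start = std::chrono::steady_clock::now();
    std::size_t copied = 0;
    {
        Master copy(m);          // the copy constructor under test
        copied = copy.size();
    }                            // copy is destroyed here, inside the timed region
    const auto stop = std::chrono::steady_clock::now();
    assert(copied == m.size());
    return std::chrono::duration<double>(stop - start).count();
}

// Mean seconds per copy for two copies made one after the other.
double MeanSerial(const Master& m) {
    const double t1 = TimedCopy(m);
    const double t2 = TimedCopy(m);
    return (t1 + t2) / 2.0;
}

// Mean seconds per copy for two copies made on two threads at the same time.
double MeanParallel(const Master& m) {
    double t1 = 0.0, t2 = 0.0;
    // The lambdas capture m by reference, so both threads read the same container.
    std::thread a([&] { t1 = TimedCopy(m); });
    std::thread b([&] { t2 = TimedCopy(m); });
    a.join();
    b.join();
    return (t1 + t2) / 2.0;
}
```

To compare, call Populate on a Master with 100000 elements from main, then print MeanSerial and MeanParallel for it. If the parallel mean comes out well above the serial mean, the slowdown does not depend on Boost or on the profiler.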