I have a small C++ program that uses OpenMP. It works fine on Windows 7 on a Core i7 with Visual Studio 2010. On an iMac with a Core i7 and g++ v4.2.1, the code runs much more slowly with 4 threads than it does with just one. The same slowdown is exhibited on 2 other Red Hat machines using g++. Here is the code:
    int iHundredMillion = 100000000;
    int iNumWorkers = 4;
    std::vector<Worker*> workers;
    for(int i=0; i<iNumWorkers; ++i)
    {
        Worker * pWorker = new Worker();
        workers.push_back(pWorker);
    }
    int iThr;
    #pragma omp parallel for private (iThr) // Parallel run
    for(int k=0; k<iNumWorkers; ++k)
    {
        iThr = omp_get_thread_num();
        workers[k]->Run( (3)*iHundredMillion, iThr );
    }
I'm compiling with g++ like this:
g++ -fopenmp -O2 -o a.out *.cpp
Can anyone tell me what silly mistake I'm making on the *nix platform?
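To make the one-thread versus four-thread comparison measurable, here is a minimal self-contained sketch of the same loop with timing around it. The body of `Worker::Run` is not shown above, so the `Worker` below is a hypothetical stand-in that only touches its own state; `TimeRun` and its parameters are likewise names invented for this illustration. It assumes a C++11 compiler, which is newer than g++ 4.2.1.

```cpp
#include <cassert>
#include <chrono>
#include <memory>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical stand-in: the real Worker::Run() is not shown in the question.
class Worker {
public:
    unsigned long result = 0;

    // CPU-bound loop that touches only local state, so threads share nothing.
    unsigned long Run(int iterations, int threadId) {
        unsigned long acc = static_cast<unsigned long>(threadId);
        for (int i = 0; i < iterations; ++i) {
            acc = acc * 1664525UL + 1013904223UL;  // one linear congruential step
        }
        result = acc;  // store the value so the loop is not optimized away
        return acc;
    }
};

// Runs numWorkers workers on numThreads OpenMP threads and returns the
// elapsed wall-clock time in seconds. Without -fopenmp the pragma is
// ignored and the loop simply runs serially.
double TimeRun(int numThreads, int numWorkers, int iterations) {
    assert(numThreads > 0 && numWorkers > 0);

    std::vector<std::unique_ptr<Worker>> workers;
    for (int i = 0; i < numWorkers; ++i) {
        workers.push_back(std::unique_ptr<Worker>(new Worker()));
    }

    const auto start = std::chrono::steady_clock::now();

    #pragma omp parallel for num_threads(numThreads)
    for (int k = 0; k < numWorkers; ++k) {
        int thr = 0;  // declared inside the loop, so it is private per thread
#ifdef _OPENMP
        thr = omp_get_thread_num();
#endif
        workers[k]->Run(iterations, thr);
    }

    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Built with the same `g++ -fopenmp -O2` line, calling `TimeRun(1, 4, 3*100000000)` and `TimeRun(4, 4, 3*100000000)` and comparing the two returned values gives the timing for each case. If this stand-in scales with threads but the real program does not, that would suggest the slowdown comes from something inside the real `Run()` rather than from the OpenMP loop itself.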