views:

235

answers:

1

Why does this code not parallelize std::for_each() when it works perfectly fine with std::sort()?

How do I fix it?

g++ -fopenmp -D_GLIBCXX_PARALLEL=1 -o p p.cc && time ./p  sort

GCC 4.3 on Linux.

#include <cstdio>
#include <algorithm>
#include <vector>
#include <cstring>

void delay()
{
        for(int c = 0; c < 1000000; c++) {
    }
}

int lt(int a, int b)
{
        delay();
        return a < b;
}

void noop(int a)
{
    delay();
}

int main(int argc, char **argv)
{
        if (argc < 2) {
                printf("%s  <sort | for_each>\n", argv[0]);
                return 1;
    }

        std::vector<int> foo(10000);

        if (!strcmp(argv[1], "sort")) {
        std::sort(foo.begin(), foo.end(), lt);
    } else if (!strcmp(argv[1], "for_each")) {
                std::for_each(foo.begin(), foo.end(), noop);
    }
}
+3  A: 

Just compiling with -D_GLIBCXX_PARALLEL does not necessarily parallelize all algorithms (see here):

Please note that this doesn't necessarily mean that everything will end up being executed in a parallel manner, but rather that the heuristics and settings coded into the parallel versions will be used to determine if all, some, or no algorithms will be executed using parallel variants.

However the Configuration and tuning Chapter might help you to force parallelization.

Just a note to your "Benchmark": std::sort and std::for_each won't necessarily call delay() the same number of times. std::for_each calls the delay method for N times, std::sort calls it for something between N log(N) and N^2 times (see reference). Thus measuring the execution time gives you only a weak indication about parallelization.

Mr. Mr.
I know about the Big-Oh, it was just so that it would be slow enough to see with "top" and "time".
Thomas
Your config and tuning link helped. This makes for_each() parallel:std::for_each(foo.begin(), foo.end(), noop, _gnu_parallel::parallel_balanced);
Thomas