I'm looking to use QtConcurrent with a program having two or three levels of possible parallelism. At the highest level, a function L1() is called several times. The function L1() calls a function L2() several times, and the function L2() calls a function L3() several times. The bulk of the total running time is spent in L3().
The execution time for L3() is expected to range from a few seconds to one minute for typical input. The execution time for L1() will be tens of minutes. Is performance likely to benefit from calling map() with L3() as the map function? Is there a rule of thumb of the form "if the execution time for each of x parallelizable units is longer than y, then parallelize those units"?
To parallelize at the lowest level using, say, QtConcurrent::map(), there are a few possible approaches. With some preprocessing, I could generate all the data that would ever be passed to L3() and then just call map() once with L3() as the map function. Alternatively, I could call map() with L1() as the map function, having modified L1() to call map() with L3(). Are these approaches likely to differ in performance? How well does QtConcurrent handle several successive map() calls and "nested" map() calls?
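To make the first (preprocessing) approach concrete, here is a minimal sketch of what I have in mind. It is plain C++ so that it stands alone; the L3Input struct, runL3(), buildWorkList(), processAll() and the loop counts are all hypothetical placeholders, not my real code. The sequential loop in processAll() marks the spot where I would call QtConcurrent::blockingMap(items, runL3) instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record describing one L3() call; the real fields would
// depend on the program.
struct L3Input {
    int i1;       // index of the enclosing L1() call
    int i2;       // index of the enclosing L2() call
    int i3;       // index of this L3() call within L2()
    long result;  // filled in by runL3()
};

// Stand-in for the expensive L3() work (seconds to a minute in practice).
void runL3(L3Input &in) {
    in.result = 100L * in.i1 + 10L * in.i2 + in.i3;
}

// Preprocessing step: enumerate every L3() input up front, so the three
// nested levels collapse into one flat work list.
std::vector<L3Input> buildWorkList(int n1, int n2, int n3) {
    assert(n1 >= 0 && n2 >= 0 && n3 >= 0);
    std::vector<L3Input> items;
    items.reserve(static_cast<std::size_t>(n1) * n2 * n3);
    for (int i1 = 0; i1 < n1; ++i1)
        for (int i2 = 0; i2 < n2; ++i2)
            for (int i3 = 0; i3 < n3; ++i3)
                items.push_back(L3Input{i1, i2, i3, 0});
    return items;
}

// Sequential stand-in for the parallel step. With Qt, this loop is where
// QtConcurrent::blockingMap(items, runL3) would go instead.
void processAll(std::vector<L3Input> &items) {
    for (L3Input &item : items)
        runL3(item);
}
```

With this layout there is only one map() call over the whole work list, as opposed to one map() call per L1() invocation with further map() calls nested inside it.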