views: 76
answers: 2

Hi,

I have a scenario like:

for (i = 0; i < n; i++)
{
    for (j = 0; j < m; j++)
    {
        for (k = 0; k < x; k++)
        {
            val = 2*i + j + 4*k;
            if (val != 0)
            {
                for (t = 0; t < l; t++)
                {
                    someFunction((i + t) + someFunction(j + t) + k*t);
                }
            }
        }
    }
}

Considering this to be block A: I have two more similar blocks in my code, and I want to run them in parallel, so I used OpenMP pragmas. However, I am not able to parallelize them, because I am a tad confused about which variables should be shared and which private in this case. If the call in the inner loop were an operation like sum += x, I could simply have added a reduction clause, as in the sketch below. In general, how would one approach parallelizing code with OpenMP when there is a nested for loop and then another inner for loop doing the main operation? I tried declaring a parallel region and simply putting a pragma for before each block, but I am definitely missing something!
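For reference, the reduction case I mean looks roughly like this (a minimal sketch with a made-up array x and length n, not my actual code):

/* Hypothetical helper, just to show the reduction pattern I know how to handle */
double sum_array(const double *x, int n)
{
    int i;
    double sum = 0.0;

    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++)
    {
        sum += x[i];    /* each thread accumulates a private partial sum; OpenMP combines them */
    }
    return sum;
}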

Thanks, Sayan

+1  A: 

I'm more of a Fortran programmer than a C programmer, so my knowledge of OpenMP's C-style syntax is poor, and I'll leave the exact syntax to you.

Your easiest approach here is probably (I'll qualify this later) to simply parallelise the outermost loop. By default OpenMP will regard the loop variable i as private and all the rest as shared. That is probably not what you want: you will probably want to make j, k and t private too, and I suspect you want val private as well.
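With the caveat that my C is shaky, the shape of it would be roughly this (a sketch, not tested code):

extern int someFunction(int);   /* assuming it returns an int, as your snippet implies */

void block_A(int n, int m, int x, int l)
{
    int i, j, k, t, val;

    /* Only the outermost loop is parallelised. The loop variable i is made
       private automatically; j, k, t and val must be listed explicitly or
       every thread would share them. */
    #pragma omp parallel for private(j, k, t, val)
    for (i = 0; i < n; i++)
    {
        for (j = 0; j < m; j++)
        {
            for (k = 0; k < x; k++)
            {
                val = 2*i + j + 4*k;
                if (val != 0)
                {
                    for (t = 0; t < l; t++)
                    {
                        someFunction((i + t) + someFunction(j + t) + k*t);
                    }
                }
            }
        }
    }
}

The loop bounds n, m, x and l are only read, so leaving them shared is fine; whatever someFunction writes to is a different matter, which brings me to my next point.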

I'm a bit puzzled by the statement at the bottom of your nest of loops (i.e. the someFunction(...) call), which doesn't seem to return any value at all. Does it work by side effects?

So, you shouldn't need to declare a parallel region enclosing all this code, and you should probably only parallelise the outermost loop. If you were to parallelise the inner loops too you might find your OpenMP installation either ignoring the inner pragmas, spawning more threads than you have processors, or complaining bitterly (see the sketch below).
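To be concrete about the "ignoring them" case: nested pragmas only do anything if you switch nesting on explicitly, and then they can easily oversubscribe the machine (sketch only, with made-up names):

#include <omp.h>

/* Sketch only: the inner pragma is ignored unless nested parallelism is
   enabled, and enabling it can create far more threads than cores. */
void nested_sketch(int n, int m)
{
    omp_set_nested(1);                  /* or set OMP_NESTED=true in the environment */

    #pragma omp parallel for
    for (int i = 0; i < n; i++)
    {
        #pragma omp parallel for        /* an inner team per outer thread, if nesting is on */
        for (int j = 0; j < m; j++)
        {
            /* ... work ... */
        }
    }
}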

I say that your easiest approach is probably to parallelise the outermost loop because I've made some assumptions about what your program (fragment) is doing. If the assumptions are wrong you might want to parallelise one of the inner loops instead. Another point to check is that the number of iterations of the loop(s) you parallelise is much greater than the number of threads you use: you don't want OpenMP running a loop with a trip count of, say, 7 on 4 threads, because the load balance would be very poor. If the trip count is only known at run time, the if clause sketched below is one way to guard against that.
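Something like this (again a sketch; the threshold of four iterations per thread is an arbitrary illustration, not a recommendation):

#include <omp.h>

/* Same pattern as block_A above, but the if clause only forks a thread team
   when the outer trip count comfortably exceeds the number of threads. */
void block_A_guarded(int n, int m, int x, int l)
{
    int i, j, k, t, val;

    #pragma omp parallel for private(j, k, t, val) if(n > 4 * omp_get_max_threads())
    for (i = 0; i < n; i++)
    {
        /* ... the body of block A, exactly as before ... */
    }
}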

High Performance Mark
A: 

You're correct; the innermost statement should actually be someFunction((i + t) + someFunction2(j + t) + k*t).

Sayan Ghosh
That's still not a statement which obviously returns a value. Side effects? Those are probably a very bad idea in a parallel program.
High Performance Mark
Actually, it writes into an array. The idea is that there is a big array where position (x, y) gets written with the value at some other location (x', y') of the array (which has either already been updated or is 0), multiplied by some number. The program wasn't written with parallelism in mind, so, you know! Thanks.
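Schematically it is something like this (names made up, just to show the access pattern):

/* Schematic of the update: the cell being written depends on another cell
   that may already have been overwritten earlier in the same sweep. */
void update_cell(double **A, int x, int y, int xp, int yp, double factor)
{
    A[x][y] = A[xp][yp] * factor;   /* A[xp][yp] is either 0 or already updated */
}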
Sayan Ghosh