views:

52

answers:

3

I have a function which is passed two structures by reference. These structures are composed of dynamically allocated arrays. Now when I try to implement OpenMP I'm getting a slowdown not a speedup. I'm thinking this can be attributed to possible sharing issues. Here's some of the code for your perusal (C):

void    leap(MHD *mhd,GRID *grid,short int gchk)
{
  /*-- V A R I A B L E S --*/
  // Indexes
  int i,j,k,tid;
  double rhoinv[grid->nx][grid->ny][grid->nz];
  double rhoiinv[grid->nx][grid->ny][grid->nz];
  double rhoeinv[grid->nx][grid->ny][grid->nz];
  double rhoninv[grid->nx][grid->ny][grid->nz]; // Rho Inversion
  #pragma omp parallel shared(mhd->rho,mhd->rhoi,mhd->rhoe,mhd->rhon,grid,rhoinv,rhoiinv,rhoeinv,rhoninv) \
                       private(i,j,k,tid,stime)
  {
    tid=omp_get_thread_num();
    printf("-----  Thread %d Checking in!\n",tid);
    #pragma omp barrier
    if (tid == 0)
    {
      stime=clock();
      printf("-----1) Calculating leap helpers");
    }
    #pragma omp for
    for(i=0;i<grid->nx;i++)
    {
      for(j=0;j<grid->ny;j++)
      {
        for(k=0;k<grid->nz;k++)
        {
          //      rho's
          rhoinv[i][j][k]=1./mhd->rho[i][j][k];
          rhoiinv[i][j][k]=1./mhd->rhoi[i][j][k];
          rhoeinv[i][j][k]=1./mhd->rhoe[i][j][k];
          rhoninv[i][j][k]=1./mhd->rhon[i][j][k];
        }
      }
    }
    if (tid == 0)
    {
      printf("........%04.2f [s] -----\n",(clock()-stime)/CLOCKS_PER_SEC);
      stime=clock();
    }
    #pragma omp barrier
  }/*-- End Parallel Region --*/
}

Now I've tried default(shared) and shared(mhd) but neither show any signs of improvement. Could it be that since the arrays are allocated

mhd->rho=(double ***)newarray(nx,ny,nz,sizeof(double));

That by declaring the structure or the pointer to the element of the structure that I'm not actually sharing the memory just the pointers to it? Oh and nx=389 ny=7 and nz=739 in this example. Execution time for this section in serial is 0.23 [s] and 0.79 [s] for 8 threads.

A: 

An important point here could be your upper bound of your loops. Since you use grid->nz etc openMP can't know if they will change or not for each iteration. Load these values in local variables and use these for the loop condition.

Jens Gustedt
I tried doing just that...no dice same execution time. I also tried putting all data in local arrays (rhotemp) and accessing them instead of the values in the structure same effect.
Lazer