Hi all,

Sorry I cannot post any source code...

I have code running a master/slave red-black algorithm for a Gauss-Seidel (G.S.) solver. In the simple case, the matrix is split into 4 evenly sized computational pieces. Images 1-3 perform their part of the computation and send buffers with the results back to image 0. The problem is this:

I have malloc'd an array large enough to hold all of the pieces so that I can map the individual results back into a single grid. The problem seems to be that on image 0, after the MPI_Recv call, the process is no longer aware that the grid was malloc'd to hold the whole thing. I get an error any time I try to put something into that grid. The only workaround I have found is to perform the malloc on all processes, and then malloc again on process zero right before the MPI_Recv.

Any ideas why it is seemingly losing the reference to that previously allocated memory?

In pseudocode:

Malloc whole[][]                    // have to have this allocated
Malloc partial[]
Perform compute on whole[]

If (image != 0)
  MPI_Send(whole[])
Else (image == 0)
  Malloc whole[][] again!           // and this allocated, otherwise the problem happens
  Loop over other images
    MPI_Recv(partial)
    Put partial[] into whole[][]    // here is where the problem occurs
Endif

Thanks for the help in advance

+2  A: 

It is very, very unlikely to be a library bug. Without seeing the source code it is nearly impossible to find the problem. My guess is that you are not allocating correctly, or that you are overwriting the allocated pointers with junk. To test this, print the pointer values right after allocation and right before and after the MPI receive.
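
The pointer check suggested above could look roughly like the following. This is only a sketch in plain C: `ptr_watch`, `watch_ptr`, and `check_ptr` are hypothetical helper names made up for illustration, not part of MPI or of the original program.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: remembers the address a pointer held right after malloc. */
typedef struct {
    const char *label;  /* name printed in the log line */
    void *saved;        /* address recorded at allocation time */
} ptr_watch;

/* Record the current value of a pointer under a label. */
static ptr_watch watch_ptr(const char *label, void *p)
{
    ptr_watch w;
    w.label = label;
    w.saved = p;
    fprintf(stderr, "[alloc] %s = %p\n", label, p);
    return w;
}

/* Print the pointer's current value and report whether it still matches.
 * Returns 1 if unchanged, 0 if it differs from the recorded address. */
static int check_ptr(const ptr_watch *w, void *current, const char *where)
{
    assert(w != NULL);
    int ok = (w->saved == current);
    fprintf(stderr, "[%s] %s = %p (%s)\n",
            where, w->label, current, ok ? "unchanged" : "CHANGED");
    return ok;
}
```

Calling `check_ptr` immediately before and immediately after the receive narrows the corruption down to that one call: if the value changes across it, the receive is writing more data than the buffer holds, or it is writing into the wrong address.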

aaa
I tried what you said, printing out the pointer values after the malloc, right before the MPI_Recv, and right after the MPI_Recv. As you suspected, the pointer is being blown away right after the Recv. This particular Recv is only on image 0, and it is receiving a type I defined called matrix, which I defined to hold a 2D array. Do you think this could have something to do with it? Is there some difference between sending a buffer of MPI_DOUBLE and sending a user-defined MPI_Datatype?
Derek
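
On the MPI_DOUBLE versus derived-datatype question: a receive writes count elements of the given datatype starting at the buffer address, so a type describing a whole 2D array writes that many doubles in one go. One plausible cause, which is only a guess without the source, is that the grid is a `double **` table of row pointers and the receive is pointed at that table rather than at contiguous storage, so the incoming doubles land on top of the pointers. The sketch below shows a contiguous layout that avoids this. The `grid` type and its helpers are hypothetical, and `fake_recv` is a `memcpy` stand-in for a real `MPI_Recv(dst, count, MPI_DOUBLE, ...)` so that the example runs without MPI.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical grid type: one contiguous block plus a table of row pointers. */
typedef struct {
    int rows, cols;
    double *data;  /* rows*cols doubles in a single allocation */
    double **row;  /* row[i] points at the start of row i inside data */
} grid;

/* Allocate the grid; returns 0 on success, -1 on allocation failure. */
static int grid_alloc(grid *g, int rows, int cols)
{
    assert(g != NULL && rows > 0 && cols > 0);
    g->rows = rows;
    g->cols = cols;
    g->data = malloc((size_t)rows * (size_t)cols * sizeof *g->data);
    g->row = malloc((size_t)rows * sizeof *g->row);
    if (g->data == NULL || g->row == NULL) {
        free(g->data);
        free(g->row);
        g->data = NULL;
        g->row = NULL;
        return -1;
    }
    for (int i = 0; i < rows; i++)
        g->row[i] = g->data + (size_t)i * (size_t)cols;
    return 0;
}

static void grid_free(grid *g)
{
    free(g->row);
    free(g->data);
    g->row = NULL;
    g->data = NULL;
}

/* Stand-in for MPI_Recv(dst, count, MPI_DOUBLE, ...): writes count doubles
 * starting at dst. The destination must be double storage such as
 * g->row[r] or g->data, never the row-pointer table g->row itself. */
static void fake_recv(double *dst, const double *src, int count)
{
    memcpy(dst, src, (size_t)count * sizeof *dst);
}
```

With this layout, a piece made of whole rows can be received straight into `g.row[first_row]` with a count of `nrows * cols`, and the row-pointer table is never touched by the transfer.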