What happens when I MPI_Send to a process that has finished?

I am learning MPI and writing a small sugar-distribution simulation in C. When the factories stop producing, those processes end. When warehouses run empty, they end. Can I somehow tell that a shop's order to a warehouse did not succeed (because the warehouse process has ended) by looking at the return value of MPI_Send? The documentation doesn't mention a specific error code for this situation; it only says that no error is returned on success.

Can I do:

if (MPI_Send(...)) {
    ...
    /* destination has ended */
    ...
}

And disregard the error code?

Thanks

A: 

As far as I'm aware, the MPI standard defines no return value and no out parameters for MPI_Send(): it doesn't provide any information on the message sending event, probably because message buffering can make it so that no information is available on the result at the time the call returns.

If you want one process to see when another ends, you should send a message from the finishing process with a designated tag, and periodically post nonblocking receives at the other process to see if an exit notification was sent.

Or, if you want the entire program to abort when one process stops, the easiest thing to do is to call MPI_Abort() in the finishing process with MPI_COMM_WORLD as the communicator, which makes a best attempt to shut down all processes.

Edit: To actually answer the question in the title, "What happens when I MPI_Send to a process that has finished?": as I understand it, that depends on whether buffering is used or not. If buffering is not used, then the program will hang. If buffering is used, then MPI_Send() will buffer the message and the process will continue to run, but because no matching receive will ever be posted, the message will never leave the buffer. Doing this a lot will eventually cause the program to run out of memory.

suszterpatt
Thanks, I will still try this. Just as an aside, I think the MPI_Send that returns nothing that you are referring to is the C++ version. According to the documentation at http://www.mcs.anl.gov/research/projects/mpi/www/www3/MPI_Send.html "all MPI routines (except MPI_Wtime and MPI_Wtick) return an error value; C routines as the value of the function".
nieldw
Of those possible return values, MPI_ERR_TAG is the only one I can imagine to be useful here, but only if MPI_Send() goes to all the trouble of checking if a process with the supplied tag in the supplied communicator is actually running. My guess is that it doesn't do that and MPI_ERR_TAG is only returned when the supplied tag is outside the range of valid tag values, but you could easily verify this with a simple test program.
suszterpatt
A:

Writing code with unmatched MPI_Send calls is not allowed by the standard. Among other things, this means the resulting behavior will be implementation dependent. The range of possible behaviors includes several "obvious" options: exit, hang/deadlock, memory corruption, and so on.

Most implementations have some level of debugging output that could be helpful in tracking down this kind of logical programming error. It is possible to use MPI_Wait* to barrier on the completion of all MPI_Send/MPI_Recv pairs. In the general case, it is not possible to know that the MPI_Send will not be matched until the receiving node enters MPI_Finalize. Put another way, using a barrier in this situation will cause the program to hang.

In any event, this would be an error condition for MPI_Finalize. The target rank of the MPI_Send should be detected as having exited, so the MPI_Send can never be matched. However, this kind of error condition may cause the MPI job to fail to clean up all the rank processes.

semiuseless