I want to easily perform collective communications independently on each machine of my cluster. Say I have 4 machines with 8 cores each; my MPI program would run 32 MPI tasks. What I would like is, for a given function:
- on each host, only one task performs a computation while the other tasks do nothing during this computation. In my example, 4 MPI tasks do the computation and the 28 others wait.
- once the computation is done, every MPI task takes part in a collective communication ONLY with local tasks (tasks running on the same host); a sketch of what I have in mind follows this list.
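Here is a rough sketch of the pattern I am after, using MPI_Comm_split_type with MPI_COMM_TYPE_SHARED, which as far as I understand groups the ranks sharing a memory node (i.e. one communicator per host) -- please correct me if that is not the right tool:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Split MPI_COMM_WORLD into one communicator per shared-memory node. */
    MPI_Comm node_comm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node_comm);

    int node_rank, node_size;
    MPI_Comm_rank(node_comm, &node_rank);
    MPI_Comm_size(node_comm, &node_size);

    double result = 0.0;
    if (node_rank == 0) {
        /* Only one task per host does the computation. */
        result = 42.0;  /* placeholder for the real work */
    }

    /* Node-local collective: broadcast only to tasks on the same host. */
    MPI_Bcast(&result, 1, MPI_DOUBLE, 0, node_comm);

    printf("world rank %d (node rank %d/%d) got %f\n",
           world_rank, node_rank, node_size, result);

    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}
```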
Conceptually, I understand that I must create one communicator per host. I searched around and found nothing that explicitly does that. I am not really comfortable with MPI groups and communicators. Here are my two questions:
- is MPI_Get_processor_name unique enough for such a behaviour?
- more generally, do you have a piece of code that does that?
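For the first question, this is roughly what I imagine with MPI_Get_processor_name if MPI_Comm_split_type is not available: derive a color from the name and use it in MPI_Comm_split. The color_from_name helper is just my own hypothetical hash, and I am not sure it is safe (two different hosts could in principle hash to the same color), which is exactly why I am asking whether the processor name is unique enough:

```c
#include <mpi.h>

/* Hypothetical helper: derive a non-negative integer color from the
 * processor name. A plain hash could collide for different hosts,
 * which is part of my worry. */
static int color_from_name(const char *name, int len)
{
    unsigned int h = 5381;
    for (int i = 0; i < len; ++i)
        h = h * 33u + (unsigned char)name[i];
    return (int)(h & 0x7fffffff);  /* MPI_Comm_split needs color >= 0 */
}

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    char name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(name, &name_len);

    /* All ranks reporting the same processor name (same color) end up
     * in the same communicator. */
    MPI_Comm node_comm;
    MPI_Comm_split(MPI_COMM_WORLD, color_from_name(name, name_len),
                   world_rank, &node_comm);

    /* ... same per-host computation + MPI_Bcast on node_comm as above ... */

    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}
```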
Thanks