Hi,

I am learning OpenMPI on a cluster. Here is my first example. I expected the output to show responses from different nodes, but they all come from the same node, node062. Why is that, and how can I get reports from different nodes, to confirm that MPI is actually distributing processes across them? Thanks and regards!

ex1.c

/* test of MPI */
#include "mpi.h"
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
  char idstr[32];
  char buff[128];
  char processor_name[MPI_MAX_PROCESSOR_NAME];
  int numprocs, myid, i, namelen;
  MPI_Status stat;

  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);
  MPI_Get_processor_name(processor_name, &namelen);

  if (myid == 0)
  {
    printf("WE have %d processors\n", numprocs);
    for (i = 1; i < numprocs; i++)
    {
      sprintf(buff, "Hello %d", i);
      MPI_Send(buff, 128, MPI_CHAR, i, 0, MPI_COMM_WORLD);
    }
    for (i = 1; i < numprocs; i++)
    {
      MPI_Recv(buff, 128, MPI_CHAR, i, 0, MPI_COMM_WORLD, &stat);
      printf("%s\n", buff);
    }
  }
  else
  {
    MPI_Recv(buff, 128, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &stat);
    sprintf(idstr, " Processor %d at node %s ", myid, processor_name);
    strcat(buff, idstr);
    strcat(buff, "reporting for duty\n");
    MPI_Send(buff, 128, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
  }
  MPI_Finalize();
  return 0;
}

ex1.pbs

#!/bin/sh
#
# This is an example script example.sh
#
# These commands set up the Grid Environment for your job:
#PBS -N ex1
#PBS -l nodes=10:ppn=1,walltime=1:10:00
#PBS -q dque

# export OMP_NUM_THREADS=4

mpirun -np 10 /home/tim/courses/MPI/examples/ex1

compile and run:

[tim@user1 examples]$ mpicc ./ex1.c -o ex1   
[tim@user1 examples]$ qsub ex1.pbs  
35540.mgt  
[tim@user1 examples]$ nano ex1.o35540  
----------------------------------------  
Begin PBS Prologue Sat Jan 30 21:28:03 EST 2010 1264904883  
Job ID:         35540.mgt  
Username:       tim  
Group:          Brown  
Nodes:          node062 node063 node169 node170 node171 node172 node174 node175  
node176 node177  
End PBS Prologue Sat Jan 30 21:28:03 EST 2010 1264904883  
----------------------------------------  
WE have 10 processors  
Hello 1 Processor 1 at node node062 reporting for duty  
Hello 2 Processor 2 at node node062 reporting for duty        
Hello 3 Processor 3 at node node062 reporting for duty        
Hello 4 Processor 4 at node node062 reporting for duty        
Hello 5 Processor 5 at node node062 reporting for duty        
Hello 6 Processor 6 at node node062 reporting for duty        
Hello 7 Processor 7 at node node062 reporting for duty        
Hello 8 Processor 8 at node node062 reporting for duty        
Hello 9 Processor 9 at node node062 reporting for duty  

----------------------------------------  
Begin PBS Epilogue Sat Jan 30 21:28:11 EST 2010 1264904891  
Job ID:         35540.mgt  
Username:       tim  
Group:          Brown  
Job Name:       ex1  
Session:        15533  
Limits:         neednodes=10:ppn=1,nodes=10:ppn=1,walltime=01:10:00  
Resources:      cput=00:00:00,mem=420kb,vmem=8216kb,walltime=00:00:03  
Queue:          dque  
Account:  
Nodes:  node062 node063 node169 node170 node171 node172 node174 node175 node176  
node177  
Killing leftovers...  

End PBS Epilogue Sat Jan 30 21:28:11 EST 2010 1264904891  
----------------------------------------

UPDATE:

I would like to run several background jobs in a single PBS script, so that the jobs run at the same time. For example, in the script above I added a second run of ex1 and put both runs in the background in ex1.pbs:

#!/bin/sh
#
# This is an example script example.sh
#
# These commands set up the Grid Environment for your job:
#PBS -N ex1
#PBS -l nodes=10:ppn=1,walltime=1:10:00
#PBS -q dque

echo "The first job starts!"
mpirun -np 5 --machinefile /home/tim/courses/MPI/examples/machinefile /home/tim/courses/MPI/examples/ex1 &
echo "The first job ends!"
echo "The second job starts!"
mpirun -np 5 --machinefile /home/tim/courses/MPI/examples/machinefile /home/tim/courses/MPI/examples/ex1 &
echo "The second job ends!"

(1) The result looks fine after qsub-ing this script with the previously compiled executable ex1:

The first job starts!  
The first job ends!  
The second job starts!  
The second job ends!  
WE have 5 processors  
WE have 5 processors  
Hello 1 Processor 1 at node node063 reporting for duty        
Hello 2 Processor 2 at node node169 reporting for duty        
Hello 3 Processor 3 at node node170 reporting for duty        
Hello 1 Processor 1 at node node063 reporting for duty        
Hello 4 Processor 4 at node node171 reporting for duty        
Hello 2 Processor 2 at node node169 reporting for duty        
Hello 3 Processor 3 at node node170 reporting for duty        
Hello 4 Processor 4 at node node171 reporting for duty  

(2) However, ex1 runs so quickly that the two background jobs probably do not overlap much, which is not the case in my real project. So I added sleep(30) to ex1.c to extend its running time, so that the two background runs of ex1 overlap almost the whole time.

/* test of MPI */
#include "mpi.h"
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
  char idstr[32];
  char buff[128];
  char processor_name[MPI_MAX_PROCESSOR_NAME];
  int numprocs, myid, i, namelen;
  MPI_Status stat;

  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);
  MPI_Get_processor_name(processor_name, &namelen);

  if (myid == 0)
  {
    printf("WE have %d processors\n", numprocs);
    for (i = 1; i < numprocs; i++)
    {
      sprintf(buff, "Hello %d", i);
      MPI_Send(buff, 128, MPI_CHAR, i, 0, MPI_COMM_WORLD);
    }
    for (i = 1; i < numprocs; i++)
    {
      MPI_Recv(buff, 128, MPI_CHAR, i, 0, MPI_COMM_WORLD, &stat);
      printf("%s\n", buff);
    }
  }
  else
  {
    MPI_Recv(buff, 128, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &stat);
    sprintf(idstr, " Processor %d at node %s ", myid, processor_name);
    strcat(buff, idstr);
    strcat(buff, "reporting for duty\n");
    MPI_Send(buff, 128, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
  }

  sleep(30); /* newly added to extend the running time */
  MPI_Finalize();
  return 0;
}

But after recompiling and qsub-ing again, the results do not look right: some processes are aborted. In ex1.o35571:

The first job starts!  
The first job ends!  
The second job starts!  
The second job ends!  
WE have 5 processors  
WE have 5 processors  
Hello 1 Processor 1 at node node063 reporting for duty  
Hello 2 Processor 2 at node node169 reporting for duty  
Hello 3 Processor 3 at node node170 reporting for duty  
Hello 4 Processor 4 at node node171 reporting for duty  
Hello 1 Processor 1 at node node063 reporting for duty  
Hello 2 Processor 2 at node node169 reporting for duty  
Hello 3 Processor 3 at node node170 reporting for duty  
Hello 4 Processor 4 at node node171 reporting for duty  
4 additional processes aborted (not shown)  
4 additional processes aborted (not shown)  

in ex1.e35571:

mpirun: killing job...  
mpirun noticed that job rank 0 with PID 25376 on node node062 exited on signal 15 (Terminated).  
mpirun: killing job...  
mpirun noticed that job rank 0 with PID 25377 on node node062 exited on signal 15 (Terminated).  

I wonder why these processes are aborted. How can I run background jobs correctly from a PBS script?

+2  A: 

hi

A couple of things: you need to tell MPI where to launch processes. Assuming you are using MPICH, look at the mpiexec help and find the machine-file option (or its equivalent). Unless a machine file is provided, it will run everything on one host.

PBS automatically creates a nodes file. Its name is stored in the PBS_NODEFILE environment variable, which is available in the PBS command file. Try the following:

mpiexec -machinefile $PBS_NODEFILE ...

If you are using MPICH2, you have to boot your MPI runtime using mpdboot. I do not remember the details of the command; you will have to read the man page. Remember to create the secret file, otherwise mpdboot will fail.

I read your post again: you are using Open MPI, so you still have to supply a machines file to the mpiexec command, but you do not have to mess with mpdboot.
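Concretely, the mpirun line in ex1.pbs could be changed to use the PBS-generated node file. A sketch (the paths are the ones from the original script; this requires a PBS cluster with Open MPI to actually run):

```shell
#!/bin/sh
#PBS -N ex1
#PBS -l nodes=10:ppn=1,walltime=1:10:00
#PBS -q dque

# PBS_NODEFILE is set by PBS for each job and lists one hostname
# per allocated slot, so mpirun spreads ranks across the nodes
mpirun -np 10 --machinefile "$PBS_NODEFILE" /home/tim/courses/MPI/examples/ex1
```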

aaa
Thank you so much! That solved my problem. If someone else has reserved some nodes, or some nodes are currently running other jobs, will mpiexec notice that when given $PBS_NODEFILE as the machine file? Also, could you try to answer my questions about using PBS on a cluster, posted at http://superuser.com/questions/102812/torch-in-cluster ? Thanks in advance!
Tim
+1  A: 

Hi

As a diagnostic, try inserting these statements immediately after your call to MPI_Get_processor_name:

printf("Hello, world.  I am %d of %d on %s\n", myid, numprocs, processor_name);
fflush(stdout); 

If all processes report the same node name, that would suggest to me that you don't quite understand what is going on with the job management system and the cluster; perhaps PBS is (despite you apparently telling it otherwise) putting all 10 processes on one node. Do you have 10 cores in a node?

If this produces different results, that suggests to me that something is wrong with your code, though it looks OK to me.

Regards

Mark

High Performance Mark
Thanks Mark. I am not familiar with PBS either and am learning it now too. If possible, could you try to answer my questions about PBS/Torque at http://superuser.com/questions/102812/torch-in-cluster ?
Tim
Sorry Tim, I'm a Grid Engine user, with only the vaguest knowledge of PBS.
High Performance Mark
+1  A: 

hi

By default PBS (I am assuming Torque) allocates nodes in exclusive mode, so there is only one job per node. It is a bit different if the nodes have multiple processors: most likely one process per CPU. PBS can be changed to allocate nodes in time-sharing mode; look at the man page of qmgr. Long story short, you will most likely not have overlapping nodes in the node file, since the node file is created when resources become available rather than at submission time.

The purpose of PBS is resource control, most commonly time and (automatic) node allocation.

Commands in a PBS file are executed sequentially. You can put processes in the background, but that might defeat the purpose of resource allocation; I do not know your exact workflow, though. I have used background processes in PBS scripts to copy data before the main program runs in parallel, using &. A PBS script is actually just a shell script.
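Since a PBS script is just a shell script, a script that backgrounds its jobs should normally `wait` for them before exiting; otherwise the script finishes immediately while the background processes are still running. A minimal sketch (with `sleep` standing in for the real mpirun invocations):

```shell
# launch two stand-in background jobs; in a real PBS script these
# would be the mpirun invocations
sleep 1 &
pid1=$!
sleep 1 &
pid2=$!

# block until both background jobs have finished before the script exits
wait "$pid1" "$pid2"
echo "both background jobs finished"
```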

You can assume that PBS does not know anything about the inner workings of your script. You can certainly run multiple processes/threads via the script; if you do so, it is up to you and your operating system to allocate cores/processors in a balanced fashion. If you have a multithreaded program, the most likely approach is to run one MPI process per node and then spawn OpenMP threads.

Let me know if you need clarifications

aaa
Thanks again! In my real project, I would like to run an executable several times, all in the background, in a single PBS script. However, when I tried this on the simple example given above, some processes were aborted. Please see my update to the original post. How do I run background jobs in a PBS script? Thanks in advance!
Tim
@Tim hard to tell; it looks like the MPI job is being terminated externally. How much time do you allocate in the PBS script? That is my first guess.
aaa
walltime is set to 1:10:00. I think that is long enough for a run with sleep(30). Do you see anything wrong with how the background jobs are specified in the PBS script?
Tim
@Tim Tim, we probably have to assume the first MPI launch is being killed by the second. I know less about Open MPI than about MPICH, but perhaps the default in Open MPI is to launch only a single MPI instance per node. Here is what you can try: split the node file so that the two MPI launches do not share any nodes. If that works, look through the Open MPI manual to see how to run multiple jobs on the same machine.
aaa
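One way to try the suggestion above is to derive two disjoint machine files from the PBS node file before launching. A sketch (on the cluster the input would be "$PBS_NODEFILE"; a mock list of the hostnames from the job output is used here for illustration):

```shell
# On the cluster this would be "$PBS_NODEFILE"; write a mock node
# list so the splitting logic can be shown standalone.
printf 'node062\nnode063\nnode169\nnode170\nnode171\nnode172\nnode174\nnode175\nnode176\nnode177\n' > nodefile

# split the node file into two disjoint halves, one per mpirun launch
half=$(( $(wc -l < nodefile) / 2 ))
head -n "$half" nodefile > machinefile.a            # first half of the nodes
tail -n +"$(( half + 1 ))" nodefile > machinefile.b # second half of the nodes

# each mpirun then gets its own machine file, e.g.:
#   mpirun -np 5 --machinefile machinefile.a ./ex1 &
#   mpirun -np 5 --machinefile machinefile.b ./ex1 &
#   wait
```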