I have a Debian cluster with 2 nodes, each with two quad-core processors (8 cores per node, 16 in total). I use Torque with Maui as the scheduler. When I try to run an MPI job with 16 processes, the scheduler cannot run it: either the job stays in the queue (although no other job is running at that moment), or it runs and the output file reports that I tried to run a 16-process job with only 4 processors.

My .../pbs/server_priv/nodes file looks as follows:

node1 np=8
node2 np=8

An example of the script I use to run the program is the following:

#!/bin/sh
#PBS -d /home/bellman/
#PBS -N output
#PBS -k oe
#PBS -j oe
#PBS -l nodes=2:ppn=8,walltime=10000:00:00
#PBS -V

ulimit -s 536870912

# How many procs do I have?
NP=$(wc -l $PBS_NODEFILE | awk '{print $1}')
echo Number of processors is $NP


mpiexec -np 16 /home/bellman/AAA

I have tried many combinations of nodes and ppn, but one of these two problems always occurs. Any ideas on what is going on?
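
One way to narrow this down is to print, from inside the job, how many slots Torque actually granted and on which hosts, since `$PBS_NODEFILE` lists one hostname per allocated slot. The following is only a minimal sketch: `count_slots` and `summarize_nodefile` are hypothetical helper names introduced here for illustration, not part of Torque or Maui.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Torque or Maui) for inspecting a node file.

# Print the number of non-empty lines, i.e. the number of granted slots.
count_slots() {
    grep -c . "$1"
}

# Print one line per host in the form "<hostname> <slots on that host>".
summarize_nodefile() {
    sort "$1" | uniq -c | awk '{print $2 " " $1}'
}

# Inside a running job, report what was actually allocated.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "Slots granted: $(count_slots "$PBS_NODEFILE")"
    summarize_nodefile "$PBS_NODEFILE"
fi
```

With `nodes=2:ppn=8` the expectation would be 16 slots split evenly between node1 and node2. If the job instead reports 4 slots, the request and the allocation disagree, and comparing this output with what `pbsnodes -a` reports for each node may show where the mismatch comes from.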