Unable to run OF in parallel on a multiple-node cluster
I have been trying to run OpenFOAM (OF) in parallel on our Linux cluster, using decomposeParDict. The cluster has several nodes with 8 processors each. I can run OF in parallel on a single node (up to 8 processors), but when I try to use more than one node, all the processes still end up on just one node. For example, when I submit a job to the queue (after running decomposePar) for 16 processors on 2 nodes, all 16 processes run on a single node.
All the nodes are NFS-mounted with passwordless access.
I have tried both using the roots field and not using it (even though the filesystem is common), but it hasn't made a difference.
How do you execute OpenFOAM?
If you use PBS, your submission script should look like:
#PBS -l nodes=2:ppn=8,walltime=24:00:00
$MPI_ARCH_PATH/bin/mpirun -np $NPROCS --hostfile $PBS_NODEFILE $SOLVER -parallel
You can also try passing the -nooversubscribe flag to mpirun.
I figured out the problem: I was not using the -hostfile option. I was passing only -np $NPROCS, and without -hostfile mpirun started all the processes on the first node of the allocation.
Now the parallel run works properly.
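For anyone hitting the same symptom, a quick way to see what the scheduler actually handed to the job is to inspect the node file before calling mpirun. The following is only a minimal sketch: the helper names count_slots and slots_per_host are hypothetical, and it assumes the node file lists one hostname per line, repeated once per allocated slot.

```shell
#!/bin/sh
# Hypothetical helpers for a PBS-style node file. Assumption: one hostname
# per line, repeated once per allocated slot.

# Total number of slots (non-empty lines), usable as the value for mpirun -np.
count_slots() {
    grep -c . "$1"
}

# One line per host in the form "<hostname> <slots>", sorted by hostname.
slots_per_host() {
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}
```

Inside the job script this could be used as NPROCS=$(count_slots "$PBS_NODEFILE"), followed by slots_per_host "$PBS_NODEFILE" to print the allocation. For a request of nodes=2:ppn=8 one would expect two hostnames with 8 slots each; if the file looks right but everything still runs on one node, the node file is probably not being passed to mpirun.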
Thanks for the response!
Hey quartzian and akidess:
I have run into a similar problem; please give me some advice. Thanks a lot.
The cluster has gcc 4.4.1 and Open MPI 1.3.3.
I must use the compilers installed on the cluster. I am not allowed to use any compiler from the ThirdParty directory when the code runs on multiple nodes.
So I need to compile OpenFOAM with the cluster's compiler.
I have to set it up in my home directory myself.
First question: are gcc 4.4.1 and Open MPI 1.3.3 sufficient to compile OpenFOAM 1.5?
Second question: if they are, how do I do it?
I have changed the compiler option and the MPI setup option, as follows:
# WM_COMPILER_INST = OpenFOAM | System
# WM_COMPILER_INST=OpenFOAM (original)
case "$WM_MPLIB" in
# export MPI_HOME=$WM_THIRD_PARTY_DIR/$mpi_version
# export MPI_HOME=$WM_THIRD_PARTY_DIR/$mpi_version (original)
# export MPI_ARCH_PATH=$MPI_HOME/platforms/$WM_OPTIONS (original)
export MPI_ARCH_PATH=/opt/openmpi/1.3.3/gcc-4.4.1 (changed)
It compiles, but the following messages appear:
Note: ignore spurious warnings about missing mpicxx.h headers
+ wmake libso mpi
Making dependency list for source file OPwrite.C
could not open file ompi/mpi/cxx/pmpicxx.h for source file OPwrite.C
could not open file ompi/mpi/cxx/constants.h for source file OPwrite.C
could not open file ompi/mpi/cxx/functions.h for source file OPwrite.C
could not open file ompi/mpi/cxx/datatype.h for source file OPwrite.C
could not open file ompi/mpi/cxx/exception.h for source file OPwrite.C
Could anyone give me some ideas about this?
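The build output above itself says to ignore spurious warnings about missing mpicxx.h headers, so these messages may be harmless. One sanity check that can still be worth doing is confirming that the directory assigned to MPI_ARCH_PATH really looks like an MPI installation. The sketch below is a hypothetical helper (check_mpi_prefix is not part of OpenFOAM or Open MPI), and the three paths it checks are assumptions about a typical install layout.

```shell
#!/bin/sh
# Hypothetical helper: report whether an MPI install prefix contains the
# pieces a build usually needs. The checked paths are assumptions about a
# typical layout, not requirements stated by OpenFOAM.
check_mpi_prefix() {
    prefix=$1
    status=0
    for item in bin/mpirun include/mpi.h lib; do
        if [ -e "$prefix/$item" ]; then
            echo "found   $item"
        else
            echo "missing $item"
            status=1
        fi
    done
    # Non-zero exit status if anything was missing.
    return $status
}
```

With the setting shown earlier, it would be called as check_mpi_prefix /opt/openmpi/1.3.3/gcc-4.4.1. Any "missing" line suggests that MPI_ARCH_PATH points somewhere other than the real installation prefix.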