Unable to run OF in parallel on a multiple-node cluster |
September 23, 2009, 12:09 |
Unable to run OF in parallel on a multiple-node cluster
|
#1 |
New Member
S Kumar
Join Date: Mar 2009
Posts: 13
Rep Power: 17 |
Hi,
I have been trying to run OF in parallel on our Linux cluster, using decomposeParDict. The cluster has several nodes with 8 processors each. I can run OF in parallel on a single node (up to 8 processors), but when I try to use more than one node, all the processes still end up running on just one node. For example, when I submit a job to a queue (after running decomposePar) for 16 processors on 2 nodes, all 16 processes run on a single node.

All the nodes are NFS-mounted with passwordless access. I have tried both with and without the roots field (even though the filesystem is common), but it made no difference.

Please help.
Thanks, Kumar
|
September 23, 2009, 16:21 |
|
#2 |
Senior Member
Anton Kidess
Join Date: May 2009
Location: Germany
Posts: 1,377
Rep Power: 30 |
How do you execute OpenFOAM?
If you use PBS, your submission script should look like:

    #PBS -l nodes=2:ppn=8,walltime=24:00:00
    cd $PBS_O_WORKDIR
    $MPI_ARCH_PATH/bin/mpirun -np $NPROCS --hostfile $PBS_NODEFILE $SOLVER -parallel

You can also try the -nooversubscribe flag for MPI.
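The script above leaves $NPROCS and $SOLVER to be set by the user. A minimal sketch of deriving the process count from the node file follows. It assumes a Torque-style PBS where the node file lists each host once per requested core, so that nodes=2:ppn=8 gives 16 lines. `build_mpirun_cmd` is a hypothetical helper name, and it only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the mpirun command line for a node file and a solver.
# Assumption: the node file has one line per MPI slot (Torque-style $PBS_NODEFILE),
# so its line count equals the total number of processes to start.
build_mpirun_cmd() {
    nodefile=$1
    solver=$2
    # tr strips the padding that some wc implementations add
    nprocs=$(wc -l < "$nodefile" | tr -d ' ')
    echo "mpirun -np $nprocs --hostfile $nodefile $solver -parallel"
}

# Intended use inside a job script (not executed here; needs a PBS allocation):
#   cd "$PBS_O_WORKDIR"
#   eval "$(build_mpirun_cmd "$PBS_NODEFILE" "$SOLVER")"
```

Printing the command first makes it easy to confirm in the job log that -np matches the number of decomposed subdomains before the solver starts.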
|
September 24, 2009, 03:32 |
|
#3 |
New Member
S Kumar
Join Date: Mar 2009
Posts: 13
Rep Power: 17 |
Hi,
I figured out the problem: I was not using the -hostfile option. I was passing only -np $NPROCS, and without -hostfile mpirun was only picking up the first node listed in the hostfile, so every process started there. Now the parallel run works properly. Thanks for the response! Kumar |
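A quick way to confirm where the processes actually land is to launch `hostname` through mpirun instead of the solver and count the replies per node. The following is a sketch, not taken from the thread: `count_by_host` is a hypothetical helper, and the 16-process command in the comment assumes the 2 x 8 allocation described in the first post.

```shell
#!/bin/sh
# Hypothetical helper: count how many processes reported each hostname.
# Reads one hostname per line on stdin and prints "host count" pairs.
count_by_host() {
    sort | uniq -c | awk '{print $2, $1}'
}

# Intended use inside a PBS job (not executed here; needs mpirun and a node file):
#   mpirun -np 16 --hostfile "$PBS_NODEFILE" hostname | count_by_host
```

If every process is reported under a single host name, the host list is not reaching mpirun.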
|
November 24, 2009, 13:37 |
|
#4 |
Senior Member
xinguang cui
Join Date: Mar 2009
Posts: 116
Rep Power: 17 |
Hey quartzian and akidess:
I have also run into a similar problem; please give me some advice. Thanks a lot.

The cluster has gcc 4.4.1 and Open MPI 1.3.3. I have to use the compilers installed on the cluster; I am not allowed to use any compiler from the ThirdParty directory when the code runs on multiple nodes. So I need to compile OpenFOAM with the cluster's compiler, and I have to set it up in my home directory myself.

First question: are gcc 4.4.1 and Open MPI 1.3.3 enough to compile OpenFOAM 1.5?
Second question: if they are, how do I do it?

I have changed the compiler option and the MPI setup option as follows:

    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    # WM_COMPILER_INST = OpenFOAM | System
    # WM_COMPILER_INST=OpenFOAM    (original)
    WM_COMPILER_INST=System        # changed

    case "$WM_MPLIB" in
    OPENMPI)
        mpi_version=openmpi-1.2.6
        # export MPI_HOME=$WM_THIRD_PARTY_DIR/$mpi_version
        # export MPI_HOME=$WM_THIRD_PARTY_DIR/$mpi_version    (original)
        # export MPI_ARCH_PATH=$MPI_HOME/platforms/$WM_OPTIONS    (original)
        export MPI_ARCH_PATH=/opt/openmpi/1.3.3/gcc-4.4.1    # changed

It does compile, but there are some error messages, as follows:

    Note: ignore spurious warnings about missing mpicxx.h headers
    + wmake libso mpi
    Making dependency list for source file OPwrite.C
    could not open file ompi/mpi/cxx/pmpicxx.h for source file OPwrite.C
    could not open file ompi/mpi/cxx/constants.h for source file OPwrite.C
    could not open file ompi/mpi/cxx/functions.h for source file OPwrite.C
    could not open file ompi/mpi/cxx/datatype.h for source file OPwrite.C
    could not open file ompi/mpi/cxx/exception.h for source file Opwrite.C

Could anyone give me some ideas about this?
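One thing worth ruling out with a hand-edited MPI_ARCH_PATH like the one in this post is a mistyped or incomplete install prefix. The following is a minimal sketch, assuming a conventional Open MPI layout with bin/mpirun and include/mpi.h beneath the prefix. `check_mpi_prefix` is a hypothetical helper name, not part of OpenFOAM or Open MPI, and it does not address the header warnings themselves.

```shell
#!/bin/sh
# Hypothetical helper: report whether a directory looks like an Open MPI install prefix.
# Assumption: a usable prefix contains bin/mpirun and include/mpi.h.
check_mpi_prefix() {
    prefix=$1
    status=0
    for f in bin/mpirun include/mpi.h; do
        if [ ! -e "$prefix/$f" ]; then
            echo "missing: $prefix/$f"
            status=1
        fi
    done
    if [ "$status" -eq 0 ]; then
        echo "ok: $prefix"
    fi
    return "$status"
}

# Intended use (not executed here; the path is the one quoted in the post):
#   check_mpi_prefix /opt/openmpi/1.3.3/gcc-4.4.1
```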