Script to Run Parallel Jobs in Rocks Cluster

#1  a a saha, January 10, 2009, 13:19
Hello All,

Is there a simple script available for running OpenFOAM 1.5 parallel jobs on a CentOS PE-Rocks cluster with SGE?

I am running serial jobs in the cluster using

qsub -cwd -S /bin/bash ser_script.sh

ser_script.sh

#!/bin/bash
. $HOME/OpenFOAM/OpenFOAM-1.5/etc/bashrc
interFoam

and for parallel jobs I used

qsub -cwd -S /bin/bash par_script.sh

par_script.sh

#!/bin/bash
. $HOME/OpenFOAM/OpenFOAM-1.5/etc/bashrc
mpirun -np 2 interFoam -parallel

It seems my parallel script is not working well: the clock time for the parallel job (32797 s) is very high compared with that of the serial job (1385 s).

To check, I ran the above serial (1588 s) and parallel (697 s) cases on the master node and observed that the case runs well in parallel mode.
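For reference, the speed-up implied by these timings (serial time divided by parallel time) can be worked out as below. This is only a small sketch using awk for the floating-point division; speedup is a made-up helper name, not part of OpenFOAM or SGE.

```shell
# speedup SERIAL_SECONDS PARALLEL_SECONDS
# Prints serial time divided by parallel time, rounded to two decimals.
speedup() {
    awk -v s="$1" -v p="$2" 'BEGIN { printf "%.2f\n", s / p }'
}

speedup 1588 697     # master node: serial vs. 2-process parallel
speedup 1385 32797   # through qsub: serial vs. 2-process parallel
```

On the master node this gives about 2.28, whereas through qsub it gives about 0.04, i.e. the parallel job is roughly 24 times slower than the serial one.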

I would like to know what I need to put in the parallel script so that the case gives a good parallel speed-up when submitted through qsub.

Please advise.


Thanks and regards,

a a saha

#2  Velan, January 11, 2009, 01:22
Parallel submission requires the hostnames of the machines the job will run on. With PBS (on Rocks), the usual form is

mpirun -hostfile $PBS_NODEFILE -np `cat $PBS_NODEFILE | wc -l` interFoam -parallel > output.log

I have never tried SGE, but you can do the same by supplying the machine names yourself, like

mpirun -hostfile machinefile -np 2 interFoam -parallel > output.log

Here machinefile should contain the hostnames of the compute nodes (compute-node-01, compute-node-02, ...).
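A minimal sketch of the PBS form, with the process count taken from the number of lines in the node file. pbs_mpirun_cmd is a made-up helper that only prints the command line rather than running it, so it can be inspected before use.

```shell
# pbs_mpirun_cmd NODEFILE
# Print an mpirun command line that starts one process per line of NODEFILE.
pbs_mpirun_cmd() {
    nodefile="$1"
    # awk prints a bare number; wc -l pads its output on some systems.
    nprocs=$(awk 'END { print NR }' "$nodefile")
    echo "mpirun -hostfile $nodefile -np $nprocs interFoam -parallel"
}

# Inside a PBS job one might then run (untested on a real cluster):
#   eval "$(pbs_mpirun_cmd "$PBS_NODEFILE")" > output.log
```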

#3  a a saha, January 11, 2009, 03:08
Hello Velan,

Thanks for your post. Does the script for PBS (Rocks) optimise the allocation of processors for parallel jobs?

For SGE, by contrast, the indicated script will not allocate processors optimally, since it will simply take the first two machines listed in the machinefile and start processing.

Please correct me if I am missing something here.

Regards,

a a saha.

#4  Ville Tossavainen, January 11, 2009, 07:02
Hi Saha,

First of all, are you maintaining the cluster yourself? If not, you should ask your system administrator how to run parallel jobs on SGE. They can tell you how SGE is configured: what is supported and what is not.

I presume you use OpenMPI, which is the default MPI in OpenFOAM. In that case you need to request a parallel environment in the script (on our cluster I request the PE named "openmpi" with 8 slots):

#$ -pe openmpi 8
#$ -R y

The OpenMPI-specific way to call OpenFOAM is:

mpirun -np $NSLOTS interFoam -parallel

In your script SGE allocated only one slot, but OpenFOAM ran with two processes. That must have slowed the system.
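Putting the pieces together, a complete job script might look like the output of the sketch below. write_sge_job is a hypothetical generator function; the PE name and slot count are arguments because PE names differ between clusters, and the bashrc path is the one used earlier in this thread.

```shell
# write_sge_job PE_NAME SLOTS
# Print an SGE job script that requests SLOTS slots in PE_NAME and starts
# one interFoam process per granted slot.
write_sge_job() {
    cat <<EOF
#!/bin/bash
#$ -cwd
#$ -S /bin/bash
#$ -pe $1 $2
. \$HOME/OpenFOAM/OpenFOAM-1.5/etc/bashrc
mpirun -np \$NSLOTS interFoam -parallel
EOF
}

# Example (not run here): write_sge_job openmpi 8 > par_script.sh && qsub par_script.sh
```

Because the process count comes from $NSLOTS, the number of MPI processes always matches what SGE actually granted. The case must still be decomposed into the same number of subdomains.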

Hope this helps,
Ville

#5  a a saha, January 11, 2009, 13:45
Hello Ville,

Thanks for your advice. I checked the cluster documentation, which says

for parallel jobs on n processors, use:

$ qsub -cwd -S /bin/bash -pe mpi n your_script.sh

However, when I execute the following

$ qsub -cwd -S /bin/bash -pe openmpi 2 -R y par_script_new.sh

I get the following error

Unable to run job: job rejected: the requested parallel environment "openmpi" does not exist.

I think there is no support for the openmpi parallel environment on the cluster.

Please suggest a possible workaround.

Regards,

a a saha

#6  Velan, January 11, 2009, 14:01
Hello Saha,

Can you please tell me where your machinefile comes from? Please send me its contents.

Meanwhile, can you tell me whether OpenFOAM is installed/mounted on all compute nodes?

If possible, send me the script and I will try to check it.

#7  a a saha, January 11, 2009, 14:56
Hello Velan,

OpenFOAM is installed in my home directory.
I have two scenarios in which I can still run my parallel cases.

(1) Scenario 1 (no qsub)

I don't have any problem in executing parallel run using the following command:

mpirun -np 2 -hostfile machines interFoam -parallel

where I give the list of compute nodes in the machines file as below

compute-0-5 slots=1
compute-0-7 slots=1

Note that I do not use qsub for the parallel execution here. Clock time: 1305 s.
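Rather than hard-coding -np, the process count could be derived from the machines file. A minimal sketch; total_slots is a made-up helper, and counting a host line without slots= as one slot is an assumption of this sketch rather than a statement about every OpenMPI version.

```shell
# total_slots HOSTFILE
# Sum the slots=N entries of an OpenMPI-style hostfile.
total_slots() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }      # skip blank lines and comments
        {
            n = 1                          # assumption: no slots= means 1
            for (i = 2; i <= NF; i++)
                if ($i ~ /^slots=[0-9]+$/) n = substr($i, 7) + 0
            total += n
        }
        END { print total + 0 }
    ' "$1"
}

# Example (not run here):
#   mpirun -np "$(total_slots machines)" -hostfile machines interFoam -parallel
```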

(2) Scenario 2 (with qsub)

I use the following command

qsub -cwd -S /bin/bash par_script_new.sh

par_script_new.sh

#!/bin/bash
. $HOME/OpenFOAM/OpenFOAM-1.5/etc/bashrc
mpirun -np 2 -hostfile machines interFoam -parallel

The contents of the machines file are the same as in Scenario 1.

In this case qsub allots 1 slot for my job, and the parallel run is executed on two compute nodes as per the machines file. Clock time: 1271 s.
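The mismatch between the single slot granted by qsub and the two processes started by mpirun can be caught inside the job script. A small sketch; slots_ok is a hypothetical helper, and falling back to 1 when NSLOTS is unset is an assumption made so the function also works outside SGE.

```shell
# slots_ok NPROCS
# Succeed only if NPROCS does not exceed the number of slots SGE granted.
slots_ok() {
    granted="${NSLOTS:-1}"   # assumption: treat an unset NSLOTS as one slot
    if [ "$1" -gt "$granted" ]; then
        echo "requested $1 processes but only $granted slot(s) granted" >&2
        return 1
    fi
}

# Example (not run here):
#   slots_ok 2 && mpirun -np 2 -hostfile machines interFoam -parallel
```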


I do not think either of the above methods is an elegant way of running parallel jobs on a cluster.

Is it possible to use the MPI parallel environment with OpenFOAM-1.5 instead of OpenMPI?
I am using the binary version here.

Regards,

a a saha.

#8  Velan, January 11, 2009, 15:52
Hello Saha,

I tried OpenFOAM with MPICH, but I never succeeded on my cluster; I kept getting errors. So I moved to the default OpenMPI.

I feel bad that I couldn't help you resolve this issue. Maybe somebody else can help you solve it.

#9  Ville Tossavainen, January 11, 2009, 17:16
Saha:

In your case I would contact the system administrator and ask which MPI implementations are supported and how you should run OpenFOAM. You shouldn't just bypass the job queue system. The parallel environment "openmpi" is just a name on my cluster, and the configuration can vary.

I have used MPICH and OpenMPI with SGE, but there are many other options. If the system is compatible with OpenMPI, you may need to make a small modification. SGE support was dropped from the default build of OpenMPI some time ago. Adding the option "--with-sge" in "ThirdParty/Allwmake" (at the end of the OpenMPI configure section) enables SGE support. Then run "./Allwmake" in the ThirdParty directory and make sure the compilation is successful.

Ville

#10  a a saha, January 12, 2009, 14:00
Hello Ville,

I could not contact the system administrator today. It seems to me that openmpi, mpich, lam, mpich-hpl and openmpi-hpl are available in the cluster.

Is the following error due to a lack of SGE support in the OpenMPI that ships with OpenFOAM?
------------------------------------------------
Unable to run job: job rejected: the requested parallel environment "openmpi" does not exist.
------------------------------------------------

Please advise whether I must rebuild via "ThirdParty/Allwmake" with SGE support enabled to get rid of the error.

Otherwise I will try OpenFOAM-1.4.1-dev instead, which uses LAM as its default MPI.

Hello Velan:

Thanks for your help.


Regards,

a a saha.

#11  Ville Tossavainen, January 12, 2009, 16:33
The error was just because there was no parallel environment called "openmpi" defined in your SGE system. You should use "-pe mpi n" on the command line instead (like in your example).

I'm not aware of how the actual configurations of parallel environments differ between MPIs; I'm just a user. Usually there is a PE for every MPI.
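To see which PE names a given SGE installation actually defines, "qconf -spl" lists them, one per line. A small sketch for checking a name against that list before submitting; pe_in_list is a hypothetical helper that reads the list from standard input.

```shell
# pe_in_list NAME
# Succeed if NAME appears as a complete line on standard input.
pe_in_list() {
    grep -qxF -- "$1"
}

# Example (not run here):
#   qconf -spl | pe_in_list openmpi || echo "no PE called openmpi on this cluster"
```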

#12  a a saha, January 16, 2009, 01:45
Hello Ville,

I just checked whether SGE support is present in the OpenMPI version available on Rocks and in the one that comes with OpenFOAM-1.5.

Running the following in the OpenFOAM environment shows that SGE support exists:

ompi_info | grep gridengine
MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.6)
MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.6)

SGE support, however, is not there in the OpenMPI installed on Rocks. Probably because of this, qsub is not accepting my parallel jobs.
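To compare the two OpenMPI installations, the check can be wrapped in a small function. A minimal sketch; has_gridengine is a hypothetical helper that only looks for the word "gridengine" in ompi_info output read from standard input, exactly as the grep above does.

```shell
# has_gridengine
# Succeed if the ompi_info output on standard input mentions gridengine.
has_gridengine() {
    grep -q 'gridengine'
}

# Example (not run here), once per OpenMPI installation on the PATH:
#   ompi_info | has_gridengine && echo "SGE support present"
```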

So I installed OpenFOAM-1.4.1-dev, and with the following script I could successfully run parallel jobs on the cluster.

qsub -cwd -S /bin/bash -pe mpi 4 par_script_new_4_cpu.sh

par_script_new_4_cpu.sh:

#!/bin/bash
date

. $HOME/OpenFOAM/OpenFOAM-1.4.1-dev/.OpenFOAM-1.4.1-dev/bashrc
sleep 180
export LAMRSH=ssh
lamboot -d $FOAM_RUN/parallel_test_4_cpu/system/machines
mpirun -np 4 interFoam $FOAM_RUN parallel_test_4_cpu -parallel

I use a sleep time of 180 s, during which I write the nodes allocated by SGE into the machines file.
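The manual step during the sleep could probably be scripted. A minimal sketch, assuming SGE exports $PE_HOSTFILE for the job with one line per host of the form "hostname slots queue processor-range", and that the LAM boot schema accepts "hostname cpu=N" lines. The function name pe_to_lam_schema is made up for illustration.

```shell
# pe_to_lam_schema PE_HOSTFILE
# Convert the per-job host list written by SGE into LAM boot schema
# lines of the form "host cpu=N".
pe_to_lam_schema() {
    awk 'NF >= 2 { print $1 " cpu=" $2 }' "$1"
}

# Inside the job script this would replace the sleep (untested on a real cluster):
#   pe_to_lam_schema "$PE_HOSTFILE" > "$FOAM_RUN/parallel_test_4_cpu/system/machines"
#   lamboot -d "$FOAM_RUN/parallel_test_4_cpu/system/machines"
```

Whether this works as-is depends on how the "mpi" PE is configured on the cluster, so it is worth confirming with the administrator.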

Thanks again for all the help.


Regards,

a a saha.

#13  phuchuynh, July 4, 2012, 22:51
Run OpenFOAM on cluster CentOS
Hi everyone, I am using a CentOS cluster. So far I have run OF (OpenFOAM) in serial on a PC. Now I would like to run it on the cluster, but I don't know how to write the script to run OF. Can anyone help me?
thanks
phuchuynh