
Running parallel job using qsub on sun grid engine

Old   February 6, 2008, 16:47
  #1
Senior Member
 
Nishant
Join Date: Mar 2009
Location: Glasgow, UK
Posts: 165
Hi all
I need some help with an Open MPI problem.
I am trying to run a program on the AMD64 cluster at my university's computing facility.
My case runs fine with the command:
mpirun -machinefile machine -np 4 case root etc

where the file passed to -machinefile was written by hand. However, I want to run it on the cluster through the qsub command with automatically allocated machines (not necessarily including the master node). For this I wrote the qsub script below. It requests the mpich parallel environment and prints the hostfile/machinefile.

#!/bin/sh
#$ -N MPICH_JOB
#$ -cwd
# Join stdout and stderr
#$ -j y
# pe request for MPICH. Set your number of processors here.
# Make sure you use the "mpich" parallel environment.
#$ -pe mpich 4
#
# Run job through bash shell
#$ -S /bin/bash
#
# The following is for reporting only. It is not really needed
# to run the job. It will show up in your output file.
echo "Got $NSLOTS processors."
echo "Machines:"
# add here code to map regular hostnames into ATM hostnames
#echo $TMPDIR/machines
cat $PE_HOSTFILE
mpirun -machinefile machine -np 4 case root etc

From this script, I get the hostfile contents in this format:
comp03.dcs.hull.ac.uk 1 parallel.q@comp03.dcs.hull.ac.uk <null>
comp29.dcs.hull.ac.uk 1 parallel.q@comp29.dcs.hull.ac.uk <null>
comp11.dcs.hull.ac.uk 1 parallel.q@comp11.dcs.hull.ac.uk <null>
comp09.dcs.hull.ac.uk 1 parallel.q@comp09.dcs.hull.ac.uk <null>

But my Open MPI implementation needs it in this form:
comp00.dcs.hull.ac.uk slots=2 max-slots=2
comp03.dcs.hull.ac.uk slots=2 max-slots=2
comp04.dcs.hull.ac.uk slots=2 max-slots=2
comp05.dcs.hull.ac.uk slots=2 max-slots=2
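
One way to get from the first format to the second is a small conversion step inside the job script. The following is only a sketch: it assumes the first two columns of the SGE hostfile are the hostname and the number of granted slots, as in the listing above, and the function name pe_hostfile_to_openmpi is made up for illustration (it is not part of SGE or Open MPI).

```shell
#!/bin/sh
# Convert an SGE PE hostfile ("host nslots queue processor-range") into
# Open MPI hostfile lines ("host slots=N max-slots=N").
# pe_hostfile_to_openmpi is a hypothetical helper name.
pe_hostfile_to_openmpi() {
    # $1: path to the SGE PE hostfile (normally $PE_HOSTFILE inside a job)
    awk 'NF >= 2 { printf "%s slots=%s max-slots=%s\n", $1, $2, $2 }' "$1"
}

# Possible use inside the job script (commented out because $PE_HOSTFILE,
# $TMPDIR and $NSLOTS are only set inside a running SGE job):
#   pe_hostfile_to_openmpi "$PE_HOSTFILE" > "$TMPDIR/machines"
#   mpirun -machinefile "$TMPDIR/machines" -np "$NSLOTS" case root etc
```

Using $NSLOTS instead of a hard-coded -np 4 keeps the process count in step with whatever the "#$ -pe mpich" line requests.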


Can you please suggest how to handle this? If there is any material I should read, please let me know. Any kind of help would be appreciated.

I would also like to ask the experts: is this possible with the current code?

Looking forward to your help in this regard.

with warm regards,

Nishant Singh
__________________
Thanks and regards,

Nishant

Old   February 7, 2008, 04:33
  #2
Member
 
Michele Vascellari
Join Date: Mar 2009
Posts: 70
Nishant,

To run parallel OpenFOAM jobs under qsub (Torque version) I use the following script:

#!/bin/bash
#PBS -N damBreakFine
#PBS -l nodes=4
CASE=damBreakFine
SOLVER=interFoam

CURDIR=$HOME/OpenFOAM/michele-1.4.1/run/tutorials/interFoam
cd $CURDIR
mpirun --machinefile $PBS_NODEFILE $SOLVER $CURDIR $CASE -parallel

The variable $PBS_NODEFILE holds the path of the file that lists the nodes allocated to the run.

In general, when you use the qsub command you don't know in advance which nodes will be used for the run, so you cannot define the machine file a priori.
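
Because the node list is only known once the job starts, the process count can be read from the same file instead of being typed in. This is a small sketch, assuming the node file contains one hostname per line, repeated once per allocated processor; count_slots is a made-up helper name.

```shell
#!/bin/sh
# count_slots is a hypothetical helper name.
count_slots() {
    # $1: node file with one hostname per line; prints the number of
    # non-empty lines, i.e. the number of allocated processor slots
    grep -c . "$1"
}

# Possible use inside a Torque job (commented out because $PBS_NODEFILE
# only exists inside a running job):
#   NP=$(count_slots "$PBS_NODEFILE")
#   mpirun --machinefile "$PBS_NODEFILE" -np "$NP" $SOLVER $CURDIR $CASE -parallel
```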

Michele

Old   February 7, 2008, 08:02
  #3
Senior Member
 
Nishant
Join Date: Mar 2009
Location: Glasgow, UK
Posts: 165
Thanks Michele,

Unfortunately my cluster does not support PBS, as you can see from my script. Can you suggest something that could replace $PBS_NODEFILE in my case? Or is there a way to make the cluster accept PBS scripts?

Nishant

Old   February 7, 2008, 09:59
  #4
Member
 
Michele Vascellari
Join Date: Mar 2009
Posts: 70
Sorry Nishant,

I didn't notice that you are using Grid Engine as the resource manager rather than Torque.
I'm afraid I don't have any experience with qsub on Grid Engine.

Michele

Old   February 7, 2008, 10:14
  #5
Senior Member
 
Mark Olesen
Join Date: Mar 2009
Location: http://olesenm.github.io/
Posts: 777
Nishant,

What was so wrong with the old thread ( http://www.cfd-online.com/cgi-bin/OpenFOAM_Discus/show.cgi?1/6504 ) that warranted starting a completely new thread for this discussion?

IMO it gave fairly reasonable information and was not exactly out of date.

Old   February 7, 2008, 15:52
  #6
Senior Member
 
Nishant
Join Date: Mar 2009
Location: Glasgow, UK
Posts: 165
Hi Mark

Thanks for the reply. In fact I went through that thread as well, but I could not understand the code in it at first reading. I would appreciate it if you could briefly explain how to run parallel OpenFOAM cases on an SGE cluster using the qsub command. I can see some pieces of code there, but I cannot figure out exactly how to apply them to my case.

Here is what I understand of it. First, I do not really get what this piece of code is doing:

PeHostfile2MachineFile()
{
    cat $1 | while read line; do
        # echo $line
        host=`echo $line|cut -f1 -d" "|cut -f1 -d"."`
        nslots=`echo $line|cut -f2 -d" "`
        i=1
        # while [ $i -le $nslots ]; do
        # # add here code to map regular hostnames into ATM hostnames
        echo $host cpu=$nslots
        # i=`expr $i + 1`
        # done
    done
}
touch OFmachines
PeHostfile2MachineFile $1 | cat >> OFmachines
mhost=`echo $2|cut -f1 -d"."`
echo $mhost >> mhost
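
Read line by line, the quoted function appears to take the first column of each hostfile line, cut the hostname at the first dot, and print it together with the slot count from the second column as "host cpu=N", which looks like the per-host syntax of a LAM boot schema. The sketch below is an equivalent restatement under that reading, not the original author's code; pe_hostfile_to_lam is a made-up name.

```shell
#!/bin/sh
# Restatement of the quoted PeHostfile2MachineFile logic.
# pe_hostfile_to_lam is a hypothetical helper name.
pe_hostfile_to_lam() {
    # $1: path to an SGE PE hostfile ("host nslots queue processor-range")
    while read -r host nslots _rest; do
        # skip empty lines
        [ -n "$host" ] || continue
        # ${host%%.*} drops everything from the first dot (the domain part)
        printf '%s cpu=%s\n' "${host%%.*}" "$nslots"
    done < "$1"
}
```

Given the hostfile shown in the first post, this would turn "comp03.dcs.hull.ac.uk 1 parallel.q@comp03.dcs.hull.ac.uk <null>" into "comp03 cpu=1". The remaining lines of the quoted snippet append that output to OFmachines and store the short name of the host given as the second argument in a file called mhost.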

Again, I do not understand why qFoam-Snippet is required or where to use it, because I am really just looking for a qsub run script. Sorry if I sound very naive.
I understand a bit of the piece of code below, which says:

#!/bin/bash
echo Enter a casename:
read casename
echo "Enter definition WDir:"
read Wdir
#echo Enter Solver :
#read Solver
echo "Number of processors:"
read cpunumb
#
if [ $cpunumb = "1" ]; then
    touch Foam-$casename.sh
    chmod +x Foam-$casename.sh
    echo '#!/bin/bash' >> Foam-$casename.sh
    echo '### SGE ###' >> Foam-$casename.sh
    echo '#$ -S /bin/sh -j y -cwd' >> Foam-$casename.sh
    echo 'read masthost <mhost'>> Foam-$casename.sh
    echo 'ssh $masthost "cd $PWD;'SteadyCompFoam' '$Wdir' '$casename' "' >> Foam-$casename.sh
    echo 'rm -f OFmachines' >> Foam-$casename.sh
    echo 'rm -f mhost' >> Foam-$casename.sh
    echo 'rm -f 'Foam-$casename.sh' ' >> Foam-$casename.sh
    qsub -pe OFnet $cpunumb -masterq tom02.q,tom03.q,tom04.q,tom05.q,tom06.q,tom22.q,tom23.q,tom24.q,tom25.q Foam-$casename.sh
else
    touch Foam-$casename.sh
    chmod +x Foam-$casename.sh
    echo '#!/bin/bash' >> Foam-$casename.sh
    echo '### SGE ###' >> Foam-$casename.sh
    echo '#$ -S /bin/sh -j y -cwd' >> Foam-$casename.sh
    echo 'read masthost <mhost'>> Foam-$casename.sh
    echo 'ssh $masthost "export LAMRSH=ssh;cd $PWD;lamboot -v -s OFmachines"' >> Foam-$casename.sh
    echo 'ssh $masthost "cd $PWD;mpirun -np '$cpunumb' 'SteadyCompFoam' '$Wdir' '$casename' -parallel" ' >> Foam-$casename.sh
    echo 'ssh $masthost "cd $PWD;lamhalt -d"' >> Foam-$casename.sh
    echo 'rm -f OFmachines' >> Foam-$casename.sh
    echo 'rm -f mhost' >> Foam-$casename.sh
    echo 'rm -f 'Foam-$casename.sh' ' >> Foam-$casename.sh
    qsub -pe OFnet $cpunumb -masterq tom02.q,tom03.q,tom04.q,tom05.q,tom06.q,tom22.q,tom23.q,tom24.q,tom25.q Foam-$casename.sh
fi

But I don't see how it can help in my case. What does OFnet mean? Also, it is written for the LAM implementation, whereas I am using Open MPI.

Please suggest how I can proceed here.
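
For orientation, the quoted generator follows a pattern that can be restated without LAM: write a job script containing the "#$" directives and the mpirun line, then hand it to qsub. Since qsub's -pe option takes the name of a parallel environment, OFnet is presumably a parallel environment defined on that particular cluster, playing the role that mpich plays in the first post; the thread itself does not confirm this. The sketch below only prints such a job script. The function name make_sge_job and its three arguments are made up for illustration, and the "slots=N max-slots=N" hostfile conversion is the one asked for in the first post.

```shell
#!/bin/sh
# make_sge_job is a hypothetical helper name.
# $1: parallel environment name (site specific, e.g. mpich)
# $2: number of slots to request
# $3: command line to start under mpirun
make_sge_job() {
    # \$ keeps a literal dollar sign in the generated script;
    # unescaped $1, $2, $3 are filled in now.
    cat <<EOF
#!/bin/bash
#\$ -S /bin/bash
#\$ -cwd
#\$ -j y
#\$ -pe $1 $2
awk '{ print \$1 " slots=" \$2 " max-slots=" \$2 }' "\$PE_HOSTFILE" > "\$TMPDIR/machines"
mpirun -machinefile "\$TMPDIR/machines" -np "\$NSLOTS" $3
EOF
}

# Possible use (commented out because it needs a cluster with qsub):
#   make_sge_job mpich 4 "case root etc" > job.sh
#   qsub job.sh
```

Whether mpirun needs the machinefile at all depends on how Open MPI was built on the cluster, so treat the generated script as a starting point to adapt.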

Nishant

All times are GMT -4.