Problem running parallel Fluent on a Linux cluster

Old   July 22, 2009, 16:33
Default problem of running parallel Fluent on linux cluster
  #1
Member
 
Ivan
Join Date: May 2009
Posts: 85
Rep Power: 8
ivanbuz is on a distinguished road
The case runs fine if I request several processors on the SAME node, but if the processors are on different nodes, I get a "Connection refused" error.

I searched online and saw that some people have had a similar problem, but I could not find a solution to this specific one. The output from Fluent and the submission script are attached below.

Thanks in advance!


OUTPUT FROM FLUENT
-----------------------------------------
/opt/hpc/Fluent.Inc/fluent6.3.26/bin/fluent -r6.3.26 -pib -cnf=/var/spool/PBS/aux//3666504.cmgr01 -g 2ddp -t6 -i test2.jou
/opt/hpc/Fluent.Inc/fluent6.3.26/cortex/lnamd64/cortex.3.7.3 -f fluent -g -i test2.jou (fluent "2ddp -pib -host -r6.3.26 -t6 -mpi=hp -cnf=/var/spool/PBS/aux//3666504.cmgr01 -path/opt/hpc/Fluent.Inc")
Loading "/opt/hpc/Fluent.Inc/fluent6.3.26/lib/fluent.dmp.114-64"
Done.
/opt/hpc/Fluent.Inc/fluent6.3.26/bin/fluent -r6.3.26 2ddp -pib -host -t6 -mpi=hp -cnf=/var/spool/PBS/aux//3666504.cmgr01 -path/opt/hpc/Fluent.Inc -cx scw-029.i:59263:37434
Starting /opt/hpc/Fluent.Inc/fluent6.3.26/lnamd64/2ddp_host/fluent.6.3.26 host -cx scw-029.i:59263:37434 "(list (rpsetvar (QUOTE parallel/function) "fluent 2ddp -node -r6.3.26 -t6 -pib -mpi=hp -cnf=/var/spool/PBS/aux//3666504.cmgr01 ") (rpsetvar (QUOTE parallel/rhost) "") (rpsetvar (QUOTE parallel/ruser) "") (rpsetvar (QUOTE parallel/nprocs_string) "6") (rpsetvar (QUOTE parallel/auto-spawn?) #t) (rpsetvar (QUOTE parallel/trace-level) 0) (rpsetvar (QUOTE parallel/remote-shell) 0) (rpsetvar (QUOTE parallel/path) "/opt/hpc/Fluent.Inc") (rpsetvar (QUOTE parallel/hostsfile) "/var/spool/PBS/aux//3666504.cmgr01") )"
Welcome to Fluent 6.3.26
Copyright 2006 Fluent Inc.
All Rights Reserved
Loading "/opt/hpc/Fluent.Inc/fluent6.3.26/lib/flprim.dmp.1119-64"
Done.

Host spawning Node 0 on machine "scw-029" (unix).
/opt/hpc/Fluent.Inc/fluent6.3.26/bin/fluent -r6.3.26 2ddp -node -t6 -pib -mpi=hp -cnf=/var/spool/PBS/aux//3666504.cmgr01 -mport 192.168.2.129:192.168.2.129:34193:0
Starting /opt/hpc/Fluent.Inc/fluent6.3.26/multiport/mpi/lnamd64/hp/bin/mpirun -prot -vapi -e MPI_HASIC_VAPI=1 -e MPI_USE_MALLOPT_SBRK_PROTECTION=1 -e MPI_USE_MALLOPT_AVOID_MMAP=1 -f /tmp/fluent-appfile.32087
192.168.2.135: Connection refused
mpirun: Warning one more more remote shell commands exited with non-zero status, which may indicate a remote access problem.





SUBMISSION SCRIPT
-----------------------------------------
#!/bin/sh
#PBS -j oe
#PBS -l nodes=2:ppn=3
#PBS -q main
#PBS -l walltime=00:10:00
cd ${PBS_O_WORKDIR}
cat ${PBS_NODEFILE}
# Set variables for the script
# Which version of the solver to use
FLUENTSOLVER=2ddp
# How many CPUs to use; note that you still need to update the "#PBS -l nodes" line to match
CPUCOUNT=6
# Which input journal file to give Fluent
#INPUT=${PBS_O_WORKDIR}/${PBS_JOBNAME}
INPUT=test2.jou
# Where to write the output
OUTPUT=${PBS_O_WORKDIR}/${PBS_JOBID}.out

# Run Fluent with:
# -pib use InfiniBand parallel
# -cnf=$PBS_NODEFILE get the list of machines PBS is running on from the server
# -t$CPUCOUNT use $CPUCOUNT CPUs total
# -g no graphics, batch mode
# -i read the file in $INPUT
# > $OUTPUT 2>&1 redirect program output (stdout and stderr) to the file $OUTPUT
fluent $FLUENTSOLVER -t$CPUCOUNT -pib -cnf=$PBS_NODEFILE -g -i $INPUT > $OUTPUT 2>&1
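
One possible way to avoid keeping CPUCOUNT in sync with the "#PBS -l nodes" line by hand is to derive it from the node file. This is only a sketch: it assumes the PBS node file contains one line per allocated processor slot (so nodes=2:ppn=3 would give six lines), and count_slots is a hypothetical helper name, not part of PBS or Fluent.

```shell
# Count allocated slots from a PBS node file.
# Assumption: the file holds one line per allocated processor slot.
count_slots() {
    # grep -c . prints the number of non-empty lines in the file
    grep -c . "$1"
}

# Inside a PBS job, PBS_NODEFILE points at the node file.
if [ -n "${PBS_NODEFILE:-}" ]; then
    CPUCOUNT=$(count_slots "$PBS_NODEFILE")
    echo "CPUCOUNT=$CPUCOUNT"
fi
```

Outside a PBS job the last part does nothing, so the fragment can be sourced safely for testing.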

Last edited by ivanbuz; July 22, 2009 at 16:35. Reason: display error

Old   July 23, 2009, 01:32
Default
  #2
Super Moderator
 
-mAx-'s Avatar
 
Maxime Perelli
Join Date: Mar 2009
Location: Switzerland
Posts: 2,973
Rep Power: 30
-mAx- will become famous soon enough
Are you able to connect to your nodes with rsh or ssh?
__________________
In memory of my friend Hervé: CFD engineer & freerider

Old   July 23, 2009, 03:15
Default
  #3
Member
 
Ivan
Join Date: May 2009
Posts: 85
Rep Power: 8
ivanbuz is on a distinguished road
Yes, I have access to all nodes using SSH.

Old   July 23, 2009, 03:44
Default
  #4
Super Moderator
 
-mAx-'s Avatar
 
Maxime Perelli
Join Date: Mar 2009
Location: Switzerland
Posts: 2,973
Rep Power: 30
-mAx- will become famous soon enough
So passwords aren't required?

Old   July 23, 2009, 14:07
Default
  #5
Member
 
Ivan
Join Date: May 2009
Posts: 85
Rep Power: 8
ivanbuz is on a distinguished road
I log into the cluster before submitting the job, so there should be no problem with the password.

Old   July 23, 2009, 14:10
Default
  #6
Member
 
Ivan
Join Date: May 2009
Posts: 85
Rep Power: 8
ivanbuz is on a distinguished road
If anyone from Fluent Inc. sees this thread, could you take a look?

Old   July 24, 2009, 01:19
Default
  #7
Super Moderator
 
-mAx-'s Avatar
 
Maxime Perelli
Join Date: Mar 2009
Location: Switzerland
Posts: 2,973
Rep Power: 30
-mAx- will become famous soon enough
I am not a Linux or parallel-Fluent expert, but you said that you are using SSH.
SSH needs a password.
If you run a command on a node, what is the result?
> ssh ip-address ls
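
An interactive "ssh host ls" from a login session does not necessarily show what a batch job sees. A minimal sketch of a non-interactive check over every allocated node follows, assuming OpenSSH and a PBS node file. check_nodes and REMOTE_SHELL are hypothetical names used for illustration only. BatchMode=yes makes ssh fail instead of prompting for a password.

```shell
#!/bin/sh
# Remote shell command to test. Assumption: OpenSSH client is installed.
# BatchMode=yes disables password prompts; ConnectTimeout avoids long hangs.
REMOTE_SHELL=${REMOTE_SHELL:-ssh -o BatchMode=yes -o ConnectTimeout=5}

# check_nodes FILE: try a trivial remote command on each distinct host in FILE.
# Prints "<host> ok" or "<host> FAILED"; returns 1 if any host failed.
check_nodes() {
    failed=0
    for host in $(sort -u "$1"); do
        if $REMOTE_SHELL "$host" true </dev/null >/dev/null 2>&1; then
            echo "$host ok"
        else
            echo "$host FAILED"
            failed=1
        fi
    done
    return $failed
}

# Inside a PBS job, check the hosts that were actually allocated.
if [ -n "${PBS_NODEFILE:-}" ]; then
    check_nodes "$PBS_NODEFILE"
fi
```

Running this from within the job script, rather than from the login node, tests the same node-to-node path that the parallel launcher has to use.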

Old   July 24, 2009, 15:10
Default
  #8
Member
 
Ivan
Join Date: May 2009
Posts: 85
Rep Power: 8
ivanbuz is on a distinguished road
Hi mAx, there is no problem. The output is similar to the following:

bushivan@scw-097:~>ssh 192.166.2.195 ls
airfoil
airfoil_ins_eig
default_id.dbs
...

Old   July 24, 2009, 16:35
Default
  #9
Super Moderator
 
-mAx-'s Avatar
 
Maxime Perelli
Join Date: Mar 2009
Location: Switzerland
Posts: 2,973
Rep Power: 30
-mAx- will become famous soon enough
Does your cluster have an administrator, or did you set it up yourself?

Old   July 24, 2009, 17:43
Default
  #10
Member
 
Ivan
Join Date: May 2009
Posts: 85
Rep Power: 8
ivanbuz is on a distinguished road
The cluster is operated by the HPC center; I just submit my jobs there. I actually talked to the HPC staff, and they said something like the MPI used by the cluster is not compatible with Fluent, but I am not sure they are right. So I posted my problem here to see whether anyone has encountered the same thing.

Old   July 25, 2009, 02:15
Default
  #11
Super Moderator
 
-mAx-'s Avatar
 
Maxime Perelli
Join Date: Mar 2009
Location: Switzerland
Posts: 2,973
Rep Power: 30
-mAx- will become famous soon enough
Then they are the right people to solve your problem.
But it would be a pity not to be able to use Fluent's parallel capability on this cluster.

Old   March 10, 2010, 16:13
Default
  #12
New Member
 
Ibad Kureshi
Join Date: Mar 2010
Posts: 5
Rep Power: 7
ibnkureshi is on a distinguished road
I know this is late, but you have to add '-ssh' to the Fluent command line in the submission file. That forces Fluent to use ssh rather than rsh, which it always uses by default.
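
Applied to the submission script from the first post, the suggestion might look like the sketch below. The variable values are copied from that script. The fallback values (hosts.txt, job, the current directory) and the dry-run branch are assumptions added only so the fragment can be tried outside a PBS job.

```shell
# Same variables as the submission script in the first post.
FLUENTSOLVER=2ddp
CPUCOUNT=6
INPUT=test2.jou
# Fallbacks after ":-" are illustrative assumptions for use outside PBS.
OUTPUT=${PBS_O_WORKDIR:-.}/${PBS_JOBID:-job}.out
HOSTFILE=${PBS_NODEFILE:-hosts.txt}

# -ssh: per the post above, makes Fluent use ssh rather than rsh.
FLUENT_CMD="fluent $FLUENTSOLVER -t$CPUCOUNT -pib -ssh -cnf=$HOSTFILE -g -i $INPUT"

if [ -n "${PBS_NODEFILE:-}" ]; then
    # Inside a PBS job: run Fluent and capture stdout and stderr.
    $FLUENT_CMD > "$OUTPUT" 2>&1
else
    # Outside PBS: only show the command that would be run.
    echo "dry run: $FLUENT_CMD"
fi
```

The only change relative to the original command line is the added -ssh option; everything else is kept as posted.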

All times are GMT -4.