problem of running parallel Fluent on linux cluster |
July 22, 2009, 16:33 |
problem of running parallel Fluent on linux cluster
|
#1 |
Member
Ivan
Join Date: May 2009
Posts: 85
Rep Power: 17 |
The case runs fine if I request several processors on the SAME node, but if the processors are on different nodes, I get a "Connection refused" error.
I searched online and saw that some people have had a similar problem, but I could not find a solution to this specific one. The output from Fluent and the submission script are attached below. Thanks in advance!

OUTPUT FROM FLUENT
-----------------------------------------
/opt/hpc/Fluent.Inc/fluent6.3.26/bin/fluent -r6.3.26 -pib -cnf=/var/spool/PBS/aux//3666504.cmgr01 -g 2ddp -t6 -i test2.jou
/opt/hpc/Fluent.Inc/fluent6.3.26/cortex/lnamd64/cortex.3.7.3 -f fluent -g -i test2.jou (fluent "2ddp -pib -host -r6.3.26 -t6 -mpi=hp -cnf=/var/spool/PBS/aux//3666504.cmgr01 -path/opt/hpc/Fluent.Inc")
Loading "/opt/hpc/Fluent.Inc/fluent6.3.26/lib/fluent.dmp.114-64"
Done.
/opt/hpc/Fluent.Inc/fluent6.3.26/bin/fluent -r6.3.26 2ddp -pib -host -t6 -mpi=hp -cnf=/var/spool/PBS/aux//3666504.cmgr01 -path/opt/hpc/Fluent.Inc -cx scw-029.i:59263:37434
Starting /opt/hpc/Fluent.Inc/fluent6.3.26/lnamd64/2ddp_host/fluent.6.3.26 host -cx scw-029.i:59263:37434 "(list (rpsetvar (QUOTE parallel/function) "fluent 2ddp -node -r6.3.26 -t6 -pib -mpi=hp -cnf=/var/spool/PBS/aux//3666504.cmgr01 ") (rpsetvar (QUOTE parallel/rhost) "") (rpsetvar (QUOTE parallel/ruser) "") (rpsetvar (QUOTE parallel/nprocs_string) "6") (rpsetvar (QUOTE parallel/auto-spawn?) #t) (rpsetvar (QUOTE parallel/trace-level) 0) (rpsetvar (QUOTE parallel/remote-shell) 0) (rpsetvar (QUOTE parallel/path) "/opt/hpc/Fluent.Inc") (rpsetvar (QUOTE parallel/hostsfile) "/var/spool/PBS/aux//3666504.cmgr01") )"
Welcome to Fluent 6.3.26
Copyright 2006 Fluent Inc.
All Rights Reserved
Loading "/opt/hpc/Fluent.Inc/fluent6.3.26/lib/flprim.dmp.1119-64"
Done.
Host spawning Node 0 on machine "scw-029" (unix).
/opt/hpc/Fluent.Inc/fluent6.3.26/bin/fluent -r6.3.26 2ddp -node -t6 -pib -mpi=hp -cnf=/var/spool/PBS/aux//3666504.cmgr01 -mport 192.168.2.129:192.168.2.129:34193:0
Starting /opt/hpc/Fluent.Inc/fluent6.3.26/multiport/mpi/lnamd64/hp/bin/mpirun -prot -vapi -e MPI_HASIC_VAPI=1 -e MPI_USE_MALLOPT_SBRK_PROTECTION=1 -e MPI_USE_MALLOPT_AVOID_MMAP=1 -f /tmp/fluent-appfile.32087
192.168.2.135: Connection refused
mpirun: Warning one more more remote shell commands exited with non-zero status, which may indicate a remote access problem.

SUBMISSION SCRIPT
-----------------------------------------
#!/bin/sh
#PBS -j oe
#PBS -l nodes=2:ppn=3
#PBS -q main
#PBS -l walltime=00:10:00
cd ${PBS_O_WORKDIR}
cat ${PBS_NODEFILE}
#Set variables for script
# What version of the solver to use
FLUENTSOLVER=2ddp
#HOW MANY CPUS- note that you'll still need to update the $PBS -l nodes line
CPUCOUNT=6
#Which input journal file to use to give fluent?
#INPUT=${PBS_O_WORKDIR}/${PBS_JOBNAME}
INPUT=test2.jou
#Where do we want to put output at?
OUTPUT=${PBS_O_WORKDIR}/${PBS_JOBID}.out
# Run Fluent with:
# -pib use Infiniband parallel
# -cnf=$PBS_NODEFILE get the list of machines PBS is running on from the server
# -t$CPUCOUNT use $CPUCOUNT CPUs total
# -g no graphics, batch mode
# -i read the file in $INPUT
# > $OUTPUT 2>&1 Redirect program output to a file in your home directory.
fluent $FLUENTSOLVER -t$CPUCOUNT -pib -cnf=$PBS_NODEFILE -g -i $INPUT > $OUTPUT 2>&1

Last edited by ivanbuz; July 22, 2009 at 16:35. Reason: display error |
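The script above hard-codes CPUCOUNT=6 and notes that the #PBS -l nodes line has to be kept in sync by hand. A minimal sketch of one way around that, assuming a Torque-style PBS that writes one line per allocated slot into $PBS_NODEFILE (other schedulers may lay the file out differently). The sample host names written outside a PBS job are made up for the dry run.

```shell
#!/bin/sh
# Sketch: derive the process count from the PBS node file instead of hard-coding it.
# Outside a PBS job PBS_NODEFILE is unset, so a hypothetical sample file is created.
if [ -z "${PBS_NODEFILE:-}" ]; then
    PBS_NODEFILE=$(mktemp)
    printf 'node-a\nnode-a\nnode-a\nnode-b\nnode-b\nnode-b\n' > "$PBS_NODEFILE"
fi

# Assumption: one line per allocated slot, so the line count is the total process count.
CPUCOUNT=$(wc -l < "$PBS_NODEFILE" | tr -d ' ')
# Number of distinct hosts, useful for spotting a multi-node run.
NODECOUNT=$(sort -u "$PBS_NODEFILE" | wc -l | tr -d ' ')

echo "Fluent would be started with -t$CPUCOUNT across $NODECOUNT node(s)"
```

With this, only the #PBS -l nodes line needs editing when the job size changes.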
|
July 23, 2009, 01:32 |
|
#2 |
Super Moderator
Maxime Perelli
Join Date: Mar 2009
Location: Switzerland
Posts: 3,297
Rep Power: 41 |
Are you able to connect to your nodes with rsh or ssh?
__________________
In memory of my friend Hervé: CFD engineer & freerider |
|
July 23, 2009, 03:15 |
|
#3 |
Member
Ivan
Join Date: May 2009
Posts: 85
Rep Power: 17 |
Yes, I have access to all nodes using SSH.
|
|
July 23, 2009, 03:44 |
|
#4 |
Super Moderator
Maxime Perelli
Join Date: Mar 2009
Location: Switzerland
Posts: 3,297
Rep Power: 41 |
And no passwords are required?
__________________
In memory of my friend Hervé: CFD engineer & freerider |
|
July 23, 2009, 14:07 |
|
#5 |
Member
Ivan
Join Date: May 2009
Posts: 85
Rep Power: 17 |
I log into the cluster before submitting the job, so there should be no problem with passwords.
|
|
July 23, 2009, 14:10 |
|
#6 |
Member
Ivan
Join Date: May 2009
Posts: 85
Rep Power: 17 |
If anyone from Fluent Inc. sees this thread, could you take a look?
|
|
July 24, 2009, 01:19 |
|
#7 |
Super Moderator
Maxime Perelli
Join Date: Mar 2009
Location: Switzerland
Posts: 3,297
Rep Power: 41 |
I am not a Linux or parallel-Fluent expert, but you said that you are using SSH.
SSH asks for a password unless key-based (passwordless) login is set up. If you run a command on a node, what is the result?
> ssh ip-address ls
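A rough sketch of how that check could be run against every node at once. BatchMode and ConnectTimeout are standard OpenSSH client options; the fallback file name hosts.txt is an assumption for running it outside a PBS job.

```shell
#!/bin/sh
# Sketch: check that each host in a node file accepts non-interactive ssh logins.
# NODEFILE falls back to a hypothetical hosts.txt when run outside a PBS job.
NODEFILE=${PBS_NODEFILE:-hosts.txt}

check_host() {
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1; then
        echo "$1: ok"
    else
        echo "$1: FAILED (password prompt, refused connection, or unreachable)"
    fi
}

if [ -r "$NODEFILE" ]; then
    # Each host is listed once per slot, so de-duplicate before testing.
    for host in $(sort -u "$NODEFILE"); do
        check_host "$host"
    done
else
    echo "node file $NODEFILE not found; nothing to check"
fi
```

Any host reported as FAILED would also fail when a parallel launcher tries to start a process there without a terminal to type a password into.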
__________________
In memory of my friend Hervé: CFD engineer & freerider |
|
July 24, 2009, 15:10 |
|
#8 |
Member
Ivan
Join Date: May 2009
Posts: 85
Rep Power: 17 |
Hi mAx, there is no problem. The output is similar to the following:
bushivan@scw-097:~>ssh 192.166.2.195 ls
airfoil
airfoil_ins_eig
default_id.dbs
... |
|
July 24, 2009, 16:35 |
|
#9 |
Super Moderator
Maxime Perelli
Join Date: Mar 2009
Location: Switzerland
Posts: 3,297
Rep Power: 41 |
Does your cluster have an administrator, or did you set it up yourself?
__________________
In memory of my friend Hervé: CFD engineer & freerider |
|
July 24, 2009, 17:43 |
|
#10 |
Member
Ivan
Join Date: May 2009
Posts: 85
Rep Power: 17 |
The cluster is operated by the HPC center; I just submit my jobs there. I did talk to the HPC staff, and they said something like the MPI used on the cluster is not compatible with Fluent, but I am not sure they are right. So I posted my problem here to see if anyone has encountered the same thing.
|
|
July 25, 2009, 02:15 |
|
#11 |
Super Moderator
Maxime Perelli
Join Date: Mar 2009
Location: Switzerland
Posts: 3,297
Rep Power: 41 |
Then they are the right people to solve your problem.
It would be a pity, though, not to be able to use Fluent's parallel capability on this cluster.
__________________
In memory of my friend Hervé: CFD engineer & freerider |
|
March 10, 2010, 15:13 |
|
#12 |
New Member
Ibad Kureshi
Join Date: Mar 2010
Posts: 5
Rep Power: 16 |
I know this is late, but you have to add '-ssh' to the fluent command line in the submission file. That forces Fluent to use ssh rather than rsh, which it always uses by default.
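As a sketch of what that change could look like, here is the launch line from the submission script in post #1 with -ssh added. The variable values are copied from that script; the hosts.txt and dryrun fallbacks are assumptions so the snippet can be tried outside a PBS job, where it only prints the command.

```shell
#!/bin/sh
# Sketch only: same variables as the submission script in post #1.
FLUENTSOLVER=2ddp
CPUCOUNT=6
INPUT=test2.jou
# PBS sets these inside a job; the fallbacks are hypothetical placeholders for a dry run.
NODEFILE=${PBS_NODEFILE:-hosts.txt}
OUTPUT=${PBS_JOBID:-dryrun}.out

# -ssh (from this post) asks Fluent to spawn remote processes over ssh instead of rsh.
FLUENT_CMD="fluent $FLUENTSOLVER -t$CPUCOUNT -pib -ssh -cnf=$NODEFILE -g -i $INPUT"
echo "$FLUENT_CMD"

# Only launch when the fluent executable is actually on PATH.
if command -v fluent >/dev/null 2>&1; then
    $FLUENT_CMD > "$OUTPUT" 2>&1
fi
```

Whether this is enough depends on the cluster: the compute nodes still have to accept passwordless ssh from each other.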
|
|
November 16, 2016, 13:38 |
Running job on HPC
|
#13 |
New Member
Attaullah
Join Date: Aug 2016
Posts: 23
Rep Power: 10 |
Hi everyone,
I have access to the supercomputer facility at my university, and I have a node on which to run my simulation. The problem is that I do not know how to set up my case using commands. I want Fluent on Linux to read my mesh or case file, but I am running into problems. Please help me in this regard. |
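One common way to drive Fluent without the GUI is a journal file passed with -i, as in post #1. A minimal sketch, assuming the text-interface commands below are available in your Fluent version; the file names mycase.cas, result.cas and run.jou and the iteration count are hypothetical.

```shell
#!/bin/sh
# Sketch: write a minimal Fluent journal file and, if fluent is installed, run it in batch mode.
# mycase.cas, result.cas and run.jou are hypothetical names.
cat > run.jou <<'EOF'
; read the case file (mesh plus settings)
/file/read-case mycase.cas
; initialize the flow field and iterate
/solve/initialize/initialize-flow
/solve/iterate 100
; save case and data, then quit
/file/write-case-data result.cas
/exit
; answer to a possible confirmation prompt (assumption)
yes
EOF

# -g (no graphics) and -i (read journal) are the flags used in post #1.
if command -v fluent >/dev/null 2>&1; then
    fluent 2ddp -g -i run.jou > run.out 2>&1
else
    echo "fluent not found on PATH; journal written to run.jou"
fi
```

The solver name (2ddp here) has to match the case, for example 3d or 3ddp for a three-dimensional mesh.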
|
March 6, 2017, 00:39 |
|
#14 | |
New Member
Moulish Kommu
Join Date: Jan 2017
Posts: 8
Rep Power: 9 |
Quote:
Hi, install MobaXterm and connect to your cluster; you will then be able to compile. |
||
March 6, 2017, 02:01 |
|
#15 |
New Member
Attaullah
Join Date: Aug 2016
Posts: 23
Rep Power: 10 |
||
September 23, 2017, 19:12 |
|
#16 |
New Member
pezhman
Join Date: Aug 2015
Posts: 8
Rep Power: 11 |
||
Tags |
cluster, fluent, parallel |