CFD Online Discussion Forums

aryanet October 15, 2010 01:32

MPI problem with fluent
 
Hi there,
I am new to running Fluent on Linux (CentOS). I have installed Fluent 6.3 on three machines, but when I run the command below:
/data/Fluent.Inc/bin/fluent -g 3d -cnf=/root/host -t4

It ends up with the following output:

Code:

[root@MDS1 ~]# /data/Fluent.Inc/bin/fluent -g 3d -cnf=/root/host -t4
/data/Fluent.Inc/fluent6.3.26/bin/fluent -r6.3.26 -g 3d -cnf=/root/host -t4
/data/Fluent.Inc/fluent6.3.26/cortex/lnamd64/cortex.3.7.3 -f fluent -g (fluent "3d -pethernet  -host -r6.3.26 -t4 -mpi=hp -cnf=/root/host -path/data/Fluent.Inc")
Loading "/data/Fluent.Inc/fluent6.3.26/lib/fluent.dmp.114-64"
Done.
/data/Fluent.Inc/fluent6.3.26/bin/fluent -r6.3.26 3d -pethernet -host -t4 -mpi=hp -cnf=/root/host -path/data/Fluent.Inc -cx MDS1:47715:35420
Starting /data/Fluent.Inc/fluent6.3.26/lnamd64/3d_host/fluent.6.3.26 host -cx MDS1:47715:35420 "(list (rpsetvar (QUOTE parallel/function) "fluent 3d -node -r6.3.26 -t4 -pethernet -mpi=hp -cnf=/root/host ") (rpsetvar (QUOTE parallel/rhost) "") (rpsetvar (QUOTE parallel/ruser) "") (rpsetvar (QUOTE parallel/nprocs_string) "4") (rpsetvar (QUOTE parallel/auto-spawn?) #t) (rpsetvar (QUOTE parallel/trace-level) 0) (rpsetvar (QUOTE parallel/remote-shell) 0) (rpsetvar (QUOTE parallel/path) "/data/Fluent.Inc") (rpsetvar (QUOTE parallel/hostsfile) "/root/host") )"

    Welcome to Fluent 6.3.26

    Copyright 2006 Fluent Inc.
    All Rights Reserved

Loading "/data/Fluent.Inc/fluent6.3.26/lib/flprim.dmp.1119-64"
Done.

Host spawning Node 0 on machine "MDS1" (unix).
/data/Fluent.Inc/fluent6.3.26/bin/fluent -r6.3.26 3d -node -t4 -pethernet -mpi=hp -cnf=/root/host -mport 127.0.0.1:127.0.0.1:45271:0
Starting /data/Fluent.Inc/fluent6.3.26/multiport/mpi/lnamd64/hp/bin/mpirun -TCP -f /tmp/fluent-appfile.16803
mpirun: No route to host
mpirun: Bad file descriptor

I don't know what is wrong with HP-MPI. :(

elbasharat October 22, 2010 16:02

The error is not in your MPI but in the parallel connectivity. Check your ssh or rsh setup, make sure the firewall is stopped, and run it again. Tell me if you still get an error.
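
The connectivity check suggested here can be sketched as a small script. This is only an illustration, assuming the hosts file passed to -cnf lists one machine name per line; the function names `list_hosts` and `check_host` are made up for this example. Note that a firewall may block ping while still allowing ssh, or the other way round.

```shell
#!/bin/sh
# Sketch: verify that each machine in a Fluent hosts file is reachable.
# Assumption: one hostname per line; '#' comments and blank lines are skipped.

# Print the usable hostnames from a hosts file (first field of each line).
list_hosts() {
    sed -e 's/#.*//' "$1" | awk 'NF { print $1 }'
}

# Ping a host once, then try a non-interactive ssh login.
check_host() {
    host=$1
    if ping -c 1 -W 2 "$host" >/dev/null 2>&1; then
        echo "$host: ping ok"
    else
        echo "$host: ping FAILED"
    fi
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
        echo "$host: ssh ok"
    else
        echo "$host: ssh FAILED"
    fi
}

# Usage (path taken from the original post):
#   for h in $(list_hosts /root/host); do check_host "$h"; done
```

Any host that reports a failure is worth fixing before looking at MPI itself.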

aryanet October 23, 2010 16:39

Wow, thanks! It was a problem with the firewall. But after solving that, another problem came up:

Code:

[root@MDS1 bin]# /data/Fluent.Inc/bin/fluent -g 3d -cnf=/root/host -t2
/data/Fluent.Inc/fluent6.3.26/bin/fluent -r6.3.26 -g 3d -cnf=/root/host -t2
/data/Fluent.Inc/fluent6.3.26/cortex/lnamd64/cortex.3.7.3 -f fluent -g (fluent "3d -pethernet  -host -r6.3.26 -t2 -mpi=hp -cnf=/root/host -path/data/Fluent.Inc")
Loading "/data/Fluent.Inc/fluent6.3.26/lib/fluent.dmp.114-64"
Done.
/data/Fluent.Inc/fluent6.3.26/bin/fluent -r6.3.26 3d -pethernet -host -t2 -mpi=hp -cnf=/root/host -path/data/Fluent.Inc -cx MDS1:44145:40097
Starting /data/Fluent.Inc/fluent6.3.26/lnamd64/3d_host/fluent.6.3.26 host -cx MDS1:44145:40097 "(list (rpsetvar (QUOTE parallel/function) "fluent 3d -node -r6.3.26 -t2 -pethernet -mpi=hp -cnf=/root/host ") (rpsetvar (QUOTE parallel/rhost) "") (rpsetvar (QUOTE parallel/ruser) "") (rpsetvar (QUOTE parallel/nprocs_string) "2") (rpsetvar (QUOTE parallel/auto-spawn?) #t) (rpsetvar (QUOTE parallel/trace-level) 0) (rpsetvar (QUOTE parallel/remote-shell) 0) (rpsetvar (QUOTE parallel/path) "/data/Fluent.Inc") (rpsetvar (QUOTE parallel/hostsfile) "/root/host") )"

    Welcome to Fluent 6.3.26

    Copyright 2006 Fluent Inc.
    All Rights Reserved

Loading "/data/Fluent.Inc/fluent6.3.26/lib/flprim.dmp.1119-64"
Done.

Host spawning Node 0 on machine "MDS1" (unix).
/data/Fluent.Inc/fluent6.3.26/bin/fluent -r6.3.26 3d -node -t2 -pethernet -mpi=hp -cnf=/root/host -mport 127.0.0.1:127.0.0.1:37906:0
Starting /data/Fluent.Inc/fluent6.3.26/multiport/mpi/lnamd64/hp/bin/mpirun -TCP -f /tmp/fluent-appfile.16143
HP-MPI licensed for execution of Fluent.

0: mpt_connect: error: connect failed: Connection refused

0: mpt_establish_connection: error: unable to connect: Illegal seek

0: mpt_connect: error: connect failed: Connection refused

0: mpt_establish_connection: error: unable to connect: Illegal seek

0: mpt_connect: error: connect failed: Connection refused

0: mpt_establish_connection: error: unable to connect: Illegal seek

0: mpt_connect_to_server: error: cannot establish connection; bye.: Illegal seek
MPI Application rank 0 exited before MPI_Finalize() with status 0

By the way, I have installed Fluent on both machines!

Would you help, please?

elbasharat October 24, 2010 01:45

OK.

First check that the machines have proper connectivity, using static rather than dynamic IP addresses.

Then, if you know how to configure ssh, set it up for parallel computing.

I suggest using the -ssh option in your command to run Fluent. Also set SELinux to permissive mode; it sometimes blocks connections it considers suspicious. Anyway, try the above and let me know. :)
Cheers
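
A minimal sketch for checking the SELinux part of this advice, assuming a CentOS-style system where `getenforce` is available. The helper name `selinux_mode` is made up for this example, and the commands in the comments need root.

```shell
#!/bin/sh
# Print the SELinux mode, or "unknown" when getenforce is not installed.
# getenforce prints one of: Enforcing, Permissive, Disabled.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo unknown
    fi
}

# Related commands (run as root):
#   setenforce 0             # switch to permissive until the next reboot
#   service iptables status  # show firewall rules on CentOS 5-era systems
#   service iptables stop    # stop the firewall for a test run
```

Running `selinux_mode` on every machine in the hosts file shows whether any of them is still enforcing.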

aryanet October 24, 2010 01:57

Well, I have set all the IP addresses manually, so there are no dynamic IP addresses.
I ran Fluent with the -ssh switch, but nothing changed.
SELinux is also completely disabled.

I'm really stuck... :(

elbasharat October 24, 2010 05:40

OK, then check whether Fluent is installed properly; sometimes the mpi folder is missing.

I haven't seen that kind of error before.

There must be some human error.

kondora October 24, 2010 14:37

Check that MPI is correctly installed. Run FLUENT with the -ssh option. Before that, do: cd ~; cd .ssh;
ssh-keygen -t dsa; {blank passphrase - ENTER, ENTER}; cat id_dsa.pub >> authorized_keys2; ssh 127.0.0.1; {confirm with yes}; ssh 127.0.1.1; {confirm with yes}; then check that you can ssh to 127.0.0.1 and 127.0.1.1 without typing a password. In general, search for "ssh without password".
If MPI is still not working, try: fluent 2d -ssh -mpi=intel. Hope this helps.

kondora October 24, 2010 14:45

Sorry, I didn't notice that you are using different machines. Instead of setting up passwordless ssh to localhost, try this: http://linuxproblem.org/art_9.html
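
For the multi-machine case, the key step is getting the public key into the authorized keys file on each remote machine. The sketch below shows that step as a local file operation, appending the key only if it is not already present. The function name `install_pubkey` is made up for this example; on a real cluster the same effect is usually achieved with `ssh-copy-id`.

```shell
#!/bin/sh
# Append a public key to an authorized_keys file unless it is already there.
# Usage: install_pubkey <public-key-file> <authorized_keys-file>
install_pubkey() {
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    # -q quiet, -x match the whole line, -F treat the key as a fixed string
    if ! grep -qxF -- "$(cat "$pub")" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    # With the default StrictModes setting, sshd rejects key files
    # that group or others can write.
    chmod 600 "$auth"
}

# Typical sequence on the submit machine (machine2 is a placeholder name):
#   ssh-keygen -t rsa        # press ENTER for an empty passphrase
#   ssh-copy-id machine2     # installs the key on machine2
#   ssh machine2 true        # should return without any prompt
```

Appending rather than overwriting keeps any keys that are already authorized on the target machine.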

aryanet October 24, 2010 15:13

Well, I'm sure ssh is configured properly.
But thanks for your help.

mmkkeshavarzi December 23, 2015 13:35

Hi,
Since I am new to Linux (Ubuntu), I have a similar problem.
I try to open Fluent on my machine, but I get this error:

999999: mpt_get_dot_address: warning : UNI - SERVER _ > 127.0.0.1 check your system network configuration!

999999: mpt_get_dot_address: warning : UNI - SERVER _ > 127.0.0.1 check your system network configuration!

starting / user/ansys_inc/v150/fluent/fluent15.0.0/(.....), mpirun: rsh: command not found

If anyone can help me get through this problem, I would really appreciate it.
Thanks in advance.
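
The mpt_get_dot_address warning appears to say that the machine's own hostname resolves to a loopback address. Nobody in this thread confirmed that as a cause, so treat the following only as a diagnostic sketch: it lists the names that a hosts-format file maps to 127.x.x.x, which on Ubuntu often includes the hostname itself via a 127.0.1.1 entry. The function names are made up for this example.

```shell
#!/bin/sh
# List hostnames that a hosts-format file maps to a loopback (127.x.x.x) address.
# Usage: loopback_names [file]   (defaults to /etc/hosts)
loopback_names() {
    awk '
        { sub(/#.*/, "") }               # drop comments
        $1 ~ /^127\./ {                  # loopback IPv4 entries only
            for (i = 2; i <= NF; i++) print $i
        }
    ' "${1:-/etc/hosts}"
}

# Succeed if this machine's own hostname is one of those names.
hostname_is_loopback() {
    loopback_names "${1:-/etc/hosts}" | grep -qxF -- "$(hostname)"
}

# Usage:
#   hostname_is_loopback && echo "hostname resolves to loopback via /etc/hosts"
```

If the hostname does show up there, other machines cannot reach MPI processes that advertise that address, which would matter for a multi-machine run.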

villager February 6, 2016 10:53

Your problem can probably be solved by adding the
Code:

-ssh
switch to the start command/script:
Code:

fluent -ssh ....
If you really want rsh, it has to be installed!
Code:

which rsh
should find it. Note, though, that rsh is often a symlink to ssh. I don't know whether that setup would work with FLUENT.

Using rsh on Ubuntu:
on each computing node:
Code:

sudo apt-get install rsh-server
sudo apt-get install rsh-client

and on the submit node (if it differs from the compute nodes):
Code:

sudo apt-get install rsh-client
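
The rsh check above can be taken one step further by following symlinks, which shows whether rsh is really just ssh under another name. A minimal sketch, assuming GNU `readlink` (standard on Linux); the function name `resolve_cmd` is made up for this example and is meant for executables, not shell builtins.

```shell
#!/bin/sh
# Print where a command resolves to after following symlinks; fail if absent.
# Usage: resolve_cmd <command-name>
resolve_cmd() {
    path=$(command -v "$1") || return 1
    readlink -f "$path"
}

# Usage:
#   resolve_cmd rsh || echo "rsh is not installed"
# If the printed path ends in "ssh", rsh is a symlink to ssh.
```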

villager February 6, 2016 11:05

1) The first thing to try is to run without the -cnf option.
FLUENT will then spawn all the processes you request with the -t option on your current machine:
Code:

/data/Fluent.Inc/bin/fluent -g 3d -t2
2) The second thing to try is to log in to the target machines. I suggest using ssh.

Say your machines are machine1 and machine2.
Code:

ssh machine1
(should give you a shell on this machine without(!!!) any confirmation, i.e., without a password prompt, yes/no question, and so on)
Code:

ssh machine2
(same here)
Then run via ssh with an explicit node list (for example, two processes on each machine, four in total):
Code:

/data/Fluent.Inc/bin/fluent -g 3d -t4 -ssh -cnf=machine1:2,machine2:2
3) The third thing is to use a hosts file as you did, together with the -ssh option.

You could use rsh instead of ssh everywhere, though. I haven't used it, but I think the procedure is almost the same.

Cheers, John.
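
The explicit node list from step 2 can be generated rather than typed by hand. This is a small sketch that assumes the `host:N` list syntax shown in the post and that -t takes the total process count; the function names `build_cnf` and `total_procs` are made up for this example.

```shell
#!/bin/sh
# Join hosts into "host1:N,host2:N" given N processes per host.
# Usage: build_cnf <procs-per-host> <host>...
build_cnf() {
    per_host=$1
    shift
    out=
    for h in "$@"; do
        # Add a comma separator only when $out is already non-empty.
        out="${out:+$out,}$h:$per_host"
    done
    echo "$out"
}

# Total process count: number of hosts times processes per host.
# Usage: total_procs <procs-per-host> <host>...
total_procs() {
    per_host=$1
    shift
    echo $(( per_host * $# ))
}

# Usage:
#   hosts="machine1 machine2"
#   /data/Fluent.Inc/bin/fluent -g 3d -t"$(total_procs 2 $hosts)" -ssh \
#       -cnf="$(build_cnf 2 $hosts)"
```

Deriving both values from the same host list keeps the process count and the node list from drifting apart.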

mmkkeshavarzi February 7, 2016 07:29

Thank you, Villager, my problem is solved.


Khunnie_baby June 21, 2016 10:46

Hello, I'm now having the same problem as you. How did you solve it? Would you please help me? Thank you very much!

Khunnie_baby June 21, 2016 10:47

Quote:

Originally Posted by mmkkeshavarzi (Post 584057)
Thank you, Villager, my problem is solved.

Hello, I'm now having the same problem as you. How did you solve it? Would you please help me? Thank you very much!

Khunnie_baby June 21, 2016 22:02

Quote:

Originally Posted by elbasharat (Post 280390)
The error is not in your MPI but in the parallel connectivity. Check your ssh or rsh setup, make sure the firewall is stopped, and run it again. Tell me if you still get an error.

Hi, I have the same problem with parallel connectivity. I checked ssh and rsh and stopped the firewall, but the problem still exists. Would you please help me solve it?


All times are GMT -4.