CFD Online Discussion Forums

NablaDyn February 4, 2016 11:31

OpenFOAM MPI doesn't work on LAN due to "missing" executables
 
Hello everybody, :)

I tried to set up a small OpenFOAM cluster over a LAN. MPI works just fine when I run it locally, say on 6 cores. But when I try to run it across the LAN, it always fails, regardless of which remote machine I try to include (I have three machines connected to a switch). The error message always says:

mpirun was unable to launch the specified application as it could not find an executable:

Executable: simpleFoam
Node: 192.168.1.2

while attempting to start process rank 6.


But OF is properly installed. I can ssh into all machines without any problems.

Can someone give me a hint or suggestion?

Thanks in advance!

Regards

NablaDyn February 6, 2016 13:39

OpenFOAM shows up in task manager but doesn't seem to compute solution
 
I managed to export the environment variables, so the run seems to start now without error messages.

But now I noticed that although the processes show up in the task managers on all the nodes in use with about 99 % CPU usage, the solution doesn't seem to be computed. The log file only contains the OpenFOAM header but no convergence history. Furthermore, the processes seem to run indefinitely.

Hope someone can help me...

Thanks in advance!

wyldckat February 6, 2016 17:26

Quick answer: Quoting from a blog post of mine: Notes about running OpenFOAM in parallel
Quote:

Is the output from mpirun (Open-MPI) only coming out at the end of the run? Check this post: mpirun openfoam output is buffered, only output at the end post #9

NablaDyn February 8, 2016 04:13

Quote:

Originally Posted by wyldckat (Post 584029)
Quick answer: Quoting from a blog post of mine: Notes about running OpenFOAM in parallel

Thank you very much for your help! Unfortunately your hint did not solve the problem. So here is what I'm trying to do in detail:

I have three nodes on a LAN with 6, 2 and 2 cores. I want to start a simpleFoam run on 4, 2 and 2 cores, respectively, using Open MPI 1.6.5. To do so I decomposed the domain using
Code:

decomposePar
into 8 subdomains without error.

The required machine file named "machine" is located within the "system" directory of the OF case and contains the following:

Code:

ihgg-ubuntu cpu=4
cluster_login@192.168.1.2 cpu=2
cfd_cluster@192.168.1.3 cpu=2

The first node is the six-core master machine. The two below are the two-core slaves. The user names are the administrator logins for each machine. I established passwordless ssh from and to all nodes within the LAN, which works perfectly fine.
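
Since the value passed to -np has to match both the number of processor directories written by decomposePar and the slots offered by the machine file, a quick consistency check can help. The following is only a minimal sketch: count_slots is a hypothetical helper name, and it assumes that each line carries an optional cpu=N or slots=N entry and that a line without one counts as a single slot.

```shell
#!/bin/sh
# count_slots FILE: sum the cpu=/slots= values of a machine file.
# Assumption: a host line without an explicit count contributes one slot.
count_slots() {
    awk '
        /^[ \t]*(#|$)/ { next }                # skip comments and blank lines
        {
            n = 1
            for (i = 2; i <= NF; i++)
                if ($i ~ /^(cpu|slots)=[0-9]+$/) { split($i, kv, "="); n = kv[2] + 0 }
            total += n
        }
        END { print total + 0 }
    ' "$1"
}

# Demo with the machine file from this post, written to a temporary file.
hostfile=$(mktemp)
cat > "$hostfile" <<'EOF'
ihgg-ubuntu cpu=4
cluster_login@192.168.1.2 cpu=2
cfd_cluster@192.168.1.3 cpu=2
EOF
count_slots "$hostfile"    # compare with -np and the number of processor* directories
rm -f "$hostfile"
```

For the file above the sum is 8, which agrees with the 8 subdomains and with -np 8.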

The mpirun command only "successfully" starts simpleFoam if I export the OF environment variables:

Code:

mpirun -x LD_LIBRARY_PATH -x PATH -x WM_PROJECT_DIR -x WM_PROJECT_INST_DIR -x WM_OPTIONS -x FOAM_LIBBIN -x FOAM_APPBIN -x MPI_BUFFER_SIZE -machinefile system/machine -np 8 -output-filename openfoam_log 'simpleFoam' -parallel &
From this point on, simpleFoam shows up in the task managers of all nodes with the correct number of processes. But when I open the fresh log file, it shows no convergence history, only a single error message.

Code:

/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  3.0.1                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : 3.0.1-119cac7e8750
Exec   : simpleFoam -parallel
Date   : Feb 08 2016
Time   : 10:08:50
Host   : "ihgg-ubuntu"
PID    : 13154
[ihgg-ubuntu][[8881,1],0][btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_connect] connect() to 10.3.140.137 failed: No route to host (113)

Is the error message caused by the connection attempt to IP 10.3.140.137?

I'm totally stuck :(.

Thanks in advance for any further suggestions!

NablaDyn March 15, 2016 05:15

I put my problem aside for quite a while. Now that I got around to reviewing the whole issue, I solved my problem. The unknown IP ("No route to host") resulted from an active WLAN connection on the remote node (client). Open MPI tried to use this connection instead of the local Ethernet IPs assigned by the machinefile in conjunction with the /etc/hosts file. Deactivating WLAN solved it partially. My entire solution procedure was as follows:
  1. Made sure all nodes are accessible via password-less ssh and the host can also be accessed in the same manner from all the client nodes.
  2. Put the line
    Code:

    source /opt/openfoam...
    into the .bashrc files on all the nodes (the OpenFOAM install directory is the same on all nodes).
  3. Use the MPI command in the following manner:
    Code:

    mpirun -hostfile <hostfile> -np <no. of processors> /opt/openfoam30/bin/foamExec icoFoam -parallel
Although this got me much further, I was then struggling with the proper case decomposition over the LAN. Thus, I ended up using "manually" distributed data: run decomposePar non-distributed on the hosting node, then distribute the <processorX> directories manually according to the node sequence and CPU counts in the Open MPI hostfile. Data access for the client nodes can then conveniently be established, e.g. by using the "Connect to Server" functionality (via GUI) of the "Files" application.
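
To work out which processor directory belongs on which node, the hostfile can be expanded into a rank-to-host list. This is only a sketch under assumptions: map_ranks is a hypothetical helper, and it assumes that ranks fill each host's slots in file order (by-slot placement) and that rank N reads the directory processorN. Check the mapping options of your own mpirun before relying on it.

```shell
#!/bin/sh
# map_ranks FILE: print "processorN host" pairs for a machine file.
# Assumption: ranks are placed by slot, host after host, in file order.
map_ranks() {
    awk '
        /^[ \t]*(#|$)/ { next }                # skip comments and blank lines
        {
            n = 1                              # assumed default: one slot per host
            for (i = 2; i <= NF; i++)
                if ($i ~ /^(cpu|slots)=[0-9]+$/) { split($i, kv, "="); n = kv[2] + 0 }
            for (j = 0; j < n; j++)
                printf "processor%d %s\n", rank++, $1
        }
    ' "$1"
}

# Demo with the machine file used earlier in this thread.
hostfile=$(mktemp)
cat > "$hostfile" <<'EOF'
ihgg-ubuntu cpu=4
cluster_login@192.168.1.2 cpu=2
cfd_cluster@192.168.1.3 cpu=2
EOF
map_ranks "$hostfile"
rm -f "$hostfile"
```

With that file, processor0 to processor3 stay on ihgg-ubuntu, processor4 and processor5 go to 192.168.1.2, and processor6 and processor7 go to 192.168.1.3. The list could then drive whatever copy tool is preferred for moving the directories into the case directory on each node.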

I consider this topic as CLOSED.

wyldckat March 19, 2016 15:43

Quick answer: Many thanks for posting your solution!

This reminded me of the following solution for disabling certain network interfaces:
Quote:

Originally Posted by pkr (Post 292700)
When using MPI_reduce, the OpenMPI was trying to establish TCP through a different interface. The problem is solved if the following command is used:
Code:

mpirun --mca btl_tcp_if_exclude lo,virbr0  -hostfile machines -np 2  /home/rphull/OpenFOAM/OpenFOAM-1.6/bin/foamExec interFoam -parallel
The above command prevents MPI from using certain network interfaces (lo and virbr0 in this case).
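
One detail worth keeping in mind with btl_tcp_if_exclude: setting it replaces Open MPI's default exclude list, so the loopback interface should stay in the list explicitly, as lo does in the command above. The sketch below only assembles and prints such a command as a dry run. build_exclude_list is a hypothetical helper, wlan0 is an assumed name for the WLAN interface, and the hostfile path and solver are taken from earlier in this thread.

```shell
#!/bin/sh
# build_exclude_list IFACE...: print a comma-separated value for
# btl_tcp_if_exclude that always starts with lo and has no duplicates.
build_exclude_list() {
    list="lo"
    for iface in "$@"; do
        case ",$list," in
            *",$iface,"*) ;;                   # already listed, skip
            *) list="$list,$iface" ;;
        esac
    done
    echo "$list"
}

# Dry run: print the command instead of executing it.
# wlan0 is an assumed interface name; check the real names with "ip link".
echo mpirun --mca btl_tcp_if_exclude "$(build_exclude_list wlan0 virbr0)" \
    -hostfile system/machine -np 8 simpleFoam -parallel
```

The opposite approach is btl_tcp_if_include, which names only the interface to use; the two parameters are mutually exclusive, so only one of them should be set.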


NablaDyn March 21, 2016 06:26

Thanks for your hint wyldckat. Actually I have tried this approach without success. But maybe because of a typo or such XD...

Rvadrabade April 21, 2018 09:53

How do I set up password-less ssh on Ubuntu 16.04 LTS?
Also, I have different IP addresses that are not similar to 192.x.x.x. How can I then use the CPUs of the other systems? Any suggestions?

wyldckat April 22, 2018 13:28

Quick answers:
Quote:

Originally Posted by Rvadrabade (Post 689755)
How do I set up password-less ssh on Ubuntu 16.04 LTS?

Search online for the following phrase:
Code:

How To Set Up SSH Keys
Quote:

Originally Posted by Rvadrabade (Post 689755)
Also, I have different IP addresses that are not similar to 192.x.x.x. How can I then use the CPUs of the other systems? Any suggestions?

Yes, you can, but the machines should be accessible on the same network, i.e. you should be able to ping them, e.g.:
Code:

ping 192.1.2.3
Edit your "/etc/hosts" file and add an entry for each machine, on each machine, so that you can ping/ssh/mpirun them by name, instead of IP address.

