CFD Online Discussion Forums


javier2098 May 8, 2018 17:44

OpenFOAM Parallel Run Problem
 
Hello everybody,

I'm trying to run engineFoam in parallel on two machines (32 cores in total). On both of them I have installed OpenFOAM 5, set up a shared file system (NFS), the same username, and passwordless SSH, and I can run the case normally on each machine on its own.

I've been digging through this forum and have tried all the solutions I found, but nothing seems to do the trick. The problem is that when I launch the case like this:
Code:

mpirun --hostfile /home/halfblood/nfsshare/nodes -np 4 engineFoam -parallel
My hostfile is:
Code:

192.xxx.xx.61 cpu = 2
192.xxx.xx.62 cpu = 2
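
(As an aside, the Open MPI documentation writes the per-host slot count with the slots keyword and no spaces around the equals sign; an equivalent hostfile in that form would look like this:)
Code:

192.xxx.xx.61 slots=2
192.xxx.xx.62 slots=2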

It simply starts and shows this message:
Code:

--------------------------------------------------------------------------
[[48743,1],0]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:

Module: OpenFabrics (openib)
  Host: mech-02

Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  5.x                                   |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : 5.x-197d9d3bf20a
Exec  : engineFoam -parallel
Date  : May 08 2018
Time  : 11:46:29
Host  : "mech-02"
PID    : 28023
I/O    : uncollated
[mech-02:28020] 3 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
[mech-02:28020] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

The cores seem to be running at full capacity, two on each machine, but in the processor* directories nothing changes (they stay the same size). This also happens if I use 16 or 32 cores.
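
(In case it helps narrow things down, a quick sanity check is to launch a trivial command through the same hostfile; the output shows which node each rank is actually placed on. If both machines appear twice, the placement itself is fine.)
Code:

mpirun --hostfile /home/halfblood/nfsshare/nodes -np 4 hostname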

I have tried to exclude the other network interfaces:
Code:

mpirun --mca btl_tcp_if_exclude 'eno1, eno2, lo, ens1f0, docker0, ens1f0' --hostfile /home/halfblood/nfsshare/nodes -np 16 engineFoam -parallel
The exit interface is ens1f1. Same problem.

I have also tried to modify the decomposeParDict:
Code:

numberOfSubdomains 4;

//method          simple;
method          scotch;

//simpleCoeffs
//{
//    n              (2 2 1);
//    delta          0.001;
//}
distributed yes;

roots 3(
"/home/halfblood/nfsshare/OpenFOAM/halfblood-5.0/run/engineFoam/kivaTest"
"/home/halfblood/nfsshare/OpenFOAM/halfblood-5.0/run/engineFoam/kivaTest"
"/home/halfblood/nfsshare/OpenFOAM/halfblood-5.0/run/engineFoam/kivaTest"
)

On both machines I can access those files, and they are in the same directory. Same problem.
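
From what I understand, the distributed and roots entries are only meant for the situation where each node keeps its own local copy of the case; since my case directory is on the shared NFS path and visible at the same location on both machines, a minimal decomposeParDict without those entries should also work (sketch):
Code:

numberOfSubdomains 4;

method          scotch;

// distributed/roots omitted: the case sits on the shared NFS mount,
// so every node reads and writes the same processor* directories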

The case I'm running is the engineFoam tutorial (kivaTest).

Thanks in advance for taking the time to help me.

godfatherBond November 27, 2019 00:01

Could you limit the communication to a specific interface and check:
1. Check the network interfaces of the nodes using:
Code:

ssh <NODE> netstat -nr
2. If the output from point 1 lists interfaces like eth0, eth1, eno1, ..., then limit the communication to one interface using:
Code:

--mca btl_tcp_if_include eth0
Try it and see if this works.
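
For example (assuming, from your first post, that the node-to-node interface is ens1f1), something along these lines:
Code:

mpirun --mca btl_tcp_if_include ens1f1 --hostfile /home/halfblood/nfsshare/nodes -np 4 engineFoam -parallel
Adding --mca btl ^openib should also silence the OpenFabrics warning, since it tells Open MPI not to try the openib transport at all.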

