CentOS kernel update 3.10.0-862.11.6 breaks MPI

Posted August 22, 2018, 09:10 by Michael Rath (Germany)

Dear forum users,

I just realized that updating to the newest CentOS kernel version, 3.10.0-862.11.6, breaks the MPI implementations almost completely.

With CFX 19.1 I get the following messages in the different run modes:

Intel MPI Local Parallel
works.

IBM MPI Local Parallel
doesn't work:
Code:
solver-mpi.exe: Rank 0:0: MPI_Init_thread: ibv_modify_qp(2rtr) failed
solver-mpi.exe: Rank 0:0: MPI_Init_thread: ibv_modify_qpstate() failed
solver-mpi.exe: Rank 0:0: MPI_Init_thread: Internal Error: Processes cannot connect to rdma device
solver-mpi.exe: Rank 0:1: MPI_Init_thread: ibv_modify_qp(2rtr) failed
solver-mpi.exe: Rank 0:1: MPI_Init_thread: ibv_modify_qpstate() failed
solver-mpi.exe: Rank 0:1: MPI_Init_thread: Internal Error: Processes cannot connect to rdma device
MPI Application rank 0 exited before MPI_Finalize() with status 1
An error has occurred in cfx5solve:

The ANSYS CFX solver exited with return code 1.   No results file has been
created.

Intel MPI Distributed Parallel
doesn't work:
Code:
/etc/tmi.conf: No such file or directory
/etc/tmi.conf: No such file or directory
/etc/tmi.conf: No such file or directory
/etc/tmi.conf: No such file or directory
[../../src/mpid/ch3/channels/nemesis/netmod/ofa/ofa_init.c:2099] error(22): Could not modify boot qp to RTR
[../../src/mpid/ch3/channels/nemesis/netmod/ofa/ofa_init.c:2099] error(22): Could not modify boot qp to RTR
[../../src/mpid/ch3/channels/nemesis/netmod/ofa/ofa_init.c:2099] error(22): Could not modify boot qp to RTR
[../../src/mpid/ch3/channels/nemesis/netmod/ofa/ofa_init.c:2099] error(22): Could not modify boot qp to RTR
An error has occurred in cfx5solve:

The ANSYS CFX solver exited with return code 1.   No results file has been
created.

IBM MPI Distributed Parallel
doesn't work:
Code:
cfx5remote: Rank 0:2: MPI_Init_thread: ibv_modify_qp(2rtr) failed
cfx5remote: Rank 0:2: MPI_Init_thread: ibv_modify_qpstate() failed
cfx5remote: Rank 0:2: MPI_Init_thread: Internal Error: Processes cannot connect to rdma device
cfx5remote: Rank 0:3: MPI_Init_thread: ibv_modify_qp(2rtr) failed
cfx5remote: Rank 0:3: MPI_Init_thread: ibv_modify_qpstate() failed
cfx5remote: Rank 0:3: MPI_Init_thread: Internal Error: Processes cannot connect to rdma device
cfx5remote: Rank 0:0: MPI_Init_thread: ibv_modify_qp(2rtr) failed
cfx5remote: Rank 0:0: MPI_Init_thread: ibv_modify_qpstate() failed
cfx5remote: Rank 0:0: MPI_Init_thread: Internal Error: Processes cannot connect to rdma device
An error has occurred in cfx5remote on itsm-clust827:

/ansys_inc/v191/CFX/bin/linux-amd64/ifort/solver-mpi.exe exited with return
code 1.

An error has occurred in cfx5remote on itsm-clust827:

/ansys_inc/v191/CFX/bin/linux-amd64/ifort/solver-mpi.exe exited with return
code 1.

MPI Application rank 0 exited before MPI_Finalize() with status 2
An error has occurred in cfx5remote on itsm-clust819:

/ansys_inc/v191/CFX/bin/linux-amd64/ifort/solver-mpi.exe exited with return
code 1.

An error has occurred in cfx5remote on itsm-clust819:

/ansys_inc/v191/CFX/bin/linux-amd64/ifort/solver-mpi.exe exited with return
code 1.

An error has occurred in cfx5solve:

The ANSYS CFX solver exited with return code 2.   No results file has been
created.

Going back to 18.2 solves the problem, but only for Intel MPI Distributed Parallel.
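
To compare runs across versions and run modes, the quoted messages can be tallied by failure signature. What stands out in the logs is that both IBM MPI (`ibv_modify_qp(2rtr) failed`) and Intel MPI (`Could not modify boot qp to RTR`) appear to stop at the same step, moving an RDMA queue pair to the ready-to-receive state. Below is a minimal Python sketch of such a tally. The regular expressions are copied from the messages in this post; the labels, the `summarize` function and the `SAMPLE` text are hypothetical names made up for illustration, not part of CFX or either MPI library.

```python
import re
from collections import Counter

# Patterns are taken from the messages quoted in this post.
# The labels on the left are illustrative shorthand, not official error names.
SIGNATURES = {
    "qp_to_rtr_failed": re.compile(
        r"ibv_modify_qp\(2rtr\) failed|Could not modify boot qp to RTR"
    ),
    "rdma_connect_failed": re.compile(r"Processes cannot connect to rdma device"),
    "tmi_conf_missing": re.compile(r"/etc/tmi\.conf: No such file or directory"),
}

# IBM MPI prefixes its messages with "Rank <a>:<b>:".
RANK = re.compile(r"Rank (\d+:\d+):")


def summarize(log_text):
    """Return (Counter of signature labels, sorted list of ranks mentioned)."""
    counts = Counter()
    ranks = set()
    for line in log_text.splitlines():
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
        match = RANK.search(line)
        if match:
            ranks.add(match.group(1))
    return counts, sorted(ranks)


# A few lines from the IBM MPI Local Parallel output above.
SAMPLE = """\
solver-mpi.exe: Rank 0:0: MPI_Init_thread: ibv_modify_qp(2rtr) failed
solver-mpi.exe: Rank 0:0: MPI_Init_thread: ibv_modify_qpstate() failed
solver-mpi.exe: Rank 0:0: MPI_Init_thread: Internal Error: Processes cannot connect to rdma device
solver-mpi.exe: Rank 0:1: MPI_Init_thread: ibv_modify_qp(2rtr) failed
"""

if __name__ == "__main__":
    counts, ranks = summarize(SAMPLE)
    print(dict(counts), ranks)
```

Feeding the full solver output of each run mode through `summarize` would make it easier to see whether a given setup fails on the queue pair transition or on something else.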

Is anybody else experiencing this problem, and does anyone know of a workaround? I know CentOS is only officially supported up to version 7.4, but until now it worked flawlessly on 7.6, and 7.4 has already been EOL since last year.

Does anybody know what changed in this kernel update that could break MPI?
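
For anyone who wants to check whether their own nodes run the kernel in question, here is a small Python sketch that reports the host name and running kernel release and flags the version named above. It uses only the standard library; `SUSPECT_PREFIX`, `is_suspect_kernel` and `node_report` are hypothetical names chosen for this example, and the check only compares version strings, so it says nothing about what actually changed in the kernel.

```python
import platform

# Kernel version named in this post; everything else here is illustrative.
SUSPECT_PREFIX = "3.10.0-862.11.6"


def is_suspect_kernel(release, prefix=SUSPECT_PREFIX):
    """True if `release` is the named version, with or without a suffix
    such as '.el7.x86_64'. A plain startswith() would also match
    '3.10.0-862.11.60', so the separator is checked explicitly."""
    return release == prefix or release.startswith(prefix + ".")


def node_report():
    """Collect the facts worth posting alongside a failing run."""
    release = platform.release()  # same string as `uname -r`
    return {
        "host": platform.node(),
        "kernel": release,
        "suspect": is_suspect_kernel(release),
    }


if __name__ == "__main__":
    print(node_report())
```

Running this on each node of a distributed job shows quickly whether working and failing hosts differ in kernel version.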

Regards

Michael