Seeking an overview and tips to Infiniband + Ansys (Windows 10)

Old   December 1, 2017, 10:23
Default Seeking an overview and tips to Infiniband + Ansys (Windows 10)
  #1
SLC
New Member
 
Join Date: Jul 2011
Posts: 20
Rep Power: 8
SLC is on a distinguished road
Hi,

So I've got approval for the purchase of a 32-core setup for Ansys Fluent and CFX (see this thread: Hardware check: 32 core setup for CFD).

It will be a 2-node setup, each node with dual 8-core Xeon CPUs.

In order to maximize performance and to maintain future expandability, I want to set these two nodes up with Infiniband (direct connection without a switch).

As I start delving into the world of Infiniband, I am starting to realise it can get complicated for a non-network engineer. I will, however, have to set up Infiniband on my own (without the help of my firm's IT department).

I plan on using the Mellanox MCX353A-FCBT ConnectX-3 VPI Single-Port QSFP FDR IB PCIe card. The choice of card is limited because I have to purchase new from Dell, and it's either this or a much more expensive EDR adapter.

I hope to use Windows 10 Enterprise on both of the nodes.

So far I have read that:

  • I can install these cards in a PCIe x8 slot
  • Connect them with a QSFP FDR copper cable
  • Install the Mellanox driver (update firmware if required)
  • Install Mellanox OFED for Windows (WinOF 5.35 is the latest version and lists Windows 10 as a compatible OS)
  • Run the OpenSM service on one of my nodes so the Infiniband cards can talk to each other.
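For the OpenSM part, my current plan is something like the following, run in an elevated command prompt on one node only. I'm assuming WinOF registers the subnet manager as a Windows service named "opensm"; the service name may differ between driver versions, so treat this as a sketch:

rem Set the subnet manager service to start automatically, then start it now (one node only)
sc config opensm start= auto
sc start opensm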
This is where I then get a bit lost.


  • Apparently I do not want to use IPoIB because of its higher latency and CPU overhead. I want to use the native Infiniband protocol instead (is this what is called RDMA?).
  • But if I don't use IPoIB, how do I assign IP addresses to the Infiniband cards and ensure they are on a separate subnet from my normal Ethernet connections?
  • How do I ensure that the native protocol is used? Or is that up to the MPI software to activate?


I currently do my distributed-parallel Fluent and CFX runs with HP Platform MPI 9.1.4.2. As far as I can tell, Platform MPI 9.1.4.2 does not support WinOF versions past 2.1, so I will have to switch to Intel MPI in order to use the latest version of WinOF (5.35).
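This is roughly what I expect the distributed launch to look like once Intel MPI is in place. The hosts.txt file name is just a placeholder, and I'm going from memory on the Fluent flags (-t for the total process count, -cnf= for the hosts file, -mpi= to pick the MPI, -pib to request the Infiniband interconnect), so please correct me if any of them are wrong:

rem Hypothetical 32-process launch across both nodes; hosts.txt lists the two node names
fluent 3ddp -t32 -cnf=hosts.txt -mpi=intel -pib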


I appreciate any tips and pointers you may have.

Old   December 1, 2017, 16:15
Default
  #2
Senior Member
 
Join Date: Mar 2009
Location: Austin, TX
Posts: 151
Rep Power: 11
kyle is on a distinguished road
If you're not using IPoIB, then there are no IP addresses associated with the Infiniband cards. Fluent and whatever MPI you end up using will use the ethernet connection to negotiate the RDMA connection.

It should be pretty easy. Just plug in the cable, install WinOF, fire up an OpenSM instance and start Fluent. The correct MPI library should be selected automatically.
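If you want to sanity-check the link before launching anything, WinOF ships a small diagnostic utility (vstat, if I remember the name correctly) that prints the adapter and port state:

rem Port should be reported as active once the cable is in and OpenSM is running on the fabric
vstat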

Old   December 2, 2017, 12:58
Default
  #3
SLC
New Member
 
Join Date: Jul 2011
Posts: 20
Rep Power: 8
SLC is on a distinguished road
Quote:
Originally Posted by kyle View Post
If you're not using IPoIB, then there are no IP addresses associated with the Infiniband cards. Fluent and whatever MPI you end up using will use the ethernet connection to negotiate the RDMA connection.

It should be pretty easy. Just plug in the cable, install WinOF, fire up an OpenSM instance and start Fluent. The correct MPI library should be selected automatically.
Thanks for your reply.

So is the approach in this thread the wrong way to set things up? (It looks like it uses IPoIB, because he assigns IPv4 addresses to the Infiniband cards): NEW TUTORIAL: setting 2-node cluster with infiniband (WIN7)

Old   December 4, 2017, 10:49
Default
  #4
Senior Member
 
Erik
Join Date: Feb 2011
Location: Earth (Land portion)
Posts: 708
Rep Power: 13
evcelica is on a distinguished road
You can set up IPoIB and assign IP addresses; it will just be there in addition to the native Infiniband connection. I can transfer files over my IPoIB network at double the speed of my Ethernet, but for the solver it uses the native Infiniband path, not TCP. You can check by opening Task Manager >> Networking: it will show no traffic on Ethernet or IPoIB during the solve, since everything goes over the native Infiniband connection.
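For the addresses, I just give the IPoIB adapter a static IP on its own subnet with netsh. Something along these lines, assuming the IPoIB adapter shows up under the interface name "Ethernet 3" (yours will differ) and using 10.10.10.x as an otherwise unused subnet:

rem Static address for the IPoIB interface, on a subnet the regular Ethernet side doesn't use
netsh interface ipv4 set address name="Ethernet 3" static 10.10.10.1 255.255.255.0

The second node gets 10.10.10.2 with the same mask.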

Old   December 4, 2017, 11:18
Default
  #5
SLC
New Member
 
Join Date: Jul 2011
Posts: 20
Rep Power: 8
SLC is on a distinguished road
Quote:
Originally Posted by evcelica View Post
You can set up IPoIB and assign IP addresses; it will just be there in addition to the native Infiniband connection. I can transfer files over my IPoIB network at double the speed of my Ethernet, but for the solver it uses the native Infiniband path, not TCP. You can check by opening Task Manager >> Networking: it will show no traffic on Ethernet or IPoIB during the solve, since everything goes over the native Infiniband connection.
Ah ok, cool. My nodes will already be connected over 10 GbE.

Assuming my nodes are set up with both 10 GbE and native Infiniband, but not IPoIB, do I just use the Ethernet-based hostnames/IP addresses of my nodes to start the run in CFX/Fluent, with the MPI then establishing the native connection through the Infiniband driver?

Old   December 5, 2017, 13:58
Default
  #6
Senior Member
 
Erik
Join Date: Feb 2011
Location: Earth (Land portion)
Posts: 708
Rep Power: 13
evcelica is on a distinguished road
Yes, I just use the computer names, and it uses native Infiniband.
If I specify the IP address of the IPoIB adapter, it uses TCP over Infiniband, which sucks.
If I specify the IP address of the gigabit LAN, it uses the standard gigabit network.
If I turn off the Infiniband connection and use the computer name, it uses the standard gigabit.
It checks whether each connection is available in a certain order and moves to the next one on the list. You can specify the interconnect order in the environment variables.
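I don't have my exact settings in front of me, but from the MPI documentation the relevant variables are along these lines. Treat the names and values as something to verify against your MPI version rather than as gospel:

rem Platform MPI: try the Infiniband verbs interconnect first, fall back to TCP
setx MPI_IC_ORDER "ibv:tcp"

rem Intel MPI: shared memory within a node, DAPL (Infiniband) between nodes
setx I_MPI_FABRICS "shm:dapl"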

Old   December 8, 2017, 04:17
Default
  #7
New Member
 
M-G
Join Date: Apr 2016
Posts: 17
Rep Power: 3
digitalmg is on a distinguished road
I could run Ansys Fluent with MS-MPI 8.1.1 successfully in shared-memory mode on the local machine. You cannot run it out of the box on Ansys 18.2 without some modifications:

The Intel MPI entries must be removed from the PATH environment variable (or Intel MPI uninstalled) to prevent a conflict between the MS-MPI and Intel MPI mpiexec.exe programs (see the check below).
The MS-MPI run-time must be copied manually to the folder Ansys expects.
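A quick way to see which copy of mpiexec.exe would actually be picked up is the following; the first path printed is the one that wins:

rem Lists every mpiexec.exe on the PATH, in the order Windows searches it
where mpiexec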

But I couldn't run in distributed-memory mode on the cluster, even without Infiniband.
If that works, there should be no further problem implementing Ansys Fluent on Windows 10 with the latest Mellanox Infiniband drivers.

Old   December 8, 2017, 09:38
Default
  #8
New Member
 
M-G
Join Date: Apr 2016
Posts: 17
Rep Power: 3
digitalmg is on a distinguished road
Hi guys,
I finally managed to run ANSYS Fluent with MS-MPI on a Windows 10 cluster with Mellanox ConnectX-3 Infiniband.

Don't bother looking at other MPIs, as they have not supported Infiniband on Windows for a long time.
First, check that Intel MPI is not in the PATH (environment variables), as both Intel and MS use the mpiexec.exe file name.

You could use the MPI packaged with HPC Pack 2012 R2, but then you have to run smpd manually.
It works fine without any major changes, but in the host file, never use this format:
node01:4
node02:4

Instead, use a format like the one below for MS-MPI:
node01 4
node02 4

If you want to use MS-MPI 8.1.1, there is a service called the "MS-MPI Launch Service" that can run automatically, so you don't have to start smpd yourself each time. But you have to add the "Bin" folder from "C:\Program Files\Microsoft MPI\Bin" to "C:\Program Files\ANSYS Inc\v182\fluent\fluent18.2.0\multiport\mpi\win64\ms" on every node to make it work (a copy command sketch is below).
Of course, you have to create the "ms" folder yourself.
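The copy itself can be done with something like this, run on each node (I'm assuming the Bin folder should end up as a subfolder of "ms"; adjust the target if your layout differs):

rem Copy the MS-MPI runtime into the folder Fluent looks in (robocopy creates the target folders)
robocopy "C:\Program Files\Microsoft MPI\Bin" "C:\Program Files\ANSYS Inc\v182\fluent\fluent18.2.0\multiport\mpi\win64\ms\Bin" /E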

I have not tested the MS-MPI packaged with HPC Pack 2016 yet.
Good luck.

Last edited by digitalmg; December 8, 2017 at 14:00.
