Home > Forums > Software User Forums > REEF3D

Runs on multiple nodes are 2x slower than on one node


January 16, 2024, 14:34
Runs on multiple nodes are 2x slower than on one node
New Member
Ivan Pombo
Join Date: Jan 2024
Posts: 1
Rep Power: 0
Dear All,

We have been testing locally with the standard GitHub installation of REEF3D on Linux machines. We have been using the "Flow around a Circular Pier" tutorial with the CFD solver to test runs under different computational configurations.

We have tested running this simulation in two different settings:
- single node: single Intel Xeon machine with 30 CPU cores.
- two nodes: two Intel Xeon machines, each with 30 CPU cores, summing up to 60 cores.

Running the same simulation in each setting, changing only the M 10 parameter, we obtain an absurdly long computational time with the two-node setting.

In practice, the simulation took 1h30min on a single node, while on two nodes it is taking more than four hours. While we did not expect it to be faster, we did not expect it to be significantly slower either. So we suspect that our installation procedure somehow fails to handle simulations running on multiple nodes.

To clarify, we have tried smaller machines and different REEF3D simulations and observed the same behavior. Moreover, we have tested the cluster with OpenFOAM, and everything seems to work as expected there.

We are using Apptainer, launching the MPI job from outside the container across all the machines, and one of the nodes hosts an NFS server.
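For reference, a minimal sketch of how such a launch might look, assuming OpenMPI and hypothetical file names (`hosts.txt`, `reef3d.sif`); this is not the exact command we use, but explicit core binding is worth checking, since unbound ranks migrating between sockets can also degrade multi-node performance:

```shell
# Hybrid launch: the host MPI starts one container process per rank.
# hosts.txt lists both nodes; names and paths here are examples only.
mpirun --hostfile hosts.txt -np 60 --map-by core --bind-to core \
    apptainer exec reef3d.sif REEF3D
```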

Any help is appreciated to figure this out! Thanks in advance!!

January 16, 2024, 15:40
New Member
Alexander Hanke
Join Date: Dec 2023
Location: Trondheim
Posts: 25
Rep Power: 3
Each case has a cell count per partition below which the CFD calculations take less time than the communication between partitions. CPU communication speed usually follows this order: within a CCD > within a CPU > across sockets >>> across nodes.
So you can either use just one CPU, or even only part of one (e.g. SLURM allows partial usage of a node), or increase the number of cells by decreasing the grid spacing if you want to fully utilise your machines.
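A rough back-of-the-envelope sketch of why this happens, assuming roughly cubic subdomains with one ghost-cell layer per face (the cell count is made up for illustration): as partitions shrink, the halo-to-interior ratio grows, so more of each step is communication.

```python
def halo_ratio(total_cells: int, partitions: int) -> float:
    """Ratio of ghost (communication) cells to interior (computation)
    cells for an idealised cubic subdomain with one ghost layer per face."""
    cells_per_part = total_cells / partitions
    side = cells_per_part ** (1.0 / 3.0)   # edge length of the cubic subdomain
    halo = 6 * side * side                 # one ghost layer on each of 6 faces
    return halo / cells_per_part

# A coarse tutorial grid spread over 60 ranks spends proportionally more
# time communicating than the same grid on 30 ranks.
for ranks in (30, 60):
    print(ranks, round(halo_ratio(100_000, ranks), 2))
```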

Hope that helps you.
Alexander Hanke

January 17, 2024, 10:19
Super Moderator
Hans Bihs
Join Date: Jun 2009
Location: Trondheim, Norway
Posts: 377
Rep Power: 17
Hi Ivan,

the tutorial cases have quite coarse resolution, so possibly the total cell count is too low to ensure good scaling. Can you try with a larger number of cells (ca. 10 000 cells per core at a minimum)? How many cells are you running OpenFOAM with?
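The rule of thumb above (~10 000 cells per core) can be turned into a quick sanity check; the threshold is taken from the post, not a hard limit:

```python
def min_cells(cores: int, cells_per_core: int = 10_000) -> int:
    """Smallest total cell count at which `cores` ranks are worth using,
    given a rule-of-thumb minimum of cells per core."""
    return cores * cells_per_core

print(min_cells(30))  # -> 300000: below this, one 30-core node is not saturated
print(min_cells(60))  # -> 600000: below this, a second node is unlikely to help
```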
Hans Bihs

February 7, 2024, 02:34
Felix S.
Join Date: Feb 2021
Location: Germany, Braunschweig
Posts: 85
Rep Power: 6
Hey there,

I just wanted to add something I experienced on an HPC system I ran REEF3D on. Running REEF3D with OpenMPI compiled with gcc resulted in poor scaling, similar to your findings.

However, after compiling REEF3D and HYPRE with icc and Intel MPI, scaling was fine. Perhaps recompiling with a different compiler and MPI stack might solve your problem?
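Before recompiling, it may be worth confirming which compiler and MPI stack the binary was actually built against. These are generic checks, not REEF3D-specific; the binary name is taken from the thread and the path is an assumption:

```shell
# Which MPI implementation launches the job (Open MPI vs MPICH/Intel MPI)
mpirun --version

# Underlying compiler behind the wrapper: --showme is Open MPI syntax,
# -show is the MPICH/Intel MPI equivalent
mpicc --showme 2>/dev/null || mpicc -show

# Which MPI library the binary actually links against at runtime
ldd ./REEF3D | grep -i mpi
```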

Good luck anyway!

Last edited by Felix_Sp; February 7, 2024 at 12:43.


Tags: mpi, multiple nodes, reef3d
