CFD Online Discussion Forums (https://www.cfd-online.com/Forums/)
-   OpenFOAM Running, Solving & CFD (https://www.cfd-online.com/Forums/openfoam-solving/)
-   -   Scaling Problems on Cluster with MVAPICH2 (https://www.cfd-online.com/Forums/openfoam-solving/172854-scaling-problems-cluster-mvapich2.html)

pilotcorky June 7, 2016 14:44

Scaling Problems on Cluster with MVAPICH2
 
3 Attachment(s)
Hi all,

We are running a custom solver implemented in foam-extend-3.1 on the Stampede supercomputer. For quite a while now, we've been trying to narrow down a really, really bad parallel scaling problem. Our installation is compiled using MVAPICH2, the only MPI library supported on Stampede as far as I can tell.

We have a test case which takes about 8 minutes to run on 16 cores. When we run the same case on 128 cores, the runtime is around 1.25 hours. I've done some profiling (I'll try to upload the results), and it looks like on the 128-core run we get stuck setting a couple of memory addresses over and over again (the function call is __intel_memset). I've tried tuning the MVAPICH2 settings and managed to get the runtime down to 45 minutes, but that's still pretty messed up.
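(For the record, the tuning consisted of playing with MVAPICH2's runtime environment variables, things along the lines of MV2_ENABLE_AFFINITY, MV2_IBA_EAGER_THRESHOLD and MV2_VBUF_TOTAL_SIZE. I'm naming these as examples of the knobs that exist, not as the exact settings that got us from 75 down to 45 minutes.)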

A different case, with a slightly larger mesh and no other real differences, scales very well on 96 cores. The case we're having issues with runs about 20,000 mesh cells per core.
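(To put numbers on the bad case: ideal scaling from 16 to 128 cores would take the 8-minute run down to about 1 minute, so 75 minutes is roughly 75x off ideal, a relative parallel efficiency of (8 x 16) / (75 x 128), about 1.3%.)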

Any insight would be appreciated; I'm completely out of ideas at this point....

Cheers,
Gabe

pilotcorky June 9, 2016 12:56

Update... Immediately after posting this, I realized that the issue only occurs when we are using mesh motion!

Anyone else experience a similar issue when running a dynamic mesh in parallel? Here's the dynamicMeshDict from the case, if it helps...

Code:

/*--------------------------------*- C++ -*----------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  1.5                                   |
|   \\  /    A nd           | Web:      http://www.OpenFOAM.org               |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    object      motionProperties;
}
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

dynamicFvMesh   dynamicMotionSolverFvMesh;

solver          laplace;

diffusivity     quadratic;

distancePatches 0 ();

frozenDiffusion yes;

// ************************************************************************* //


Thanks, and sorry if I sound like a "noob"... I am one :)

pilotcorky October 4, 2016 21:21

Thought I'd try reviving this thread one more time...

I'm still unable to figure out why the dynamicMotionSolverFvMesh library scales so poorly. It doesn't appear to be sensitive to the system, interconnect, or compiler, which was my initial suspicion.

Has anyone else experienced a similar issue with this library?

Cheers :)

blais.bruno October 5, 2016 10:34

Did you test whether the poor scaling is observed for all matrix solvers?
It seems that when you move the mesh, or at least execute one of the dynamicMotionSolverFvMesh routines, a call is made to rebuild something related to the mesh topology, and that is what is so detrimental to parallel efficiency.
I know, for example, that using dynamic mesh refinement destroys parallel performance...
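For what it's worth, update() in dynamicMotionSolverFvMesh is very thin and the heavy lifting happens inside fvMesh::movePoints(). A rough sketch, paraphrased from memory rather than copied from the foam-extend-3.1 sources, so check the dynamicFvMesh library in your tree for the real thing:

Code:

// Paraphrased sketch -- not verbatim foam-extend-3.1 code.
bool dynamicMotionSolverFvMesh::update()
{
    // Ask the motion solver (laplace, in the dict above) for new point
    // positions; for the laplace solver this means a full linear solve
    // of the point-motion equation on every mesh update.
    fvMesh::movePoints(motionPtr_->newPoints());

    // movePoints() is where the cost sits: it recomputes cell volumes,
    // face areas and mesh fluxes, and synchronises points across
    // processor patches, so every rank communicates on every update.
    return true;
}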

This is very interesting to me, so please keep us informed! Sorry I cannot give a good practical answer...


pilotcorky October 8, 2016 16:06

Quote:

Originally Posted by blais.bruno (Post 620382)
Did you test whether the poor scaling is observed for all matrix solvers?

I did not, that's a good idea... I'll report back. I can observe the scaling problem well before the time loop is even started, though.

If there wasn't a good speedup, that'd be one thing, but a tremendous slowdown like this seems suspicious... The non-"extend" version of OpenFOAM has a fix for parallel communication in its latest release that seems applicable to this issue. Thoughts?
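For context on seeing this before the time loop: our solver builds the mesh through the usual createDynamicFvMesh.H include, so the dynamicMotionSolverFvMesh and its motion solver are constructed at startup, before the time loop ever runs. If that construction already does parallel work, it would line up with where I'm seeing the slowdown.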

blais.bruno October 17, 2016 09:23

A slowdown compared to serial execution can occur.

I remember running a case with dynamic mesh refinement using the GAMG matrix solver, and the run on 8 processors took about 1.5x more time than on a single processor. Refining the mesh across multiple processors every iteration caused a dramatic decrease in performance, partly because of the amount of communication, but also because some of the pre-caching in the GAMG solver could no longer be used.
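To be concrete about the pre-caching: GAMG builds an agglomeration hierarchy for its coarse levels and can keep it between solves; once the mesh changes, that hierarchy must be rebuilt. In a typical fvSolution the relevant entries look something like this (illustrative values, not taken from any case in this thread):

Code:

p
{
    solver          GAMG;
    smoother        GaussSeidel;

    // cacheAgglomeration keeps the coarse-level hierarchy between
    // solves; as soon as the mesh moves or is refined, the hierarchy
    // is invalid and has to be rebuilt, so the caching buys nothing.
    cacheAgglomeration on;
    agglomerator    faceAreaPair;
    nCellsInCoarsestLevel 100;
    mergeLevels     1;

    tolerance       1e-07;
    relTol          0.01;
}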

In any case it is surprising, but I am just saying it can occur!

Good luck :)!


