
Scaling Problems on Cluster with MVAPICH2

June 7, 2016, 14:44
Scaling Problems on Cluster with MVAPICH2
#1
pilotcorky
New Member
Join Date: Jun 2016
Location: Amherst, MA
Posts: 5
Hi all,

We are running a custom solver implemented in foam-extend-3.1 on the Stampede supercomputer. For quite a while now, we've been trying to narrow down a really, really bad parallel scaling problem. Our installation is compiled using MVAPICH2, the only MPI library supported on Stampede as far as I can tell.

We have a test case which takes about 8 minutes to run on 16 cores. When we run the same case on 128 cores, the runtime is around 1.25 hours. I've done some profiling (I'll try to upload the results), and it looks like on the 128-core run we are getting really stuck setting a couple of memory addresses over and over again (the function call is __intel_memset). I've tried tuning the MVAPICH2 settings, and managed to get the runtime down to 45 minutes. But... that's still pretty messed up.
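To put numbers on how bad this is, the runtimes quoted above can be converted into speedup and parallel efficiency relative to the 16-core run. This is only a minimal sketch: the function name `scaling` is made up for illustration, and it assumes the 16-core run is a fair baseline (same case, same settings).

```python
def scaling(t_base_min, n_base, t_new_min, n_new):
    """Return (speedup, efficiency) of a run relative to a baseline run.

    speedup    = baseline runtime / new runtime
    efficiency = speedup / (n_new / n_base), i.e. the fraction of ideal scaling
    """
    speedup = t_base_min / t_new_min
    ideal_speedup = n_new / n_base
    return speedup, speedup / ideal_speedup


# Runtimes from the post: 8 min on 16 cores; 75 min (1.25 h) on 128 cores
# with default settings, 45 min after tuning MVAPICH2.
for label, minutes in [("default settings", 75.0), ("tuned MVAPICH2", 45.0)]:
    s, e = scaling(8.0, 16, minutes, 128)
    print(f"{label}: speedup {s:.3f}, efficiency {e:.1%}")
```

With these figures the 128-core run is roughly 9 times slower than the 16-core run (about 1.3% of ideal scaling), and still more than 5 times slower after tuning (about 2.2%).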

A different case scales very well on 96 cores, with a slightly larger mesh and no other real differences. The case we're having issues with runs about 20,000 mesh cells per core.

Any insight would be appreciated; I'm completely out of ideas at this point...

Cheers,
Gabe
Attached Images
File Type: png 16coreRun.png (112.7 KB, 15 views)
File Type: png 128 core showing memset time consumption.png (85.7 KB, 12 views)
File Type: png memset breakdown.png (53.6 KB, 11 views)

June 9, 2016, 12:56
#2
pilotcorky
New Member
Join Date: Jun 2016
Location: Amherst, MA
Posts: 5
Update... Immediately after posting this, I realized that the issue only occurs when we are using mesh motion!

Anyone else experience a similar issue when running a dynamic mesh in parallel? Here's the dynamicMeshDict from the case, if it helps...

Code:
/*--------------------------------*- C++ -*----------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  1.5                                   |
|   \\  /    A nd           | Web:      http://www.OpenFOAM.org               |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    object      motionProperties;
}
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

dynamicFvMesh dynamicMotionSolverFvMesh;
solver laplace;
diffusivity quadratic;


distancePatches 0 ();
frozenDiffusion yes;
// ************************************************************************* //

Thanks, and sorry if I sound like a "noob"... I am one

October 4, 2016, 21:21
#3
pilotcorky
New Member
Join Date: Jun 2016
Location: Amherst, MA
Posts: 5
Thought I'd try reviving this thread one more time...

I'm still unable to figure out why the dynamicMotionSolverFvMesh library scales so poorly. It doesn't appear to be sensitive to system / interconnect / compiler types, which is what I initially thought it might be.

Has anyone else experienced a similar issue with this library?

Cheers

October 5, 2016, 10:34
#4
blais.bruno (Bruno Blais)
Member
Join Date: Sep 2013
Location: Canada
Posts: 64
Did you test whether the poor scaling is observed for all matrix solvers?
It seems that when you move the mesh, or at least execute one of the routines of dynamicMotionSolverFvMesh, a call is made to rebuild something related to the mesh topology (or similar), and this is what is so detrimental to parallel efficiency.
I know, for example, that using dynamic mesh refinement destroys the parallelism...

This is very interesting to me, so please keep us informed! Sorry I cannot give a good practical answer...



October 8, 2016, 16:06
#5
pilotcorky
New Member
Join Date: Jun 2016
Location: Amherst, MA
Posts: 5
Quote:
Originally Posted by blais.bruno View Post
Did you test if the poor scaling is observed for all matrix solvers ?
I did not, that's a good idea... I'll report back. I can observe the scaling problem well before the time loop is even started, though.
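One way to separate startup cost from per-time-step cost is to read the timing lines from the solver logs of the 16-core and 128-core runs. The sketch below assumes the custom solver prints the usual `ExecutionTime = ... s  ClockTime = ... s` line after each time step, as the standard solvers do; the helper names and the sample log text are hypothetical.

```python
import re

# Matches the timing line that standard solvers print after each time step.
# Assumption: the custom solver prints the same line.
CLOCK_RE = re.compile(
    r"ExecutionTime\s*=\s*([0-9.eE+-]+)\s*s\s+ClockTime\s*=\s*([0-9.eE+-]+)\s*s"
)


def clock_times(log_text):
    """Return the cumulative ClockTime values (seconds) found in a log."""
    return [float(m.group(2)) for m in CLOCK_RE.finditer(log_text)]


def startup_and_step_cost(log_text):
    """Return (clock time at the end of the first step, mean later step time).

    The first value includes all setup done before the time loop plus the
    first time step, so a large value points at startup cost.
    """
    clocks = clock_times(log_text)
    if not clocks:
        raise ValueError("no ExecutionTime/ClockTime lines found in log")
    steps = [b - a for a, b in zip(clocks, clocks[1:])]
    mean_step = sum(steps) / len(steps) if steps else 0.0
    return clocks[0], mean_step


# Made-up log excerpt, for illustration only.
SAMPLE_LOG = """Time = 0.001
ExecutionTime = 40.2 s  ClockTime = 41 s

Time = 0.002
ExecutionTime = 42.2 s  ClockTime = 43 s

Time = 0.003
ExecutionTime = 44.1 s  ClockTime = 45 s
"""

startup, mean_step = startup_and_step_cost(SAMPLE_LOG)
print(f"clock time after first step: {startup} s, mean later step: {mean_step} s")
```

Running this on both logs would show whether the extra time at 128 cores is paid before the first step completes, on every step, or both.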

If there simply weren't a good speedup, that'd be one thing, but seeing a tremendous slowdown like this seems suspicious... The non-"extend" version of OpenFOAM has a fix for parallel communication in the latest release that seems applicable to this issue. Thoughts?

October 17, 2016, 09:23
#6
blais.bruno (Bruno Blais)
Member
Join Date: Sep 2013
Location: Canada
Posts: 64
A slowdown compared to serial execution can occur.

I remember running a case with dynamic mesh refinement where I used the GAMG matrix solver. I recall that running the case on 8 processors took about 1.5x more time than on a single processor. Using dynamic mesh refinement across multiple processors every iteration caused a dramatic decrease in performance, due to the extra communication but also because some of the pre-caching the GAMG solver relies on could no longer be used.

In any case, this is surprising, but I am just saying it can occur!

Good luck !



Tags
cluster, mvapich2, parallel, scaling, slow
