www.cfd-online.com

Scalability of OpenFOAM

Old   December 20, 2018, 21:32
Default Scalability of OpenFOAM
  #1
Senior Member
 
Peter Shi
Join Date: Feb 2017
Location: Davis
Posts: 102
Rep Power: 9
PeterShi is on a distinguished road
Hi all,

What is your experience with the scalability of OpenFOAM? Does it scale to more than 1,000 CPUs for a large-scale simulation?

Thank you in advance.

Best regards,
Peter

Old   December 21, 2018, 03:56
Default
  #2
Cyp
Senior Member
 
Cyprien
Join Date: Feb 2010
Location: Stanford University
Posts: 299
Rep Power: 18
Cyp is on a distinguished road
Hi Peter,

I would say it depends on the kind of simulation you are interested in. You can have a look at these papers for solver-specific scalability analyses:

https://www.sciencedirect.com/science/article/pii/S0010465514003403


https://www.sciencedirect.com/science/article/pii/S0010465514002719


https://link.springer.com/article/10.1007/s11242-015-0458-0


Cheers,
Cyprien

Old   December 21, 2018, 11:17
Default
  #3
Senior Member
 
Peter Shi
Join Date: Feb 2017
Location: Davis
Posts: 102
Rep Power: 9
PeterShi is on a distinguished road
Hi Cyprien

Your papers are very useful. Thank you.

Best regards,
Peter

Old   August 26, 2019, 04:30
Default
  #4
New Member
 
Lionel GAMET
Join Date: Nov 2013
Location: Lyon
Posts: 17
Rep Power: 12
x86_64leon is on a distinguished road
Hi everybody


Has anyone already measured the intranode scalability of OpenFOAM?

We have tried this on two different machines and obtained the same kind of trend. I've attached the results on Skylake processors as a PDF. Runs are done inside a single node that contains 48 cores, so this shows intranode scalability only. Note that these tests were performed by varying the number of processors in decomposeParDict AND by reserving a full 48-core node each time, so that no other process could perturb the computations.
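
A minimal sketch of how such a sweep could be prepared, assuming one decomposeParDict is generated per core count. The list of core counts and the scotch decomposition method are assumptions for illustration; the post does not say which were used.

```python
# Sketch: build the text of a minimal decomposeParDict for each core count
# of an intranode sweep. The core counts and the "scotch" method are
# assumptions, not values taken from the post.

HEADER = """FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    object      decomposeParDict;
}
"""


def decompose_par_dict(n_subdomains, method="scotch"):
    """Return the text of a minimal decomposeParDict for n_subdomains ranks."""
    if n_subdomains < 1:
        raise ValueError("n_subdomains must be at least 1")
    return f"{HEADER}\nnumberOfSubdomains {n_subdomains};\n\nmethod {method};\n"


if __name__ == "__main__":
    # Hypothetical sweep over a 48-core node.
    for n in (1, 2, 4, 8, 12, 24, 36, 48):
        print(f"--- {n} subdomains ---")
        print(decompose_par_dict(n))
```

Each generated text would be written to system/decomposeParDict before decomposing and running the case, while the job still reserves the whole node.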



Intranode scalability is a very important measure of performance: it shows the efficiency of OpenFOAM on a given hardware architecture.



We have tested different solvers (interFoam, interIsoFoam, pimpleFoam and simpleFoam) and different grid sizes. The curves in the attached PDF show:

  • We seldom reach the theoretical speedup line, and never when the node is 100% loaded.
  • When the node is full, we reach only about 50% of the theoretical speedup. This is a bit disappointing.
  • Some solvers show very wiggly behaviour as the number of cores is varied, such as interFoam with GAMG (black empty triangles).
  • The PCG linear solver gives smoother curves than GAMG.
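
For readers who want to reproduce this kind of curve, here is a minimal sketch of how speedup and parallel efficiency can be derived from wall-clock times. The timings in the example are made up for illustration and are not the measurements from the attached PDF.

```python
def scaling_table(wall_times):
    """Map core count -> (speedup, efficiency) relative to the smallest core count.

    wall_times: dict of core count -> wall-clock seconds for the same case.
    """
    base_cores = min(wall_times)
    base_time = wall_times[base_cores]
    table = {}
    for cores in sorted(wall_times):
        speedup = base_time / wall_times[cores]  # measured speedup
        ideal = cores / base_cores               # theoretical (linear) speedup
        table[cores] = (speedup, speedup / ideal)
    return table


if __name__ == "__main__":
    # Made-up timings in seconds, for illustration only.
    timings = {1: 480.0, 24: 30.0, 48: 20.0}
    for cores, (speedup, efficiency) in scaling_table(timings).items():
        print(f"{cores:3d} cores: speedup {speedup:5.1f}, efficiency {efficiency:.0%}")
```

An efficiency of 50% on the full node corresponds to the situation described in the second bullet.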


Any comments or shared experience are very welcome.


Best
Attached Files
File Type: pdf scalabilityIntranode_allFoams_Irene.pdf (15.2 KB, 165 views)

Old   August 26, 2019, 07:01
Default
  #5
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
There are quite a few results here, albeit only for a single case: "OpenFOAM benchmarks on various hardware".
Less-than-ideal strong intranode scaling is to be expected and says nothing about the quality of the parallel implementation. CFD solvers for unstructured meshes all share the same problem: they run into a memory bandwidth bottleneck beyond 2-3 cores per memory channel.
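
The rule of thumb above can be turned into a rough estimate. This is a toy model and not a measurement: it assumes speedup grows linearly until every memory channel is saturated and is flat afterwards, with processes spread evenly over the channels. The node layout in the example (two sockets with six memory channels each) is an assumption for a 48-core Skylake node; the thread does not state it.

```python
def cores_per_channel(cores_per_node, sockets, channels_per_socket):
    """Average number of cores sharing one memory channel on a full node."""
    return cores_per_node / (sockets * channels_per_socket)


def bandwidth_bound_speedup(n_cores, total_channels, saturating_cores_per_channel):
    """Toy model: linear speedup until all channels saturate, then flat."""
    return min(n_cores, total_channels * saturating_cores_per_channel)


if __name__ == "__main__":
    # Assumed layout: 2 sockets x 6 channels, 48 cores in total.
    sockets, channels = 2, 6
    print("cores per channel:", cores_per_channel(48, sockets, channels))
    for n in (8, 24, 48):
        # Saturation at 2 cores per channel is the lower end of the quoted range.
        cap = bandwidth_bound_speedup(n, sockets * channels, 2)
        print(f"{n:2d} cores: modelled speedup {cap}")
```

Under these assumptions a full node has 4 cores per channel, which is beyond the 2-3 quoted above.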
Santiago likes this.

Old   August 26, 2019, 12:17
Default
  #6
Senior Member
 
Peter Shi
Join Date: Feb 2017
Location: Davis
Posts: 102
Rep Power: 9
PeterShi is on a distinguished road
Hello Lionel,

Thank you for sharing. While you conducted small-scale tests, I tested OpenFOAM scalability with the number of CPUs varying from 512 to 4096 on KNL nodes. The ideal speedup from 512 to 4096 is 8, but in reality it is a little below 4, so the corresponding parallel efficiency is below 50%. My mesh has 12 million cells and the solver I used is simpleFoam.
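
Written out, with $T_N$ denoting the wall-clock time on $N$ CPUs (a symbol introduced here for illustration), the numbers above give:

```latex
S = \frac{T_{512}}{T_{4096}} < 4, \qquad
S_{\mathrm{ideal}} = \frac{4096}{512} = 8, \qquad
E = \frac{S}{S_{\mathrm{ideal}}} < \frac{4}{8} = 0.5
```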

Hope it helps.

Best,
Peter

Old   August 26, 2019, 16:28
Default
  #7
New Member
 
Lionel GAMET
Join Date: Nov 2013
Location: Lyon
Posts: 17
Rep Power: 12
x86_64leon is on a distinguished road
Hi Peter,

What you have observed is absolutely normal when you use too many cores. We have measured that too. There is in fact an optimum in the number of cells per core. With too many cells per core, the run is not parallelized enough and performance is poor. With too many cores, the run is over-parallelized and most of the time is spent in communication, so performance is also poor.

I attach a curve of the global CPU time per cell per iteration plotted against the number of cells per core. You will see that there is an optimal range: too much parallelization (on the left of the curve) gives bad performance, and not enough parallelization (on the right of the curve) also gives bad performance, although the penalty is smaller.
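
A minimal sketch of how such a metric could be computed from run logs. It assumes that "global CPU time" means wall-clock time multiplied by the number of cores; that interpretation, the function names and the sample numbers are all assumptions for illustration.

```python
def cost_per_cell_per_iteration(wall_seconds, n_cores, n_cells, n_iterations):
    """Core-seconds spent per cell per iteration."""
    return wall_seconds * n_cores / (n_cells * n_iterations)


def cheapest_run(runs, n_cells, n_iterations):
    """Return (cells_per_core, cost) for the run with the lowest cost.

    runs: list of (n_cores, wall_seconds) for the same mesh and iteration count.
    """
    def cost(run):
        n_cores, wall_seconds = run
        return cost_per_cell_per_iteration(wall_seconds, n_cores, n_cells, n_iterations)

    n_cores, _ = best = min(runs, key=cost)
    return n_cells / n_cores, cost(best)


if __name__ == "__main__":
    # Made-up runs on a 1000-cell mesh for 10 iterations, for illustration only.
    runs = [(10, 100.0), (100, 20.0)]
    print(cheapest_run(runs, n_cells=1000, n_iterations=10))
```

Plotting the cost against cells per core for every run, rather than only picking the cheapest, gives the kind of curve described above.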

However, be careful: this (like speedup across nodes) says nothing about the performance INSIDE a node, which was the origin of my question.

To go a bit further, with 12 million cells on 4096 cores you only get around 3000 cells per core. Your case is then over-parallelized, so performance goes down. This explains your 50% drop in efficiency.

With 2 million cells on 48 cores, I get a minimum of about 41,667 cells per core (with all 48 cores in use), so I'm still in the good performance range!
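
The arithmetic behind these two figures, as a short sketch. The 10,000 cells-per-core floor used in the last call is a hypothetical threshold chosen only to show the calculation; the actual optimum has to be read off a measured curve for the machine and solver at hand.

```python
def cells_per_core(n_cells, n_cores):
    """Average number of cells handled by each core."""
    return n_cells / n_cores


def max_cores_for(n_cells, min_cells_per_core):
    """Largest core count that keeps at least min_cells_per_core cells per core."""
    return max(1, n_cells // min_cells_per_core)


if __name__ == "__main__":
    print(round(cells_per_core(12_000_000, 4096)))  # 12 M cells on 4096 cores
    print(round(cells_per_core(2_000_000, 48)))     # 2 M cells on 48 cores
    # Hypothetical floor of 10,000 cells per core.
    print(max_cores_for(12_000_000, 10_000))
```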

Best

Old   August 26, 2019, 16:29
Default
  #8
New Member
 
Lionel GAMET
Join Date: Nov 2013
Location: Lyon
Posts: 17
Rep Power: 12
x86_64leon is on a distinguished road
With the file ....
Attached Files
File Type: pdf scalabilityCPUPerDtPerCell_06750000cells-Bubbles3D.pdf (8.4 KB, 156 views)
Ben D. likes this.

Old   August 26, 2019, 16:41
Default
  #9
Senior Member
 
Peter Shi
Join Date: Feb 2017
Location: Davis
Posts: 102
Rep Power: 9
PeterShi is on a distinguished road
Hello Lionel,

You are absolutely right. I do realize there should be a lower bound on the number of cells per CPU for the best performance. In my case, I do not think I will go above 2048 CPUs.

I am not sure if you know Nek5000, an open-source CFD solver. It is highly scalable up to millions of CPUs, as long as the number of elements exceeds a certain value (~60).

Best regards,

Old   August 16, 2022, 02:52
Default
  #10
Senior Member
 
Josh Williams
Join Date: Feb 2021
Location: Scotland
Posts: 112
Rep Power: 5
joshwilliams is on a distinguished road
Does anyone have any idea how well OpenFOAM scales for particle-laden flows? I am aware that there is a lower bound on the number of cells per CPU for good parallel efficiency. With a low number of particles, one can assume that the parallel efficiency would be fairly similar.

I am interested in cases with relatively few cells (maybe between 500k and 2 million) but a large number of particles (let's say upwards of 50 million, reaching perhaps a maximum of one billion). Is the parallel efficiency then dominated mainly by the number of particles per CPU? Or by a combination of the number of particles and the number of cells per CPU?



Thanks,
Josh

Old   August 17, 2022, 03:52
Default
  #11
Senior Member
 
Tom Fahner
Join Date: Mar 2009
Location: Breda, Netherlands
Posts: 634
Rep Power: 32
Hi Josh,

Just a few thoughts from my side, I am also keen to learn more.

My experience with Lagrangian particles is mainly in running (ico)UncoupledKinematicParcelFoam, where there is no update on the fluid side, so scalability depends only on the number of particles. The main issue I found was that particles typically cluster in just a few cells, which limits scalability. I am not sure whether load balancing can be achieved based on the locations of the particles. As the solver ran pretty quickly without the fluid side updating, it was not really worth my effort to optimize.
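
One way to quantify the clustering problem is a load-imbalance factor over the per-processor particle counts. This is a generic sketch, not an OpenFOAM utility, and the particle counts in the example are made up.

```python
def imbalance_factor(loads):
    """Ratio of the busiest rank's load to the mean load (1.0 is perfectly balanced)."""
    mean = sum(loads) / len(loads)
    if mean <= 0:
        raise ValueError("total load must be positive")
    return max(loads) / mean


def efficiency_bound(loads):
    """Upper bound on parallel efficiency when every rank waits for the busiest one."""
    return 1.0 / imbalance_factor(loads)


if __name__ == "__main__":
    # Made-up example: all particles sit on one of four ranks.
    particles_per_rank = [1000, 0, 0, 0]
    print(imbalance_factor(particles_per_rank), efficiency_bound(particles_per_rank))
```

With all particles on one of four ranks, the particle tracking cannot run at more than 25% efficiency, however well the mesh itself is balanced.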

For coupled approaches there will be a balance between the cost of the fluid field (how complicated the model is: reactions, heat exchange and turbulence may all influence the time spent in the fluid part) and the number of particles per processor and their distribution.

There may be some optimization in using the collated decomposition method, but I never tested that. Furthermore, a manual decomposition could perhaps place more processors around the area with a lot of particles and fewer in other areas?
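
The manual decomposition idea can be illustrated with a one-dimensional toy: cut the domain into slabs along the flow direction, give each slab a weight that combines its cell and particle counts, and split the slabs into contiguous chunks of roughly equal weight. This is only a sketch of the principle, not an OpenFOAM API. The relative cost of a particle (particle_cost) is a hypothetical tuning parameter, and the result would still have to be translated into an actual decomposition.

```python
def slab_weights(cells, particles, particle_cost):
    """Work estimate per slab: cells plus particles scaled by a relative cost."""
    return [c + particle_cost * p for c, p in zip(cells, particles)]


def weighted_split(weights, n_parts):
    """Split slabs into n_parts contiguous chunks of roughly equal total weight.

    Returns the exclusive end index of each chunk. A single very heavy slab
    can leave neighbouring chunks empty; finer slabs avoid that.
    """
    total = sum(weights)
    if n_parts < 1 or total <= 0:
        raise ValueError("need n_parts >= 1 and a positive total weight")
    target = total / n_parts
    bounds, running = [], 0.0
    for i, w in enumerate(weights):
        running += w
        # Close chunks each time the running weight passes a multiple of target.
        while len(bounds) < n_parts - 1 and running >= target * (len(bounds) + 1):
            bounds.append(i + 1)
    bounds.append(len(weights))
    return bounds


if __name__ == "__main__":
    # Made-up case: 8 slabs of 100 cells, all particles in the inlet slab.
    w = slab_weights([100] * 8, [8000, 0, 0, 0, 0, 0, 0, 0], particle_cost=0.1)
    print(weighted_split(w, 2))
```

In the made-up example the inlet slab alone goes to the first processor and the remaining seven slabs go to the second, which is the "more processors where the particles are" effect described above.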

Cheers,
Tom

Old   August 17, 2022, 11:57
Default
  #12
Senior Member
 
Josh Williams
Join Date: Feb 2021
Location: Scotland
Posts: 112
Rep Power: 5
joshwilliams is on a distinguished road
Hi Tom,

Yes, the issue you describe is one I also experience. In my simulations, all of the particles are clustered at the inlet for around 20% of the total run time, then they begin to disperse evenly throughout the domain and eventually exit through the outlets. For me, dynamic load balancing would be excellent, but it is not available except in a few GitHub repositories focused on specific problems.

We have recently gained some funding to perform large-scale simulations on cloud HPC, where we aim to simulate particle counts on the order of 100 million and perhaps more. Hopefully I will get some interesting results to share with the community.

Best,
Josh
tomf likes this.

Old   August 18, 2022, 13:58
Default
  #13
Senior Member
 
Domenico Lahaye
Join Date: Dec 2013
Posts: 723
Blog Entries: 1
Rep Power: 17
dlahaye is on a distinguished road
Are these useful?

https://exafoam.eu/wp5/


https://prace-ri.eu/wp-content/uploa...-Framework.pdf

Old   August 19, 2022, 12:49
Default
  #14
Senior Member
 
Josh Williams
Join Date: Feb 2021
Location: Scotland
Posts: 112
Rep Power: 5
joshwilliams is on a distinguished road
Thanks, Domenico. The exaFoam one is helpful. I am eagerly awaiting publications from the project! I think one of the main target areas is linear equation solvers (hence the PETSc paper). I think that, with modifications to the code, it could be made much easier to port to various heterogeneous architectures. I am not sure how much they are doing for Lagrangian particle tracking, but as I said, I am very interested in any upcoming results on this from the project!

All times are GMT -4.