Heya,
Your setup looks OK. Try playing with the pressure solver, starting from ICCG and then varying the AMG parameters if that does not help. You should get good performance at least up to 100 CPUs. Please keep us (The Forum) posted in case there's a real problem. Hrv |
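For readers unfamiliar with the solver names: a hedged sketch of what Hrv's ICCG suggestion could look like as an fvSolution entry. In OpenFOAM's dictionary syntax, ICCG corresponds to PCG with a DIC preconditioner; the tolerances below are placeholders, not recommendations.

```
// fvSolution, solvers sub-dictionary -- illustrative values only
p PCG
{
    preconditioner  DIC;    // diagonal incomplete-Cholesky (ICCG)
    tolerance       1e-06;
    relTol          0.01;
};
```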
Hi Senthil!
Of course the other interesting question is: how big is your case? And how are your CPUs connected? If, for instance, you're running damBreak over 100 Mbit Ethernet, I'd be surprised if you saw any speedup on 6 CPUs. Bernhard |
Hi Hrv,
Thanks for the help! I'll try your suggestions and keep the forum posted. Bernhard: the mesh has 2.7 million elements (hybrid), and it's a shared-memory machine. Regards, Senthil |
What kind of shared memory machine?
|
Hi Eugene,
It's the Altix from SGI (http://www.sgi.com/products/servers/altix/). The system runs a single copy of Linux over 128 Intel Itanium 2 processors at 1.5 GHz, with 256 GB of shared memory. Thanks, Senthil |
Hi Senthil,
I have had some bad experiences with an SGI shared-memory machine where the memory is distributed across different motherboards and shared by the kernel on top of an InfiniBand network. Is this the case here? I think so, since they write that the Altix has a "Modular blade design". If so, you can see very poor performance if the computer is not configured for the software you are running... Francesco |
Hi Francesco,
The same machine demonstrated excellent speed-up for a steady-state simulation with simpleFoam: I got good speed-up on 64 processors. Senthil |
When running on Itanium it is crucial to use the Intel icc compiler and SGI's native MPI; otherwise performance will be poor.
|
Hi All,
When I increase the number of processors, the solver stalls while solving the pressure equation. I think it is a problem with GAMG. Any suggestions for changing the following parameters?

p GAMG
{
    agglomerator          faceAreaPair;
    nCellsInCoarsestLevel 100;
    cacheAgglomeration    true;
    directSolveCoarsest   true;
    nPreSweeps            1;
    nPostSweeps           2;
    nFinestSweeps         2;
    tolerance             1e-05;
    relTol                0.1;
    smoother              GaussSeidel;
    mergeLevels           1;
    minIter               0;
    maxIter               10;
};

Thanks, Senthil |
Hi All,
I have tried various permutations and combinations of the pressure-solver settings, but I could not attain any speed-up. Any suggestions? Thanks, Senthil |
Hi All,
This issue has been resolved! The key was changing nCellsInCoarsestLevel in the GAMG settings. Thanks, Senthil |
Hi Senthil,
could you elaborate a bit, please? I have run some parallel tests with icoFoam and GAMG, and the scale-up was far from perfect. Thanks in advance! |
Hi,
First, it is impossible to get perfectly linear speedup with any code. That said, icoFoam performed relatively well on multiple processors. When you increase the number of processors, each processor gets a smaller chunk of the mesh, so you need to decrease the nCellsInCoarsestLevel setting in GAMG (in the fvSolution file). You can think of this number as the number of common cells shared between two processors. Also, OpenFOAM performed better with Open MPI 1.5 than with the previous version (1.4). Thanks |
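As an illustrative sketch of that advice (the value 20 below is a made-up example, not a recommendation; tune it to your own cells-per-processor count):

```
p GAMG
{
    agglomerator          faceAreaPair;
    // Illustrative only: reduced from 100 because each rank now
    // holds far fewer cells than the undecomposed mesh did.
    nCellsInCoarsestLevel 20;
    cacheAgglomeration    true;
    smoother              GaussSeidel;
    mergeLevels           1;
    tolerance             1e-05;
    relTol                0.1;
};
```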
Thanks Senthil for the tip, but it doesn't change anything in my case.
My test was a simple cavity3D case created with blockMesh, 10M cells, default icoFoam solver with GAMG for pressure, on an Intel Xeon E5430, 200 iterations.

parallel threads - runtime
1 - 32.2 h
2 - 27.7 h
4 - 28.9 h
8 - 23.8 h |
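To put those runtimes in perspective, here is a small script (my own addition, using the numbers quoted in the post) that converts them into speedup and parallel efficiency:

```python
# Speedup S(n) = T(1)/T(n) and efficiency E(n) = S(n)/n for the
# cavity3D runtimes quoted above (threads -> wall time in hours).
runtimes_h = {1: 32.2, 2: 27.7, 4: 28.9, 8: 23.8}

t1 = runtimes_h[1]
for n in sorted(runtimes_h):
    speedup = t1 / runtimes_h[n]
    efficiency = speedup / n
    print(f"{n} threads: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Even on 8 threads the speedup is only about 1.35x, i.e. roughly 17% parallel efficiency, which supports the "far from perfect" observation.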
Hi,
Which version of OpenFOAM are you using? |
Hi,
OS: Ubuntu 11.04 OF: a month old 2.0.x |
Anton,
You are correct! I need to rephrase my wording: I meant we cannot get linear speedup with the OpenFOAM utilities (at least the ones I have used). We will run into interpolation issues if nCellsInCoarsestLevel is set to a very small number; it has to be appropriate for the total number of elements on each processor. |
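A back-of-envelope sketch of that idea (the scaling fraction and floor below are my own assumptions for illustration, not anything prescribed by OpenFOAM):

```python
# Hypothetical rule of thumb: scale nCellsInCoarsestLevel with the
# number of cells each rank owns, with a small floor so it never
# drops low enough to cause interpolation trouble.
def suggest_coarsest_level(total_cells, nprocs, fraction=1e-4, floor=10):
    """Suggest nCellsInCoarsestLevel as a small fraction of cells per rank."""
    cells_per_rank = total_cells / nprocs
    return max(floor, int(cells_per_rank * fraction))

# e.g. for the 10M-cell cavity3D case discussed above:
for nprocs in (1, 8, 64):
    print(nprocs, "->", suggest_coarsest_level(10_000_000, nprocs))
```

The point is only the trend: more ranks means fewer cells per rank, so the coarsest level should shrink accordingly.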
Personally I've always encountered lower than linear speedup as well, but some people have claimed to get close to linear or better speedup using OpenFOAM:
http://www.hpcadvisorycouncil.com/pd...M_at_Scale.pdf http://web.student.chalmers.se/group...SlidesOFW5.pdf |
Greetings to all!
I would like to add to the knowledge being shared here and point you all to the following post: Parallel processing of OpenFOAM cases on multicore processor??? post #11 - it's a bit of a rant, but several scalability observations I've made with OpenFOAM over time are written up there. As for super-linear scalability, AFAIK it's only a half-truth: it only looks super-linear when we don't take the serial timings into account. In other words, we can observe something similar in the "Report" I mention in that post, where scalability sky-rockets when we ignore the serial run and the real theoretical speedup that should be expected. I've been trying to gather as much information on this subject as possible and keep track of it in the blog post mentioned at the end of the post above, namely: Notes about running OpenFOAM in parallel. Best regards, Bruno |