Heya,
Your setup looks OK. Try playing with the pressure solver, starting from ICCG and then varying the AMG parameters if that does not help. You should get good performance at least up to 100 CPUs. Please keep us (The Forum) posted in case there's a real problem. Hrv |
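For readers unfamiliar with the solver names: a hedged sketch of what Hrv's ICCG suggestion could look like as an fvSolution entry. In OpenFOAM's dictionary syntax, ICCG corresponds to PCG with a DIC preconditioner; the tolerances below are placeholders, not recommendations.

```
// fvSolution, solvers sub-dictionary -- illustrative values only
p PCG
{
    preconditioner  DIC;    // diagonal incomplete-Cholesky (ICCG)
    tolerance       1e-06;
    relTol          0.01;
};
```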
Hi Senthil!
Of course the other interesting question is: how big is your case? And how are your CPUs connected? If, for instance, you're running damBreak over 100 Mbit Ethernet, I'd be surprised if you saw any speedup on 6 CPUs. Bernhard |
Hi Hrv,
Thanks for the help! I'll try your suggestions and keep the forum posted. Bernhard: the mesh has 2.7 million elements (hybrid), and it's a shared-memory machine. Regards, Senthil |
What kind of shared memory machine?
|
Hi Eugene,
It's the Altix from SGI (http://www.sgi.com/products/servers/altix/). The system runs a single copy of Linux over 128 Intel Itanium 2 processors at 1.5 GHz, with 256 GB of shared memory. Thanks, Senthil |
Hi Senthil,
I have had some bad experiences with an SGI shared-memory machine where the memory is distributed across different motherboards and shared by the kernel on top of an InfiniBand network. Is this the case here? I think so, since they write that the Altix has a "Modular blade design". If so, you can see very poor performance if the computer is not configured for the software you are running... Francesco |
Hi Francesco,
The same machine demonstrated excellent speed-up for a steady-state simulation with simpleFoam: I got good speed-up on 64 processors. Senthil |
When running on Itanium it is crucial to use the Intel icc compiler and SGI's native MPI; otherwise performance will be poor.
|
Hi All,
When I increase the number of processors, the solver stalls while solving the pressure equation. I think it is a problem with GAMG. Any suggestions for changing the following parameters?

p GAMG
{
    agglomerator          faceAreaPair;
    nCellsInCoarsestLevel 100;
    cacheAgglomeration    true;
    directSolveCoarsest   true;
    nPreSweeps            1;
    nPostSweeps           2;
    nFinestSweeps         2;
    tolerance             1e-05;
    relTol                0.1;
    smoother              GaussSeidel;
    mergeLevels           1;
    minIter               0;
    maxIter               10;
};

Thanks, Senthil |
Hi All,
I have tried various permutations and combinations of the pressure-solver settings, but I could not attain any speed-up. Any suggestions? Thanks, Senthil |
Hi All,
This issue has been resolved! The key was changing nCellsInCoarsestLevel in the GAMG settings. Thanks, Senthil |
Hi Senthil,
could you elaborate a bit, please? I have run some parallel tests with icoFoam and GAMG, and the scale-up was far from perfect. Thanks in advance! |
Hi,
First, it is impossible to get perfectly linear speedup with any code. That said, icoFoam performed relatively well on multiple processors. When you increase the number of processors, each processor gets a smaller chunk of the mesh, so you need to decrease the nCellsInCoarsestLevel setting in GAMG (in the fvSolution file). You can think of this number as the number of common cells shared between two processors. Also, OpenFOAM performed better with Open MPI 1.5 than with the previous version (1.4). Thanks |
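As an illustrative sketch of that advice (the value 20 below is a made-up example, not a recommendation; tune it to your own cells-per-processor count):

```
p GAMG
{
    agglomerator          faceAreaPair;
    // Illustrative only: reduced from 100 because each rank now
    // holds far fewer cells than the undecomposed mesh did.
    nCellsInCoarsestLevel 20;
    cacheAgglomeration    true;
    smoother              GaussSeidel;
    mergeLevels           1;
    tolerance             1e-05;
    relTol                0.1;
};
```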
Thanks Senthil for the tip, but it doesn't change anything in my case.
My test was a simple cavity3D case created with blockMesh, 10M cells, default icoFoam solver with GAMG for pressure, on an Intel Xeon E5430, 200 iterations.

parallel threads - runtime
1 - 32.2 h
2 - 27.7 h
4 - 28.9 h
8 - 23.8 h |
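To put those runtimes in perspective, here is a small script (my own addition, using the numbers quoted in the post) that converts them into speedup and parallel efficiency:

```python
# Speedup S(n) = T(1)/T(n) and efficiency E(n) = S(n)/n for the
# cavity3D runtimes quoted above (threads -> wall time in hours).
runtimes_h = {1: 32.2, 2: 27.7, 4: 28.9, 8: 23.8}

t1 = runtimes_h[1]
for n in sorted(runtimes_h):
    speedup = t1 / runtimes_h[n]
    efficiency = speedup / n
    print(f"{n} threads: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Even on 8 threads the speedup is only about 1.35x, i.e. roughly 17% parallel efficiency, which supports the "far from perfect" observation.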
Hi,
Which version of OpenFOAM are you using? |
Hi,
OS: Ubuntu 11.04 OF: a month old 2.0.x |
Anton,
You are correct! I need to rephrase my wording: I meant we cannot get linear speedup with the OpenFOAM utilities (at least the ones I have used). We will run into interpolation issues if nCellsInCoarsestLevel is set to a very small number; it has to be appropriate for the total number of elements on each processor. |
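A back-of-envelope sketch of that idea (the scaling fraction and floor below are my own assumptions for illustration, not anything prescribed by OpenFOAM):

```python
# Hypothetical rule of thumb: scale nCellsInCoarsestLevel with the
# number of cells each rank owns, with a small floor so it never
# drops low enough to cause interpolation trouble.
def suggest_coarsest_level(total_cells, nprocs, fraction=1e-4, floor=10):
    """Suggest nCellsInCoarsestLevel as a small fraction of cells per rank."""
    cells_per_rank = total_cells / nprocs
    return max(floor, int(cells_per_rank * fraction))

# e.g. for the 10M-cell cavity3D case discussed above:
for nprocs in (1, 8, 64):
    print(nprocs, "->", suggest_coarsest_level(10_000_000, nprocs))
```

The point is only the trend: more ranks means fewer cells per rank, so the coarsest level should shrink accordingly.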
Personally I've always encountered lower than linear speedup as well, but some people have claimed to get close to linear or better speedup using OpenFOAM:
http://www.hpcadvisorycouncil.com/pd...M_at_Scale.pdf http://web.student.chalmers.se/group...SlidesOFW5.pdf |
Greetings to all!
I would like to add to the knowledge being shared here and point you all to the following post: Parallel processing of OpenFOAM cases on multicore processor??? post #11 - it's a bit of a rant, but several scalability observations I've made with OpenFOAM over time are written up there. As for super-linear scalability, AFAIK it's only a half-truth: it only looks super-linear when we don't take the serial timings into account. In other words, we can observe something similar in the "Report" I mention in that post, where scalability sky-rockets when we ignore the serial run and the real theoretical speedup that should be expected. I've been trying to gather as much information on this subject as possible and keep track of it in the blog post mentioned at the end of the post above, namely: Notes about running OpenFOAM in parallel. Best regards, Bruno |