#1
New Member
Join Date: Dec 2018
Posts: 6
Rep Power: 6
Dear All!
In my workplace we have a new AMD-based cluster for running OpenFOAM 19.06 steady-state incompressible turbulent simulations with meshes of up to 40 million cells:
- 2x AMD Epyc 7702 (2x 64 cores)
- 256 GB DDR4 RAM
- hard disks in RAID 5
- CentOS 7.7

Now we have some problems when using many cores simultaneously. As a benchmark, I ran several copies of a single-core simpleFoam case with an airfoil mesh (500,000 tetra cells) at the same time, one copy per core. With 4 copies the test takes about 1200 s, with 128 copies about 4 hours. We also noticed very different single-core performance from core to core, and the time differences between cores grow as more cores are used. What could cause such different single-core performance?

We also ran a simpleFoam case with a mesh of about 15 million cells for 50 iterations. On 16 cores the test takes 660 s, on 32 cores 600 s.

We ran the same tests on an Intel cluster with 2x Intel Xeon Gold (28 cores in total). In the first test, the times were very similar for all cores used. Running the second case (15 million tetra mesh) on 28 cores takes about 400 s.

For now we are disappointed, because we had read about excellent multi-core performance of the AMD Epyc sockets. Does anyone have experience with OpenFOAM scalability and performance on AMD Epyc 7002?

Thank you very much!
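To put a number on the 15-million-cell timings, here is a small Python sketch (just a back-of-the-envelope calculation of my own, using the solver times quoted above) that converts them into speedup and parallel efficiency:

```python
# Back-of-the-envelope scaling check using the solver times quoted above.
# Doubling the core count from 16 to 32 should ideally halve the solver
# time; the efficiency figure shows how far the run is from that ideal.

timings = {16: 660.0, 32: 600.0}   # cores -> solver time in seconds (15M cell case)
ref_cores = 16
ref_time = timings[ref_cores]

for cores, seconds in sorted(timings.items()):
    speedup = ref_time / seconds
    ideal_speedup = cores / ref_cores
    efficiency = speedup / ideal_speedup
    print(f"{cores:3d} cores: {seconds:6.1f} s, "
          f"speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Doubling from 16 to 32 cores gives only about a 1.1x speedup, i.e. roughly 55% parallel efficiency, which already suggests the runs are hitting some shared bottleneck rather than a problem in the solver itself.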
#2
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,307
Rep Power: 44
So far, there is one Epyc Rome result in the benchmark thread. It took first place as far as dual-socket systems are concerned.
OpenFOAM benchmarks on various hardware

So in theory, such a system can be fast in OpenFOAM. In practice, performance can depend on a lot of factors. A few things you should check:
- Use test cases that are large enough. 500k cells is definitely too small for 128 cores.
- Disable SMT in the BIOS.
- Make sure the CPU clock speed is in the proper range when the system is under load, e.g. using turbostat (a quick sanity check with plain Python is sketched at the end of this post).
- Check the memory configuration. You need 16 DIMMs of DDR4-3200, populated in the correct DIMM slots.
- Check how the system distributes the threads across the cores, e.g. using htop.

You can also try a newer operating system. CentOS 8 finally switched to a 4.x kernel version, which might be better for bleeding-edge hardware like yours.

And last but not least: adjust your expectations. I would not expect much scaling beyond 64 cores, due to memory bandwidth limitations.

Edit: also, "hard disk RAID 5"... do your timing checks include meshing and I/O times, or do you only look at solver times?
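The Python sketch mentioned above (just a rough helper of my own, assuming a Linux system that reports per-core frequencies in /proc/cpuinfo, as CentOS does) summarizes the current clock of every logical core. Run it while the solver has the machine loaded; turbostat remains the more accurate tool:

```python
# Rough check of per-core clock speeds (run while the solver is loading the
# machine). Reads the "cpu MHz" fields from /proc/cpuinfo and groups them,
# so throttled or idle cores stand out. turbostat gives far more detail.

from collections import Counter

def core_clocks(path="/proc/cpuinfo"):
    clocks = []
    with open(path) as f:
        for line in f:
            if line.lower().startswith("cpu mhz"):
                clocks.append(float(line.split(":")[1]))
    return clocks

if __name__ == "__main__":
    clocks = core_clocks()
    print(f"{len(clocks)} logical cores reported")
    # Bucket to the nearest 100 MHz so small jitter does not blur the picture.
    for mhz, count in sorted(Counter(round(c, -2) for c in clocks).items()):
        print(f"~{mhz:6.0f} MHz : {count} cores")
```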
#3
New Member
Join Date: Dec 2018
Posts: 6
Rep Power: 6
Thank you for your answer! I'll check those.
I used 500k cells because I ran it as a single-core case, n copies simultaneously. My timing checks include only the solver times.
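For reference, the solver-only times can be pulled from a simpleFoam log with a few lines of Python (a minimal sketch, not an OpenFOAM utility; it parses the ExecutionTime/ClockTime lines the solver prints every time step, and a ClockTime much larger than ExecutionTime would hint at I/O or waiting):

```python
# Minimal sketch: extract solver timing from an OpenFOAM log.
# Solvers print lines such as:
#   ExecutionTime = 123.45 s  ClockTime = 125 s
# ExecutionTime is CPU time, ClockTime is wall-clock time.

import re
import sys

PATTERN = re.compile(
    r"ExecutionTime\s*=\s*([\d.]+)\s*s\s+ClockTime\s*=\s*([\d.]+)\s*s")

def solver_times(log_path):
    exec_times, clock_times = [], []
    with open(log_path) as log:
        for line in log:
            match = PATTERN.search(line)
            if match:
                exec_times.append(float(match.group(1)))
                clock_times.append(float(match.group(2)))
    return exec_times, clock_times

if __name__ == "__main__":
    exec_times, clock_times = solver_times(sys.argv[1])
    if not exec_times:
        sys.exit("no ExecutionTime lines found in the log")
    print(f"time steps logged   : {len(exec_times)}")
    print(f"final ExecutionTime : {exec_times[-1]:.1f} s (CPU)")
    print(f"final ClockTime     : {clock_times[-1]:.1f} s (wall)")
    print(f"wall/CPU ratio      : {clock_times[-1] / exec_times[-1]:.2f}")
```

Run it as `python solver_times.py log.simpleFoam` after the run has finished (the script and log names are just examples).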
#4
New Member
Leo Natan
Join Date: Dec 2019
Posts: 6
Rep Power: 5
Disabled SMT in the BIOS and everything is ok now!
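For anyone finding this thread later: a quick way to confirm that SMT really is off is to look at the thread-sibling topology the Linux kernel exposes (a minimal sketch, assuming the standard sysfs layout under /sys/devices/system/cpu):

```python
# Minimal sketch (assumes the standard Linux sysfs layout): check whether
# any logical CPU shares a physical core with another one. With SMT disabled
# in the BIOS, every core should list only itself as a sibling.

import glob

def smt_active():
    for path in glob.glob(
            "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"):
        with open(path) as f:
            siblings = f.read().strip()
        # A lone CPU number means no SMT sibling; "0,64" or "0-1" means SMT is on.
        if "," in siblings or "-" in siblings:
            return True
    return False

if __name__ == "__main__":
    print("SMT appears to be", "ENABLED" if smt_active() else "disabled")
```

`lscpu` reports the same thing directly in its "Thread(s) per core" line.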