Performance problems on AMD Epyc cluster

#1 | crpvn | New Member | December 28, 2019, 03:58
Dear All!

At my workplace we have a new AMD-based cluster for running OpenFOAM 19.06 on steady-state, incompressible, turbulent cases with meshes of more than 40 million cells.
- 2x AMD Epyc 7702 (2x 64 cores)
- 256 GB DDR4 RAM
- hard disks in RAID 5
- CentOS 7.7

We are having problems when many cores are used simultaneously. As a benchmark, I ran several copies of a single-core simpleFoam case (airfoil mesh, 500,000 tetrahedral cells) at the same time, one copy per core.

With 4 cores in use the test takes about 1200 s; with 128 cores in use it takes about 4 hours. We also noticed that run times differ a lot from core to core.
The spread in run time between cores grows as more cores are used.
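As a rough check of what these two timings imply, the sketch below converts them into aggregate throughput. It assumes that "4 cores" and "128 cores" mean 4 and 128 simultaneous single-core runs that each take the quoted wall time; the function name `jobs_per_hour` is only illustrative.

```python
# Aggregate throughput of the "n single-core copies" benchmark.
# Assumption: every copy takes roughly the quoted wall time.

def jobs_per_hour(n_jobs: int, wall_time_s: float) -> float:
    """Completed runs per hour when n_jobs identical runs finish in wall_time_s."""
    return n_jobs * 3600.0 / wall_time_s

few = jobs_per_hour(4, 1200.0)           # 4 copies, about 1200 s each
many = jobs_per_hour(128, 4 * 3600.0)    # 128 copies, about 4 hours each

slowdown_per_job = (4 * 3600.0) / 1200.0  # how much slower each copy runs
throughput_gain = many / few              # how much more work gets done overall

print(f"4 copies:   {few:.0f} jobs/h")
print(f"128 copies: {many:.0f} jobs/h")
print(f"per-job slowdown {slowdown_per_job:.0f}x, "
      f"throughput gain {throughput_gain:.2f}x for 32x the cores")
```

Under that assumption each copy runs 12 times slower when the machine is full, and the whole node delivers only about 2.7 times the throughput with 32 times as many cores in use.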

What do you think could cause such different single-core performance?

We also ran a simpleFoam case on a mesh of about 15 million cells for 50 iterations. It takes 660 s on 16 cores and 600 s on 32 cores.
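To put the 16-core and 32-core timings in the usual strong-scaling terms, here is a minimal sketch that computes speedup and parallel efficiency relative to the 16-core run. The helper `strong_scaling` is a hypothetical name, not part of OpenFOAM.

```python
# Strong-scaling metrics for the 15-million-cell case, using the 16-core run
# as the reference point.

def strong_scaling(t_ref: float, n_ref: int, t_new: float, n_new: int):
    """Return (speedup, efficiency) of the new run relative to the reference."""
    speedup = t_ref / t_new               # how much faster the new run is
    ideal = n_new / n_ref                 # speedup expected from perfect scaling
    return speedup, speedup / ideal

speedup, efficiency = strong_scaling(660.0, 16, 600.0, 32)
print(f"speedup 16 -> 32 cores: {speedup:.2f}x")
print(f"parallel efficiency:    {efficiency:.0%}")
```

Doubling the core count gives a speedup of only 1.10, which corresponds to a parallel efficiency of 55% relative to the 16-core run.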

We ran the same tests on an Intel cluster with 2x Intel Xeon Gold (28 cores in total).
In the first test, run times were very similar across all cores used.
The second case (15 million tetrahedral cells) takes about 400 s on 28 cores.

So far we are disappointed, because we had read about excellent multi-core performance on AMD Epyc.

Does anyone have experience with OpenFOAM scalability and performance on AMD Epyc 7002?

Thank you very much!

#2 | flotus1 (Alex) | Super Moderator | Germany | December 28, 2019, 07:23
So far, there is one Epyc Rome result in the benchmark thread ("OpenFOAM benchmarks on various hardware"). It took first place among dual-socket systems.

So in theory, such a system can be fast in OpenFOAM. In practice, performance can depend on a lot of factors. A few things you should check:
- Use test cases that are large enough. 500k cells is definitely too small for 128 cores.
- Disable SMT in the BIOS.
- Make sure the CPU clock speed is in the proper range when the system is under load, e.g. using turbostat.
- Check the memory configuration. You need 16 DIMMs of DDR4-3200, populated in the correct DIMM slots.
- Check how the system distributes the threads across the cores, e.g. using htop.
- You can also try a newer operating system. CentOS 8 finally switched to a 4.x kernel version, which might be better for bleeding-edge hardware like yours.
- And last but not least: adjust expectations. I would not expect much scaling beyond 64 cores, due to memory bandwidth limitations.
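The memory points above can be illustrated with a back-of-the-envelope calculation. This is a sketch under stated assumptions: 8 memory channels per socket with one DDR4-3200 DIMM each (which is where 16 DIMMs for two sockets comes from), 8 bytes per transfer, and theoretical peak rates. Sustained bandwidth in a real solver run is lower.

```python
# Theoretical peak memory bandwidth of a dual-socket system with one
# DDR4-3200 DIMM per channel, and the share available to each core.
# Assumptions: 8 channels per socket, 64-bit (8-byte) channels, peak rates.

SOCKETS = 2
CHANNELS_PER_SOCKET = 8
TRANSFERS_PER_SECOND = 3200e6   # DDR4-3200: 3200 mega-transfers per second
BYTES_PER_TRANSFER = 8          # 64-bit wide channel

def peak_bandwidth_gb_s(sockets: int = SOCKETS,
                        channels: int = CHANNELS_PER_SOCKET) -> float:
    """Theoretical peak bandwidth in GB/s with every channel populated."""
    return sockets * channels * TRANSFERS_PER_SECOND * BYTES_PER_TRANSFER / 1e9

total = peak_bandwidth_gb_s()
print(f"system peak: {total:.1f} GB/s")
for cores in (32, 64, 128):
    print(f"{cores:3d} cores: {total / cores:.1f} GB/s per core")
```

The per-core share halves each time the number of active cores doubles, which is one way to see why a bandwidth-bound solver stops scaling well before all 128 cores are busy. Leaving channels unpopulated reduces the total in the same proportion.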

Edit: also, "hard disk RAID5"... do your timing checks include meshing and I/O times, or do you only look at solver times?

#3 | crpvn | New Member | December 30, 2019, 07:14
Thank you for your answer! I'll check those.

I used 500k cells because I ran the case on a single core, n times simultaneously.

My timing checks include only solver times.

#4 | crestang (Leo Natan) | New Member | February 17, 2020, 08:50
Disabled SMT in the BIOS and everything is ok now!
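For anyone who wants to confirm from the running system that SMT is really off, here is a minimal sketch. It assumes a Linux kernel that exposes `thread_siblings_list` under `/sys/devices/system/cpu/cpu0/topology/`; the function names are illustrative, and the check returns `None` when that file is not available.

```python
# Check whether SMT is active by looking at the hardware threads that share
# a physical core with CPU 0. With SMT off, CPU 0 has no siblings.
from pathlib import Path
from typing import List, Optional

def parse_cpu_list(text: str) -> List[int]:
    """Parse a Linux CPU list such as '0,128' or '0-3,8' into CPU numbers."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus

def smt_active(siblings_text: str) -> bool:
    """True if more than one hardware thread shares the core."""
    return len(parse_cpu_list(siblings_text)) > 1

def check_system(sysfs: Path = Path("/sys/devices/system/cpu")) -> Optional[bool]:
    """Return True/False for SMT on this machine, or None if it cannot be read."""
    path = sysfs / "cpu0" / "topology" / "thread_siblings_list"
    if not path.exists():
        return None
    return smt_active(path.read_text())

result = check_system()
print("SMT active:", "unknown" if result is None else result)
```

With SMT enabled the sibling list for CPU 0 contains two entries; with SMT disabled it contains only CPU 0 itself.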

All times are GMT -4.