EPYC 9004 memory bandwidth bottleneck threshold

#1 | July 9, 2023, 12:58 | Manuelo (Manuel), Spain
Hi everyone, my company will soon be buying a new workstation for CFD. Our main applications involve combustion with medium-to-large models (around 4-5 million cells).
After going through the hardware posts in the forum, it is clear that memory bandwidth is the main parameter to keep in mind to get the most out of the investment. Because of this, I am proposing a 2-socket system with AMD EPYC 9004 CPUs. This way, I will have 24 memory channels of DDR5-4800 (4800 MT/s). I believe this is the highest memory bandwidth that can currently be achieved in a single workstation.
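
As a rough check on the figure behind this choice, the theoretical peak bandwidth of 24 DDR5-4800 channels can be estimated as below. This is only a sketch: it assumes 8 bytes of data per transfer per channel (a 64-bit data path, ECC bits excluded), and the function name is made up for illustration. Sustained bandwidth in real solvers will be lower than this theoretical number.

```python
# Theoretical peak memory bandwidth, assuming 8 data bytes per transfer per channel.

def peak_bandwidth_gbs(sockets: int, channels_per_socket: int, mega_transfers: int) -> float:
    """Return the theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = 8  # assumption: 64-bit data path per DDR5 channel
    return sockets * channels_per_socket * mega_transfers * 1e6 * bytes_per_transfer / 1e9

per_channel = peak_bandwidth_gbs(1, 1, 4800)   # one DDR5-4800 channel
total = peak_bandwidth_gbs(2, 12, 4800)        # 2 sockets x 12 channels, as in the post

print(f"per channel: {per_channel:.1f} GB/s")
print(f"2 sockets x 12 channels: {total:.1f} GB/s")
```

Under these assumptions the system tops out at roughly 920 GB/s on paper, regardless of which CPU model sits in the sockets.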

This is the preliminary setup I have put together:

- Gigabyte Mainboard MZ73-LM0
- 2 x AMD EPYC 9554 - 3.10 GHz - 64 cores / 256 MB cache - 360W
- 24 x 16GB DDR5 ECC registered
- 1.6TB Samsung PM1735
- GeForce RTX 4070 TI - 12GB

I am not sure which CPU I should go for. There is a range of possibilities from 16 to 128 cores per CPU and different core speeds (https://www.amd.com/es/processors/epyc-9004-series). From what I have learned in the forum, there is probably a point beyond which it makes no sense to add more cores or frequency, because memory bandwidth will limit the maximum throughput for CFD calculations. The CPU I have selected is rather expensive, so I wonder whether such a powerful one is a waste of money, since I would not be able to take advantage of it. Maybe going for:

- AMD EPYC 9354 - 3.25 GHz - 32 cores / 256 MB cache - 280W

would save almost 7,000 € while giving similar performance.
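
One way to look at this trade-off is the share of theoretical memory bandwidth available to each core. The sketch below assumes 12 DDR5-4800 channels per socket at 8 bytes per transfer; the core counts come from the post. It says nothing about how much bandwidth a given solver actually needs per core, which has to come from benchmarks.

```python
# Theoretical memory bandwidth per core for the two candidate CPUs.
# Assumption: 12 DDR5-4800 channels per socket, 8 data bytes per transfer.

SOCKET_BANDWIDTH_GBS = 12 * 4800e6 * 8 / 1e9  # GB/s per socket, theoretical peak

candidates = {"EPYC 9554": 64, "EPYC 9354": 32}  # cores per socket

per_core = {name: SOCKET_BANDWIDTH_GBS / cores for name, cores in candidates.items()}

for name, share in per_core.items():
    print(f"{name}: {share:.1f} GB/s per core")
```

The 32-core part gets twice the per-core share, so if the solver saturates the memory channels before 64 cores are busy, the extra cores of the 9554 add little.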


I would appreciate any advice in this regard.

#2 | July 10, 2023, 06:25 | ym92 (Yannick)
Hi Manuel,


Recently, another forum user and I compared our 2x9474F and 2x9454 setups on a 24M-cell mesh (drivaerFastback tutorial), and there was basically no difference. I think these results indicate that the memory bandwidth bottleneck was already reached at 2x48 cores (though that is probably still faster than 2x32 cores). Hope that helps your decision.

#3 | July 10, 2023, 10:52 | Manuelo (Manuel), Spain
Hi ym92, thanks for your reply. It certainly helps a lot. Here are some rough numbers for total compute power (sockets x cores x base clock), all for 2-socket systems:

- EPYC 9354: 2 x 32 x 3.25 = 208.0 GHz
- EPYC 9454: 2 x 48 x 2.75 = 264.0 GHz
- EPYC 9474F: 2 x 48 x 3.60 = 345.6 GHz
- EPYC 9554: 2 x 64 x 3.10 = 396.8 GHz
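
The products above can be reproduced with a few lines of Python. This is a sketch only; "aggregate GHz" is the crude metric used in the post (sockets x cores x base clock), not a measure of real solver performance.

```python
# Aggregate base clock (sockets x cores x base GHz) for the four dual-socket options.

SOCKETS = 2

cpus = {  # model: (cores per socket, base clock in GHz), values from the post
    "EPYC 9354": (32, 3.25),
    "EPYC 9454": (48, 2.75),
    "EPYC 9474F": (48, 3.60),
    "EPYC 9554": (64, 3.10),
}

aggregate_ghz = {name: SOCKETS * cores * base for name, (cores, base) in cpus.items()}

for name, ghz in aggregate_ghz.items():
    print(f"{name}: {ghz:.1f} GHz")

# Ratio between the two benchmarked setups, which showed basically no difference
ratio_9474f_vs_9454 = aggregate_ghz["EPYC 9474F"] / aggregate_ghz["EPYC 9454"]
print(f"9474F / 9454: {ratio_9474f_vs_9454:.2f}x")
```

The 9474F has about 1.3 times the aggregate clock of the 9454, yet the benchmark reported in post #2 showed basically no difference, which is consistent with the memory bandwidth limit being reached.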

All of them have the same cache size (256 MB). If going from the 9454 to the 9474F does not provide any additional benefit in your benchmark, I would bet that it does not make sense to invest in the 9554, at least when it comes to CFD solver throughput.

I still wonder whether adding more physics to the calculation (my cases usually have stiff chemistry and radiation) may change the picture.

I have found some other information, and I wonder whether it is reliable enough to base a decision on. The AMD EPYC 9004 webpage (https://www.amd.com/es/processors/epyc-9004-series) lists all the available processors. Clicking on any of them takes you to a processor-specific page that includes workload affinity tags. For example, for the EPYC 9554, this is the proposed workload affinity list:

Workload Affinity
Analytics
CAE/CFD/FEA
ERM/SCM/CRM apps
High capacity data mgmt (NR/RDBMS)
HPC
VM Density

So it looks like, according to AMD's criteria, the EPYC 9554 is suited to CFD workloads. I have checked the whole processor list; the ones tagged for CFD are marked in the last column:

Model            Cores  Threads  Max boost      All-core boost  Base clock  L3 cache  TDP   Workload tag
AMD EPYC™ 9754   128    256      Up to 3.1GHz   3.1GHz          2.25GHz     256MB     360W
AMD EPYC™ 9754S  128    128      Up to 3.1GHz   3.1GHz          2.25GHz     256MB     360W
AMD EPYC™ 9734   112    224      Up to 3.0GHz   3.0GHz          2.2GHz      256MB     340W
AMD EPYC™ 9684X  96     192      Up to 3.7GHz   3.42GHz         2.55GHz     1152MB    400W  CFD
AMD EPYC™ 9384X  32     64       Up to 3.9GHz   3.5GHz          3.1GHz      768MB     320W  CFD
AMD EPYC™ 9184X  16     32       Up to 4.2GHz   3.85GHz         3.55GHz     768MB     320W  CFD
AMD EPYC™ 9654P  96     192      Up to 3.7GHz   3.55GHz         2.4GHz      384MB     360W
AMD EPYC™ 9654   96     192      Up to 3.7GHz   3.55GHz         2.4GHz      384MB     360W
AMD EPYC™ 9634   84     168      Up to 3.7GHz   3.1GHz          2.25GHz     384MB     290W
AMD EPYC™ 9554P  64     128      Up to 3.75GHz  3.75GHz         3.1GHz      256MB     360W
AMD EPYC™ 9554   64     128      Up to 3.75GHz  3.75GHz         3.1GHz      256MB     360W  CFD
AMD EPYC™ 9534   64     128      Up to 3.7GHz   3.55GHz         2.45GHz     256MB     280W
AMD EPYC™ 9474F  48     96       Up to 4.1GHz   3.95GHz         3.6GHz      256MB     360W  CFD
AMD EPYC™ 9454P  48     96       Up to 3.8GHz   3.65GHz         2.75GHz     256MB     290W  CFD
AMD EPYC™ 9454   48     96       Up to 3.8GHz   3.65GHz         2.75GHz     256MB     290W  CFD
AMD EPYC™ 9374F  32     64       Up to 4.3GHz   4.1GHz          3.85GHz     256MB     320W  Per core CFD
AMD EPYC™ 9354P  32     64       Up to 3.8GHz   3.75GHz         3.25GHz     256MB     280W
AMD EPYC™ 9354   32     64       Up to 3.8GHz   3.75GHz         3.25GHz     256MB     280W  CFD
AMD EPYC™ 9334   32     64       Up to 3.9GHz   3.85GHz         2.7GHz      128MB     210W  CFD
AMD EPYC™ 9274F  24     48       Up to 4.3GHz   4.1GHz          4.05GHz     256MB     320W  Per core CFD
AMD EPYC™ 9254   24     48       Up to 4.15GHz  3.9GHz          2.9GHz      128MB     200W  Per core CFD
AMD EPYC™ 9224   24     48       Up to 3.7GHz   3.65GHz         2.5GHz      64MB      200W
AMD EPYC™ 9174F  16     32       Up to 4.4GHz   4.15GHz         4.1GHz      256MB     320W  Per core CFD
AMD EPYC™ 9124   16     32       Up to 3.7GHz   3.6GHz          3.0GHz      64MB      200W  Per core CFD

What I see is:

- Processors with an -X suffix are proposed for CFD. I understand that the huge amount of cache in these processors is what makes them useful for CFD?
- A few processors are tagged with per-core CFD affinity. Those are the ones with a higher base clock, which I understand makes sense for non-parallelized calculations.
- For the rest of the processors, it looks like it does not make sense to go beyond the 9554.
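
To put a number on the first observation, the L3 cache available per core can be computed from the core counts and cache sizes in the list above. This is a sketch for a handful of models only; whether more cache per core translates into faster solves depends on the solver and the case.

```python
# L3 cache per core for selected models; (cores, L3 in MB) taken from the list above.

specs = {
    "9684X": (96, 1152),
    "9384X": (32, 768),
    "9184X": (16, 768),
    "9654": (96, 384),
    "9554": (64, 256),
    "9354": (32, 256),
}

l3_per_core = {model: l3_mb / cores for model, (cores, l3_mb) in specs.items()}

# Print from most to least cache per core
for model, mb in sorted(l3_per_core.items(), key=lambda item: item[1], reverse=True):
    print(f"EPYC {model}: {mb:.0f} MB L3 per core")
```

The -X parts offer three times the L3 per core of their closest non-X counterparts (24 MB vs 8 MB at 32 cores, 12 MB vs 4 MB at 96 cores), which is the property the CFD tag appears to reward.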

Regards.

#4 | March 1, 2024, 04:01 | Sharku, Spain
Quote:
Originally Posted by Manuelo View Post
AMD EPYC™ 9354P 32 64 Up to 3.8GHz 3.75GHz 3.25GHz 256MB 280W
AMD EPYC™ 9354 32 64 Up to 3.8GHz 3.75GHz 3.25GHz 256MB 280W CFD
I am wondering why the 9354 is suggested for CFD whereas the 9354P is not. Looking at the full specifications, the only apparent difference is that the former supports dual-socket configurations. I was thinking about buying a workstation with a single 9354P CPU because it is relatively cheap compared to other Zen 4 EPYC processors, and my budget does not allow for two of them. Do you think that would be a bad idea? I would use it to analyze the thermal behavior of complex electric machines using Ansys Maxwell and Ansys Fluent.

#5 | March 12, 2024, 07:49 | MangoNrFive (Philipp Wiedemer), Munich, Germany
The advantage of the 9354 is that it allows a dual-socket setup with twice the memory bandwidth, cache, and computing power, so you get about twice the compute performance for less than twice the price. That tends to be a really good deal, since performance per price usually gets worse, not better, as you spend more. You also need to consider licensing costs for the CFD software, though, as these can change the evaluation of the setup.
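
The effect of licensing on this comparison can be sketched with a simple cost-per-performance ratio. Every number below is a hypothetical placeholder chosen for illustration (hardware prices, per-core license fee, and the 1.9x scaling factor are not from this thread or from any vendor), and the function name is made up; plug in real quotes before drawing conclusions.

```python
# Cost per unit of performance, with and without per-core licensing.
# All prices and the scaling factor are HYPOTHETICAL placeholders.

def cost_per_performance(hardware_cost: float, license_per_core: float,
                         cores: int, relative_performance: float) -> float:
    """Total cost (hardware + per-core licenses) divided by relative performance."""
    return (hardware_cost + license_per_core * cores) / relative_performance

single = dict(hardware_cost=10_000, cores=32, relative_performance=1.0)  # hypothetical
dual = dict(hardware_cost=18_000, cores=64, relative_performance=1.9)    # hypothetical

# Open-source solver: no license fee
single_free = cost_per_performance(license_per_core=0, **single)
dual_free = cost_per_performance(license_per_core=0, **dual)

# Commercial solver: hypothetical fee of 1,000 per core
single_licensed = cost_per_performance(license_per_core=1_000, **single)
dual_licensed = cost_per_performance(license_per_core=1_000, **dual)

print(f"no license:   single {single_free:.0f}, dual {dual_free:.0f}")
print(f"with license: single {single_licensed:.0f}, dual {dual_licensed:.0f}")
```

With these placeholder values the dual-socket system is the better deal when the solver is free, but the ranking flips once a per-core license fee dominates the total cost, because the second socket doubles the license bill without quite doubling performance.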

If your budget is too low for a dual-socket EPYC Genoa system, I would look at a dual-socket setup from an older EPYC generation (Milan or maybe even Rome), which could give you more bang for your buck than a current-gen single-socket system.

Also look here for more info

#6 | April 10, 2024, 09:35 | the_phew (Matt)
It's strange that the 9384X isn't listed as 'Per Core CFD'. As a user of a commercial solver that charges several times the hardware cost every year just to license the solver on those cores, I determined that the 32-core 3D cache 9384X was the ideal 'Per Core CFD' CPU. The frequency-optimized 'F' variants seldom solve any faster than their non-F counterparts, but the 3D cache 'X' variants often achieve the best per-core performance.

The openbenchmarking.org OpenFOAM benchmarks show bizarre under-performance of the 9384X, but there are many similar anomalies in those benchmark results. Published benchmarks from Siemens and ANSYS almost universally show the 9384X achieving the best performance-per-core in CFD.