Some fundamental questions on the hardware selection

Habib-CFD · February 2, 2020, 22:54

Hi friends,

Assume the RAM is a compromise between latency and frequency, e.g. CL14 at 3000 MHz, and the cores have the highest available IPC (Zen 2 or Cascade Lake-X) at a fixed frequency such as 3 GHz. Let us ignore OS compatibility and math libraries.

I am very interested to know: how many cores can be fed by each memory channel in a CFD simulation?

Is it true that this ratio is 3 for an Intel CPU but 4 for AMD?
What about other parameters that affect this ratio, such as:
Core frequency and the cell count of the simulation.

Thank you.

Simbelmynë · February 3, 2020, 02:27

There is no direct answer to your question.

If you look at the results in the benchmark thread, you will see that most systems keep improving up to approx. 3 cores per memory channel. However, above 2 cores per memory channel there are serious diminishing returns, except at very high core counts, where each individual core might run at a lower frequency.

There is no difference between Intel and AMD with regard to the above.

flotus1 · February 6, 2020, 15:09

Quote:

Assume the RAM is a compromise between latency and frequency, e.g. CL14 at 3000 MHz

Not sure if this is what you have in mind, but there is no trade-off between frequency and latency. Sure, CL and other memory latencies have been steadily increasing along with memory frequency. But these latency labels indicate clock cycles, not time.
I.e. DDR4-2133 CL10 has, at least in theory, the same latency (as in access time) as DDR4-4266 CL20.
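
To put numbers on the cycles-versus-time point, here is a minimal Python sketch. It assumes the usual DDR convention that the memory clock runs at half the data rate, so one clock cycle lasts 2000 / (data rate in MT/s) nanoseconds. The function name `cas_latency_ns` is made up for this example, and it only looks at CAS latency, not the other timings.

```python
def cas_latency_ns(cl_cycles, data_rate_mts):
    """Convert a CAS latency given in clock cycles to nanoseconds.

    DDR memory transfers data twice per clock, so the clock frequency in MHz
    is data_rate_mts / 2 and one cycle takes 2000 / data_rate_mts nanoseconds.
    """
    return cl_cycles * 2000.0 / data_rate_mts


# The two kits from the post above: both come out at about 9.38 ns.
print(f"DDR4-2133 CL10: {cas_latency_ns(10, 2133):.2f} ns")
print(f"DDR4-4266 CL20: {cas_latency_ns(20, 4266):.2f} ns")

# The CL14 at 3000 MHz example from the question: about 9.33 ns.
print(f"DDR4-3000 CL14: {cas_latency_ns(14, 3000):.2f} ns")
```

So doubling both the data rate and the CL number leaves the access time unchanged, at least on paper.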

Quote:

how many cores can be fed by each memory channel in a CFD simulation?

Well, it depends.

Here is a quick refresher about machine balance and code balance: https://www.cc.gatech.edu/~echow/ipc...-proclevel.pdf
Machine balance can be determined from the specs of a CPU and its memory subsystem. Code balance depends on the CFD package, solver and settings.
The simplest example here: run the same case in single and double precision. The latter will require almost twice the amount of memory bandwidth per FLOP, leading to an earlier decline in scaling.
Equally important here: where do you draw the line? Real-world CFD codes do not behave like a roofline model, so there is usually some benefit to be had by adding more cores.
Is it ok to get 10% more performance by using twice the number of cores? Then your limit might be 6-8 cores per channel.
Do you pay thousands of dollars each year for a limited per-core license? Then your limit will be closer to 2 cores per channel.
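
For reference, the idealized roofline picture that real codes deviate from can be sketched in a few lines of Python. Everything numeric here is an assumption for illustration only: the sustained per-core rate and the code balance values are hypothetical, not measurements from any CFD package, and double precision is simplified to exactly twice the single-precision code balance. Only the channel bandwidth follows from the DDR4-3000 example in the question (3000 MT/s times 8 bytes per transfer).

```python
def roofline_gflops(cores, per_core_gflops, bandwidth_gb_s, code_balance):
    """Idealized roofline: performance is capped by compute or by memory traffic.

    code_balance is the memory traffic the code needs, in bytes per FLOP.
    """
    compute_limit = cores * per_core_gflops
    memory_limit = bandwidth_gb_s / code_balance  # GB/s divided by bytes/FLOP
    return min(compute_limit, memory_limit)


def saturation_cores(per_core_gflops, bandwidth_gb_s, code_balance):
    """Core count at which the roofline model hits the memory limit."""
    return bandwidth_gb_s / (code_balance * per_core_gflops)


CHANNEL_GB_S = 24.0      # one DDR4-3000 channel: 3000 MT/s * 8 bytes
PER_CORE_GFLOPS = 3.0    # hypothetical sustained rate of one core
BALANCE_SINGLE = 2.0     # hypothetical code balance, bytes per FLOP
BALANCE_DOUBLE = 4.0     # assumed to be twice the single-precision value

for label, balance in (("single", BALANCE_SINGLE), ("double", BALANCE_DOUBLE)):
    n_sat = saturation_cores(PER_CORE_GFLOPS, CHANNEL_GB_S, balance)
    print(f"{label} precision: model saturates at {n_sat:.1f} cores per channel")
```

In this model nothing is gained beyond the saturation point. Real codes keep gaining a little, which is exactly why the answer depends on how much you are willing to pay for those last few percent.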

Quote:

Is it true that this ratio is 3 for an Intel CPU but 4 for AMD?

While there is a point to be made that AMD's most recent CPUs deliver better scaling than Intel CPUs, I think this assumption has different roots.
1. Prices and product segmentation
For as long as I can remember, Intel has demanded a hefty premium for each increase in core count. So choosing the right core count was more important, at least for price-sensitive buyers.
AMD on the other hand, especially with their Epyc lineup, delivered cheap cores. So why not just get the 24-core CPU instead of the 16-core? The price still seems reasonable.
2. Licenses and per-core performance
At least during the days of first-gen Epyc, Intel still had the lead in terms of per-core performance. This is the metric you are interested in when on an expensive per-core license. So my recommendations in this situation were still usually Intel, with a lower number of cores per memory channel. AMD Epyc on the other hand was the better choice for free and open source software. You can use as many cores as you want, so why not take a few more? They are cheap.

Quote:

What about other parameters that affect this ratio, such as:
Core frequency and the cell count of the simulation.

CPU frequency changes the machine balance. Peak FLOPS go up, while memory bandwidth stays roughly the same. With a high CPU core frequency, scaling tapers off sooner.
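
A minimal Python sketch of that effect, taking machine balance as theoretical memory bandwidth divided by peak FLOPS. The configuration is hypothetical: 12 cores on four DDR4-3000 channels (4 x 24 GB/s), with an assumed 16 double-precision FLOPs per cycle per core. The function name `machine_balance` is made up for this example.

```python
def machine_balance(bandwidth_gb_s, cores, freq_ghz, flops_per_cycle):
    """Bytes of memory bandwidth available per peak FLOP."""
    peak_gflops = cores * freq_ghz * flops_per_cycle
    return bandwidth_gb_s / peak_gflops


BANDWIDTH_GB_S = 4 * 24.0   # four DDR4-3000 channels, theoretical peak
CORES = 12                  # hypothetical core count
FLOPS_PER_CYCLE = 16        # assumed double-precision FLOPs per cycle per core

for freq_ghz in (3.0, 4.0):
    balance = machine_balance(BANDWIDTH_GB_S, CORES, freq_ghz, FLOPS_PER_CYCLE)
    print(f"{freq_ghz:.1f} GHz: {balance:.3f} bytes per FLOP")
```

Going from 3 GHz to 4 GHz drops the machine balance from about 0.167 to 0.125 bytes per FLOP here: the same memory has to feed more arithmetic, so the cores starve sooner.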
Cell count has little influence on the code balance. As long as the model is too large to fit mostly into CPU cache, the metric "cell updates per second" is almost unaffected by the total cell count.
Edit: somewhere in my ramblings, I forgot that this was about scaling. Cell count does have an impact here. Models with a lower cell count tend to get less speedup at higher core counts. But this has less to do with memory and is caused by the increasing overhead of parallelization, plus, depending on the code, some serial sections left in it. This is why strong scaling (testing with constant total cell count) and weak scaling (testing with constant cell count per core) are a thing.
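
The textbook way to see the difference between the two kinds of scaling is Amdahl's law (fixed total problem size, i.e. strong scaling) versus Gustafson's law (problem size grows with the core count, i.e. weak scaling). The Python sketch below uses these two idealized formulas only. They model a serial fraction and nothing else, so they ignore communication overhead and memory bandwidth entirely, and the 5% serial fraction is an arbitrary illustrative value, not something measured on a CFD code.

```python
def amdahl_speedup(serial_fraction, cores):
    """Strong scaling: speedup for a fixed total problem size."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


def gustafson_speedup(serial_fraction, cores):
    """Weak scaling: scaled speedup when the problem grows with the core count."""
    return serial_fraction + (1.0 - serial_fraction) * cores


SERIAL_FRACTION = 0.05  # illustrative assumption

for cores in (4, 16, 64):
    strong = amdahl_speedup(SERIAL_FRACTION, cores)
    weak = gustafson_speedup(SERIAL_FRACTION, cores)
    print(f"{cores:3d} cores: strong {strong:6.2f}x, weak {weak:6.2f}x")
```

Even with only a small serial part, the fixed-size case flattens out quickly while the scaled case keeps growing, which is why a small model can look much worse on a large core count than a big one does.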

Habib-CFD · February 6, 2020, 19:49

Thank you for the clear and nice answer.

I have listed some CPUs with the highest performance/price for CFD applications:

AMD TR 1920X-2920X (Affordable, brand new, <400$)
AMD EPYC 7351P to 7551P (Open source, used, <700$)
Intel i9 9900X-10900X-10920X (License limited, <800$)
AMD EPYC 7302P-7402P (Brand new, <1500$)
2X-AMD EPYC 7301 to 7401 (Used, <1800$)
2X-AMD EPYC 7302-7402 (Brand new, top end)
