#1
New Member
Tarık Yaman
Join Date: Dec 2021
Location: Turkey
Posts: 9
Rep Power: 3
Hello everyone,

We have shortlisted three processors for our workstations, which will run in a dual-CPU configuration. Software: Ansys CFX, Fluent, Mechanical. HPC licensing is not a problem. The candidates are:

- Xeon Gold 2nd Gen 5218R
- Xeon Gold 2nd Gen 5220R
- Xeon Gold 2nd Gen 6230R

To use these processors at full performance, that is, to avoid bottlenecks, we need to install the optimal memory. The 6230R supports up to 6 channels of DDR4-2933 memory (max memory size 1 TB). The 5220R and 5218R support up to 6 channels of DDR4-2667 memory (max memory size 1 TB).

My first question: We will use two CPUs in our workstation. With two CPUs, can we multiply the max memory size by 2? So if we have 2 TB of RAM on the motherboard, can we use it efficiently?

My second question: If there are 16 RAM slots on the motherboard and all of them are populated, will we get octa-channel performance from the RAM? Will this be good or bad?

There are some calculations on the internet about theoretical maximum memory bandwidth and processor compatibility. For example:

Memory: 12 x 64 GB 2933 MHz DDR4 ECC
Memory type: DDR4
Frequency: 2933 MHz
Number of channels: 6 (hexa-channel)
2933 * 8 (bytes per transfer) * 6 ≈ 140800 MB/s ≈ 140 GB/s
CPU bus speed: 10.4 GT/s (Intel Ultra Path Interconnect, Gold 6230R)
10.4 GT/s = 83.2 GB/s

My third question: Do we need to multiply the bus speed by 2 when we use dual processors? If we do, will the memory bandwidth (140 GB/s) be insufficient, since 83.2 * 2 = 166.4 GB/s?

My fourth question: If there are 16 RAM slots on the motherboard and all of them are filled, and the number of channels is 8, will the RAM bandwidth be 186.6 GB/s?

Thank you.
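The bandwidth arithmetic in this post can be written out as a small sketch. It assumes the usual rule of thumb for DDR memory: peak bandwidth per socket = transfer rate (MT/s) x 8 bytes per transfer (one 64-bit channel) x number of channels. The function name and the use of the nominal 2933 MT/s rating are illustrative assumptions; real sustained bandwidth is lower than this theoretical peak.

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s: float, channels: int,
                        bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth of ONE CPU socket in GB/s.

    bytes_per_transfer=8 assumes a 64-bit wide DDR4 channel.
    """
    return transfer_rate_mt_s * bytes_per_transfer * channels / 1000.0


# One Xeon Gold 6230R socket: 6 channels of DDR4-2933
six_channel = peak_bandwidth_gb_s(2933, 6)

# The hypothetical 8-channel case asked about in the fourth question
eight_channel = peak_bandwidth_gb_s(2933, 8)

print(f"6 channels: {six_channel:.1f} GB/s per socket")
print(f"8 channels: {eight_channel:.1f} GB/s per socket")
```

With the nominal rating this gives roughly 140.8 GB/s for six channels, in line with the 140 GB/s figure above. The number of channels is fixed by the CPU's memory controller, not by the number of slots on the motherboard.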
#2
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,343
Rep Power: 45
Quote:
No, 2TB of memory cannot be used effectively with the CPUs you picked. They have 6-channel memory controllers, so there is no way to get a balanced memory population of 2TB on 12 memory channels.
Quote:
Best case: you only fill the correct 12 slots and have a performance impact that is barely measurable. It stems from the trace layout of the DIMM slots. Worst case: you fill all 16 DIMM slots and end up with weird and confusing performance issues.
Side-note here on socket interconnects:
Quote:
But it doesn't matter a whole lot anyway: that's non-uniform memory access, which should be avoided for memory-sensitive applications. With the system configured correctly (no socket interleaving), Fluent and CFX mostly access local memory, with very limited traffic between sockets.
Quote:
Two CPUs effectively double the available memory bandwidth.
Quote:
With that out of the way: you are obviously putting a lot of thought into this. So let me give you some pointers beyond your original questions.

1) IF you are determined to use one of these CPUs, get a compatible motherboard with 12 DIMM slots. And fill each of the slots with identical DIMMs. E.g. 12x16GB, 12x32GB, 12x64GB...

2) Also check the memory support section of your motherboard. I am reading between the lines that you want to use A LOT of memory, in the TB range. You probably need LRDIMM for that. May I ask why you need so much memory?

3) There are better CPUs available for your intended application, from both Intel and AMD. Intel has their newer "Ice Lake" CPUs, which have 8-channel DDR4-3200 memory controllers. And AMD has Epyc "Milan", "Milan-X" (with larger L3 caches, perfect for CFD and FEA) as well as the latest Epyc "Genoa", which supports 12-channel DDR5-4800. The latter is quite expensive though, especially if you need a lot of RAM. A side-effect: all of these CPUs support much larger memory capacity than the "Cascade Lake" CPUs you picked.
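The balanced-population argument can be checked with a short sketch. It assumes "balanced" means the same number of identical DIMMs on every memory channel, a set of common power-of-two DIMM sizes, and at most two DIMMs per channel. These assumptions and all names below are illustrative; the 2 x 1 TB ceiling comes from the per-CPU limit quoted in the first post.

```python
DIMM_SIZES_GB = (16, 32, 64, 128, 256)  # assumed common module sizes


def is_balanced(dimm_count: int, sockets: int, channels_per_socket: int) -> bool:
    """True if identical DIMMs can be spread evenly over all channels."""
    total_channels = sockets * channels_per_socket
    return dimm_count > 0 and dimm_count % total_channels == 0


def balanced_capacities_gb(sockets: int, channels_per_socket: int,
                           max_dimms_per_channel: int = 2,
                           max_total_gb: int = 2048) -> list:
    """All total capacities reachable with a balanced population."""
    total_channels = sockets * channels_per_socket
    capacities = set()
    for dimms_per_channel in range(1, max_dimms_per_channel + 1):
        for size_gb in DIMM_SIZES_GB:
            total_gb = total_channels * dimms_per_channel * size_gb
            if total_gb <= max_total_gb:
                capacities.add(total_gb)
    return sorted(capacities)


# Dual socket, 6 channels each = 12 channels in total
print(is_balanced(16, sockets=2, channels_per_socket=6))  # 16 DIMMs
print(is_balanced(12, sockets=2, channels_per_socket=6))  # 12 DIMMs
print(balanced_capacities_gb(sockets=2, channels_per_socket=6))
```

Under these assumptions, 16 DIMMs cannot be spread evenly over 12 channels, and 2048 GB does not appear among the balanced capacities, while 768 GB (12x64GB) does.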
#3
New Member
Tarık Yaman
Join Date: Dec 2021
Location: Turkey
Posts: 9
Rep Power: 3
Thank you for your response, Alex.
Quote:
Many websites recommend at least 2 GB of RAM per 1 million cells and at least 8 GB of memory per core (see: https://www.ansys.com/blog/hardware-...ate-simulation). But the current plan seems to be 12 x 64 GB (768 GB) of memory. We do not want to run into an insufficient-memory error.
Quote:
-- The reason I pay so much attention to memory bandwidth and processor compatibility: I have a workstation of my own with 2 x Xeon E5-2682 v4 and 8 x 32 GB 2400 MHz memory (32 cores and 64 threads in total, 256 GB RAM, quad-channel per CPU). When running analyses in CFX, I get the fastest result with the partition value set to 16. Setting it to 64 almost doubles the run time. The official Ansys website says: "Selecting a processor with the highest number of cores is usually not recommended because it can negatively affect memory bandwidth if the CPU memory isn’t increased along with the core count. A large number of cores may decrease the performance of CFX, Fluent and LS-DYNA, which usually run on large clusters." End-
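The "2 GB per million cells" rule of thumb mentioned above can be turned into a quick sizing sketch. The constant is the lower-bound figure quoted in this post, not a measured value. Actual memory use depends on the solver, the physical models, and the precision, so treat the result as a rough estimate only. The function names are illustrative.

```python
GB_PER_MILLION_CELLS = 2.0  # rule-of-thumb lower bound quoted in the post


def estimated_memory_gb(cell_count: int) -> float:
    """Rough minimum RAM estimate for a mesh with cell_count cells."""
    return cell_count / 1e6 * GB_PER_MILLION_CELLS


def max_cells_for_memory(memory_gb: float) -> float:
    """Largest mesh (in cells) that fits into memory_gb under the same rule."""
    return memory_gb / GB_PER_MILLION_CELLS * 1e6


print(f"50 million cells: about {estimated_memory_gb(50_000_000):.0f} GB")
print(f"768 GB: about {max_cells_for_memory(768) / 1e6:.0f} million cells")
```

Under this rule, 768 GB would cover meshes of up to roughly 384 million cells.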
#4
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,343
Rep Power: 45
No need to explain the importance of memory bandwidth to me.
Recommendations for "amount of memory per core" can be ignored. Maybe that was important 20 years ago. But all you need is enough memory to fit your largest models.
Quote:
$12,500 for parts will get you much better hardware than paying the same amount for an OEM workstation, just to state the obvious. If you buy from an OEM or SI and you need tons of memory, you can make a lot of room in your budget by getting a minimal memory configuration and upgrading the RAM yourself. DDR4 is ridiculously cheap these days: less than 3€/GB for DDR4-3200 reg ECC https://geizhals.eu/samsung-rdimm-64...7.html?hloc=de
OEMs will charge much more than that.
Either way, I would still recommend you get more recent CPUs: either Intel Xeon "Ice Lake" (Xeon Gold 63xx) or AMD Epyc "Milan" (Epyc 7xx3). They both have 8-channel DDR4-3200 memory controllers and newer/faster cores than Cascade Lake. Which CPUs exactly can be squeezed into your budget depends on where you buy.
Quote:
#5
New Member
Tarık Yaman
Join Date: Dec 2021
Location: Turkey
Posts: 9
Rep Power: 3
Quote:
1 x AMD Epyc 7763 & 8 x 64 GB 3200 MHz DDR4
What do you think of the above configuration? Can we use all cores at full performance? Isn't the memory bandwidth insufficient for this processor? Will this cause a bottleneck if all cores are used?
#6
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,343
Rep Power: 45
64 cores is too much for these CPUs. 2x32 cores would be much better. Even 2x24 cores would be preferable.
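One way to see why the core count per socket matters is to look at theoretical memory bandwidth per core. The sketch below assumes 8-channel DDR4-3200 per socket (as stated earlier in the thread for Epyc "Milan") and 8 bytes per transfer. The function names are illustrative, and the figures are theoretical peaks, not measured solver performance.

```python
def socket_bandwidth_gb_s(transfer_rate_mt_s: float = 3200,
                          channels: int = 8) -> float:
    """Theoretical peak bandwidth of one socket, 8 bytes per transfer."""
    return transfer_rate_mt_s * 8 * channels / 1000.0


def bandwidth_per_core_gb_s(cores_per_socket: int) -> float:
    """Peak bandwidth available to each core if all cores are busy."""
    return socket_bandwidth_gb_s() / cores_per_socket


# (sockets, cores per socket) for the configurations discussed
for sockets, cores in [(1, 64), (2, 32), (2, 24)]:
    total = sockets * socket_bandwidth_gb_s()
    per_core = bandwidth_per_core_gb_s(cores)
    print(f"{sockets} x {cores} cores: {total:.1f} GB/s total, "
          f"{per_core:.2f} GB/s per core")
```

With the same 64 cores in total, the dual-socket 2x32 layout has twice the memory channels, so each core gets about 6.4 GB/s instead of about 3.2 GB/s.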
#7
New Member
Tarık Yaman
Join Date: Dec 2021
Location: Turkey
Posts: 9
Rep Power: 3
Dear Flotus1, your advice is very valuable to us.
Thanks to your advice, we are looking at newer processors, especially 4th gen AMD processors (9004 series). They stand out from competitors with their core frequencies, cache sizes, and the number and type of memory channels they support. We put together some configurations within our budget. We'd love to hear from you and from other readers (if there are any, which I doubt).

- 2 x 9174F & 24 x 32 GB DDR5 4800 MHz
- 2 x 9274F & 24 x 32 GB DDR5 4800 MHz
- 2 x 9254 & 24 x 32 GB DDR5 4800 MHz

**[Why is the 9274F more expensive than the 9174F? Everything is the same, just a slight change in core frequencies.]
In addition, why and how important is the L3 cache size? Is there a really noticeable difference between 128 and 256 MB?
#8
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,343
Rep Power: 45
I recently wrote a buyers guide specifically for AMD Genoa: AMD Epyc 9004 "Genoa" buyers guide for CFD
If your budget allows it, I would definitely recommend the versions with 256MB of L3 cache, i.e. 8 active CCDs. Before dropping down to the lower tier SKUs with 128MB L3, I think it is better to look for discounted higher-end Milan systems.
Quote:
Nothing we need to worry about, just avoid the 9174F.
While they are the latest and greatest, Genoa with a lot of RAM will be way over your original budget. No problem if you can stretch your budget this far. A good compromise is Milan-X with the increased L3 cache. The 32-core 7573X is kind of good value these days at less than 4000€ retail. And you can use cheap DDR4 memory with it.
#9
New Member
Tarık Yaman
Join Date: Dec 2021
Location: Turkey
Posts: 9
Rep Power: 3
We decided to buy a system with dual 75F3 (2 x 32 cores, 2 x 64 threads). We will install 16 x 64 GB in this system. We chose the 75F3 over the 74F3 because it fits our budget better. Although the core count is higher, the base frequency is lower. The cache sizes are the same.
Hopefully we can use all the cores effectively and not experience bottlenecks. Dear Flotus, thank you for all your advice. I will share the benchmark results with you, and if there are any tests you would like to see, I will share those results as well.
#10
Senior Member
Erik
Join Date: Feb 2011
Location: Earth (Land portion)
Posts: 1,154
Rep Power: 22
Flotus1 mentioned this already, but it may have been overlooked. When you ran 64 threads for the benchmark, you were using twice as many threads as you have physical cores. You do not want to run on virtual cores. It would be beneficial to disable hyperthreading in the BIOS so the OS only sees the 32 physical cores. On AMD chips, the BIOS calls this setting SMT instead.
#11
Senior Member
Blanco
Join Date: Mar 2009
Location: Torino, Italy
Posts: 193
Rep Power: 16
Hi all,
thanks for the interesting discussion. Flotus, I have a question:
Quote:
#12
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,343
Rep Power: 45
75F3: 4600€ https://geizhals.eu/amd-epyc-75f3-10...-a2491883.html
7573X: 3700€ https://geizhals.eu/amd-epyc-7573x-1...-a2697336.html
Even if it were the other way round due to regional prices or availability, the 7573X would still be worth it. At the high end, factoring in all platform costs, it is still the best price/performance CPU.
Yes, the huge L3 caches make the 7573X the faster CPU for CFD and FEA. By how much exactly depends on the case. But definitely enough to warrant 1000€ more per CPU, if you are shopping in that price region anyway.
Tags
bandwith, cpu memory