#1 |
Member
Guy
Join Date: Jun 2019
Posts: 34
Rep Power: 5 ![]() |
Which memory setup would have higher bandwidth on a Naples EPYC machine: 2666 MHz dual rank or 3200 MHz single rank? Assume the motherboard will allow the memory to be overclocked.
What about on a Rome EPYC machine? Thanks
#2 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,306
Rep Power: 44 ![]() ![]() |
You cannot run memory beyond the official spec on these platforms. Even if the motherboard allows you to enter a higher frequency, the CPUs themselves are locked. Maybe there are some ES/QS CPUs floating around with unlocked memory controllers, but I haven't heard of anyone pulling this off.
Since this locks us to DDR4-3200 for Rome and DDR4-2666 for Naples, the only free parameter is memory ranks per channel. 2 is better than 1 for bandwidth.
#3 | ||
Member
Guy
Join Date: Jun 2019
Posts: 34
Rep Power: 5 ![]() |
Quote:
Wendell of Level1Techs overclocked the memory on a 7551 without any issues. It ran at 3200 MHz even though the official spec on Naples is 2666 MHz.
Quote:
How much of a difference does memory rank make in the bandwidth?
#4 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,306
Rep Power: 44 ![]() ![]() |
That's neat, do you have a link to the video/article?
The difference in maximum bandwidth is usually around 15% between 1 and 2 ranks per channel. |
#5 | |
Member
Guy
Join Date: Jun 2019
Posts: 34
Rep Power: 5 ![]() |
Quote:
https://www.youtube.com/watch?v=1ZwRYprMF0w @ 12:00
It turns out that Naples EPYC runs the memory fastest with 1 DIMM per channel of single-rank memory. See Table 4 on page 8: https://developer.amd.com/wp-content.../56301_1.0.pdf
1 DIMM per channel of dual rank has a memory bandwidth of 154 GB/s.
1 DIMM per channel of single rank has a memory bandwidth of 170 GB/s.
Last edited by linuxguy123; January 18, 2022 at 20:27.
#6 | |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,306
Rep Power: 44 ![]() ![]() |
I'm not entirely convinced by that. He briefly mentions that the option is there in the BIOS, and suggests that it will run at DDR4-3200. But that's never shown. I remain skeptical.
I have a modded BIOS on my Supermicro H11DSI that exposes tons of additional options, higher memory frequency being one of them. I was never able to get it to work, and still haven't seen anyone else doing it. Professional overclockers trying to set world records could not do it. Maybe it's a different story on single-socket systems, I don't know about that.
Now for the easy part: the PDF you linked shows two things.
Epyc Naples has staggered memory frequency specs depending on the memory population. That's nothing new: the more ranks per channel, the lower the officially supported memory speed. I am still running DDR4-2666 with 2 ranks per channel on Epyc 7551. As do a lot of other people.
The more important thing to note here is that the tables list maximum theoretical memory bandwidth, calculated via (frequency x 64 bit x 2 x number of memory channels). The whole point here is that one rank per channel cannot get close to that theoretical maximum, even in synthetic benchmarks like STREAM. You need at least 2 ranks per channel for that.
Edit: they even state that right above those tables:
Quote:
|
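For reference, the bandwidth formula from this post can be checked against the two figures quoted from the AMD table earlier in the thread. This is only a minimal sketch: it assumes 8 memory channels per socket (as on Naples) and a 64-bit data bus per channel, and the function name is made up for illustration. That the 154 GB/s and 170 GB/s figures correspond to DDR4-2400 and DDR4-2666 is an inference from the arithmetic, not something stated in the thread.

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s, channels=8, bus_width_bits=64):
    """Theoretical peak memory bandwidth per socket in GB/s (1 GB = 1e9 bytes).

    transfer_rate_mt_s is the DDR4 data rate (e.g. 2666 for DDR4-2666).
    It is already twice the memory clock, so the "x 2" in the formula
    from the post is folded into this number.
    channels=8 and bus_width_bits=64 are assumptions (Naples, per socket).
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mt_s * 1e6 * bytes_per_transfer * channels / 1e9


for rate in (2400, 2666):
    print(f"DDR4-{rate}: {peak_bandwidth_gb_s(rate):.1f} GB/s")
```

These are upper bounds from the interface width and data rate only. They say nothing about how close a given rank configuration gets to them in a benchmark such as STREAM, which is the point being argued here.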
#7 | ||
Member
Guy
Join Date: Jun 2019
Posts: 34
Rep Power: 5 ![]() |
Yes, but that paper specifically states this:
Quote:
I think the quote you gave refers to computations on large data sets that benefit from keeping the entire dataset in RAM versus accessing a disk. Dual rank memory allows more memory per DIMM, all else being equal. Notice the description of that situation is labelled "memory intensive". CFD is bandwidth intensive, not memory intensive.
Quote:
Last edited by linuxguy123; January 18, 2022 at 20:41.
#8 | |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,306
Rep Power: 44 ![]() ![]() |
Quote:
What's our goal here? Arguing for argument's sake? Your first post was about how much performance can be gained by actually overclocking memory on Naples. And now you want to let some obscure "official" spec prevent you from having your cake, and eating it too? Pick a side, mate.
Last edited by flotus1; January 19, 2022 at 06:49.
#9 | ||
Member
Guy
Join Date: Jun 2019
Posts: 34
Rep Power: 5 ![]() |
Quote:
As do the other 9 Naples systems I bought for work. And the thousands of systems that were bought with such memory configurations, because neither the seller nor the buyer knew or cared about this rather obscure limitation.
Quote:
Have you ever benchmarked the same machine back to back with single rank memory versus dual rank memory?
#10 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,306
Rep Power: 44 ![]() ![]() |
That's hardly a personal attack, and definitely was not intended as such. I just don't like it when these discussions keep running in circles, to a point where I have to start doubting the motivation.
It's just not a clear-cut technical decision. Either you are fine with running technically out of spec, or you are not. Then again, the obscure nature of this particular specification means that tons of people are violating it, without knowing about it, and without problems. No, I have not personally run the type of benchmark you suggest. |
#11 | |
Member
Guy
Join Date: Jun 2019
Posts: 34
Rep Power: 5 ![]() |
Quote:
Are you absolutely sure that your memory is running at 2666 MHz? Is there a POST message in the IPMI that states so?
All I want to know is what is faster, single rank or dual rank, so I can buy memory. AMD says single rank. If dual rank memory runs at 2666 MHz, they are probably equal. I'm guessing that not all dual rank memory runs at 2666 MHz.
What evidence do you have that says dual rank memory runs faster than single rank?
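One way to answer the "what speed is it actually running at" question on a Linux box is to read the SMBIOS memory records with `dmidecode -t memory`, which reports a rank count plus a rated and a configured speed per DIMM. The sketch below parses that text output. The function names are made up for illustration, the configured-speed field is assumed to be labelled either "Configured Memory Speed" or "Configured Clock Speed" depending on the dmidecode version, and the values are only as reliable as what the BIOS writes into those tables.

```python
import subprocess


def parse_dimms(dmidecode_text):
    """Return one dict per populated DIMM found in `dmidecode -t memory` output."""
    dimms = []
    # dmidecode separates records with blank lines.
    for block in dmidecode_text.split("\n\n"):
        lines = [line.strip() for line in block.splitlines()]
        if "Memory Device" not in lines:
            continue  # not a per-DIMM record
        fields = {}
        for line in lines:
            key, sep, value = line.partition(": ")
            if sep:
                fields[key] = value
        if fields.get("Size", "").startswith("No Module"):
            continue  # empty slot
        dimms.append({
            "locator": fields.get("Locator"),
            "rank": fields.get("Rank"),
            "rated_speed": fields.get("Speed"),
            # Assumption: the label differs between dmidecode versions.
            "configured_speed": fields.get("Configured Memory Speed")
            or fields.get("Configured Clock Speed"),
        })
    return dimms


def read_dimms():
    """Run dmidecode (usually needs root) and parse its memory records."""
    result = subprocess.run(["dmidecode", "-t", "memory"],
                            capture_output=True, text=True, check=True)
    return parse_dimms(result.stdout)
```

Calling `read_dimms()` as root and comparing `rated_speed` with `configured_speed` for each slot shows whether the platform has clocked a given DIMM down.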
#12 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,306
Rep Power: 44 ![]() ![]() |
Yeah, I'm pretty sure about the memory frequency in my systems.
About that evidence... For lack of a better term, you will have to take my word for it. I have been doing this for quite a while now. Reading technical documentation, blog posts, news articles, benchmarks from professionals and amateurs, discussing with experts, and sporadically running my own benchmarks. What I didn't do is keep a database or list of links to prove what I learned over the years.
To wrap this up from my side: for maximum performance in CFD workloads, use dual-rank memory. One DIMM per channel. And if possible, a motherboard with one DIMM slot per channel.
#13 |
Member
Guy
Join Date: Jun 2019
Posts: 34
Rep Power: 5 ![]() |
#14 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,306
Rep Power: 44 ![]() ![]() |
It is the opposite of how you interpreted what AMD states in their document. That's a difference.
Screenshot_20220119_214734.png
Please note how the text does not mention ranks. Also note that the image caption reads "1 DIMM Per Channel SR/DR RDIMM or LRDIMM operating at 2666 MHz".
This of course leaves the ambiguity between SR and DR. That's where I come in, sharing my expertise with you.
Listen, this topic isn't some dubious bs that I convinced myself of. It is a relatively well-known fact among tech enthusiasts and experts. If you want a different opinion, you will have to ask someone who isn't me.
#15 | |
Member
Guy
Join Date: Jun 2019
Posts: 34
Rep Power: 5 ![]() |
Quote:
According to that diagram, it doesn't matter if the memory is SR or DR. Which is what I said a few posts above. Worst case, as far as I can tell, there is no performance penalty for using single rank memory on memory bandwidth bound processes, such as CFD. |
#16 |
New Member
Francisco
Join Date: Sep 2018
Location: Portugal
Posts: 27
Rep Power: 6 ![]() |
Some food for thought: I'm very far from an expert on this topic, so don't ask me for many details on this, but I think that saying that 1R is exactly the same as 2R is not fully accurate.
There is at least one potential advantage of 2R at the same frequency, which is rank interleaving: https://en.wikipedia.org/wiki/Interleaved_memory. I have no idea how much of an improvement (if any) this would provide to an Epyc-based build for CFD workloads in particular, or if this technology is even included in Epyc systems, but it could help @flotus1's case. Maybe you already knew about this too. Still, I thought it could be a lead worth investigating.
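To make the idea of rank interleaving concrete, here is a deliberately crude toy model, not a description of how EPYC's memory controller actually maps addresses. It assumes consecutive cache lines are spread round-robin over the ranks of one channel, that a burst occupies the shared data bus for `burst` cycles, and that a rank needs `recovery` idle cycles before it can serve its next burst. All names and cycle counts are invented for illustration, and banks and bank groups (which already provide overlap within a single rank) and rank-switch penalties are ignored.

```python
def stream_cycles(n_lines, n_ranks, burst=4, recovery=6):
    """Cycles needed to stream n_lines cache lines over one channel (toy model).

    Lines are assigned to ranks round-robin. The data bus carries one burst
    at a time; each rank is unavailable for `recovery` cycles after its burst.
    """
    rank_ready = [0] * n_ranks  # earliest cycle each rank can start a burst
    bus_free = 0                # earliest cycle the shared data bus is free
    for i in range(n_lines):
        rank = i % n_ranks
        start = max(bus_free, rank_ready[rank])
        end = start + burst
        bus_free = end
        rank_ready[rank] = end + recovery
    return bus_free


for ranks in (1, 2):
    print(ranks, "rank(s):", stream_cycles(1000, ranks), "cycles")
```

With a second rank, part of one rank's recovery time is hidden behind the other rank's burst, so the bus idles less. The size of the gap in this toy is an artifact of the made-up parameters and is far larger than the roughly 5 to 15% differences mentioned elsewhere in this thread for real hardware.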
#17 |
Member
Guy
Join Date: Jun 2019
Posts: 34
Rep Power: 5 ![]() |
EPYC processors do not use interleaved memory. If they did, the dual rank and dual DIMM setups would be faster than the single rank.
#18 |
Member
Erik Andresen
Join Date: Feb 2016
Location: Denmark
Posts: 32
Rep Power: 9 ![]() |
At https://www.spec.org/cpu2017/results/ various computer brands present their systems for a fixed suite of test cases. They want their systems to perform well compared to the competition. The memory they use is (nearly) always either dual rank or quad rank. For Epyc I think quad rank performs the best, but only with a slight edge over dual rank. Some years ago, I saw two identical Epyc systems (except for memory) where one system had single rank memory and the other had dual rank. The dual rank system was about 5 to 10% faster in some test cases. It was long ago, so I think it was a system with an Epyc 7??1.
#19 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,306
Rep Power: 44 ![]() ![]() |
Moved the discussion to its own thread, because it got a bit too long and doesn't quite match the topic of the thread where it originated.
#20 |
Senior Member
Join Date: May 2012
Posts: 534
Rep Power: 15 ![]() |
While I agree with the general consensus in the community regarding rank2 vs rank1 explained by @flotus1, I think this is a bit muddy as well. In normal consumer (enthusiast) systems you will get vastly different memory results from memory kits that are similar on paper, due to the sometimes large difference in memory sub-timings applied by the motherboard and XMP.
As such, it is very difficult to make a straight-up comparison. If you push the memory controller to the maximum, then I suspect that you will reach similar results regardless of configuration. This comes from the observation that you can usually run single rank memory at more than 10% higher frequency compared to dual rank (same goes for 1 kit per memory channel compared to 2 kits per memory channel).
If two different kits (one single rank and one dual rank) are tested at the same timings and frequency, then I would put my money on the dual rank system performing better. But as stated above, if both kits are pushed to the maximum, then I am not sure.
Since server systems will not run above a certain memory frequency, this usually becomes a moot point, unless dual rank configurations on a specific motherboard force the sub-timings to be really poor. Most server motherboard BIOSes will not give you any options to change sub-timings.
Anyways, I have never benchmarked server systems myself. I have just listened to the collected wisdom and left the benchmarks to my consumer platform at home.