Home > Forums > General Forums > Hardware

Intel 6154 vs 2687w v2 (with a bit on 6146)


May 3, 2019, 12:54
New Member
Joshua Brickel
Join Date: Nov 2013
Posts: 26
Okay, so I've gotten an opportunity to upgrade and thus test out some fancy hardware: a dual Xeon Gold 6154 (plus a bit with a dual Xeon Gold 6146), compared against a Xeon 2687w v2.

How I ran my tests:
Simulation with 2.7 million nodes.
  • Core counts: 14 cores on the 2687w v2; 14-34 cores on the 6154; 22 cores on the 6146.
  • OS: Windows 10
  • Program: CFX R2019R1
  • I never ran at full core count as I still need to be able to interact with the machines.
The speed-up for the same number of cores is impressive; see the first attachment (by the way, is there a way for me to insert this in-line?).

It can be shown that, assuming the processors are evenly loaded, Amdahl's formula as reported by flotus1 does indeed predict the speed-up fairly accurately from 14 to 26 cores (7 to 13 cores/processor), with p = 0.97 as suggested by flotus1. I normalized the results to 7 cores per processor. (Note: I did not run fewer cores because of how long each iteration took.)

This can be seen in the second attachment.
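The fit described above is easy to reproduce. A minimal sketch (assuming Amdahl's law S(n) = 1/((1-p) + p/n) with p = 0.97 as suggested by flotus1, normalized to the 14-core, i.e. 7 cores/processor, baseline used in the plots):

```python
def amdahl_speedup(n, p=0.97):
    """Amdahl's law: speedup over serial execution on n cores,
    with parallel fraction p."""
    return 1.0 / ((1.0 - p) + p / n)

def predicted_speedup_vs_14(n, p=0.97):
    """Speedup relative to the 14-core (7 cores/processor) baseline."""
    return amdahl_speedup(n, p) / amdahl_speedup(14, p)

# Predicted relative speedup at the core counts tested in the thread
for n in (14, 20, 26, 30, 34):
    print(n, round(predicted_speedup_vs_14(n), 3))
```

The observed deviation above 26 cores is then the gap between these predictions and the measured iteration times, which is where the memory-bandwidth ceiling shows up.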

Beyond 13 cores/processor the Amdahl formula significantly over-predicts. I believe the reason is memory bandwidth limitations: the resource monitor reports both NUMA nodes at 90% usage when running on 26 cores (13 cores/processor) and at 100% on 30 cores. Why things got a little better at 34 cores I'm not sure; my working hypothesis is that the system occasionally migrates work between cores, which costs some overhead, and the more cores in use, the less often it actually switches.

When I ran the 6146 at 22 cores I got essentially the same results as the 6154 at 22 cores, despite the 6146 showing a 0.2 GHz higher clock rate in Task Manager. A bit strange, but perhaps there was some small difference in the setup of the two computers.

So it would appear that for this generation of Intel processors, 14 cores/processor is about the maximum one should go with for memory-intensive applications like CFD. And yes, flotus1 is correct about the diminishing returns on additional cores, but the cores do help until one hits the memory bandwidth limit of the NUMA nodes.

Finally, one caveat: the 6154 I tested was in a Dell Precision 7820, which has only one memory stick per channel. According to a paper published by Intel and Lenovo, the performance difference between one and two sticks per memory channel is only about 3%, so I don't think this should make a huge difference; but if anyone can test on a two-stick system, I would be happy to hear about it. All memory channels on my machine are occupied.

I hope this is of some interest to the people on this forum. I don't do these benchmarks for a living, so take them with a grain of salt.
Attached Images
File Type: jpg 6152_Speed_up_over_2687wV2.JPG (26.6 KB, 38 views)
File Type: jpg Testing_vs_Amdhal.JPG (21.5 KB, 29 views)

May 5, 2019, 11:08
Super Moderator
flotus1's Avatar
Join Date: Jun 2012
Location: Germany
Posts: 3,400
I would like to emphasize that Amdahl's law is not ideal for modeling the performance of bandwidth-limited applications. It can give a fairly good fit on single nodes when the parallel portion is adjusted accordingly. But it is supposed to model something entirely different: code that is limited by a portion of serial execution. I used it in the context of a purchase guide to stress one point: don't go overboard with expensive high-core-count CPUs for CFD.
For bandwidth-limited applications, the roofline model would be more appropriate. But for real-world CFD codes it also has its shortcomings. It is over-simplified yet still difficult to use.
When applying a best fit of Amdahl's law to bandwidth-limited codes, your observation was to be expected: it under-estimates performance for low core counts and over-estimates it for higher core counts.

As usual, the reality lies somewhere in between...
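The roofline idea mentioned above can be sketched in a few lines. This is only an illustration of the model itself; the peak FLOP rate, bandwidth, and arithmetic-intensity values below are hypothetical placeholders, not measurements of the CPUs discussed in this thread:

```python
def roofline_gflops(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Attainable performance under the roofline model: whichever
    ceiling is hit first, peak compute or memory traffic.

    arithmetic_intensity: FLOPs performed per byte moved (FLOP/byte)
    """
    return min(peak_gflops, bandwidth_gbs * arithmetic_intensity)

# Hypothetical machine: 1000 GFLOP/s peak, 100 GB/s memory bandwidth.
# Typical CFD kernels have low arithmetic intensity, so they sit on
# the sloped (bandwidth-limited) part of the roof, far from peak compute.
for ai in (0.1, 0.5, 10.0):
    print(ai, roofline_gflops(ai, 1000.0, 100.0))
```

This is why adding cores stops helping once the bandwidth roof is reached: the ceiling depends on memory traffic, not on how many cores are issuing instructions.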

Quote:
Originally Posted by JoshuaB
(by the way is there a way for me to insert this in-line?)
You would have to use an image hosting service like imgur. Then you can copy the BBCode directly to your post.

