CFD Online Discussion Forums

Xeon workstation: suggestions needed (https://www.cfd-online.com/Forums/hardware/172148-xeon-workstation-suggestions-needed.html)

bindesboll August 8, 2016 08:18

Quote:

Originally Posted by digitalmg (Post 612005)
Dear Kim
Would you please explain why choosing the Xeon E5-2667 V4 is a waste of memory bandwidth?
For that CPU, the maximum memory bandwidth is 76.8 GB/s across 8 cores, so each core would get 9.6 GB/s when the CPU is fully utilized. That is close to the bandwidth of most DDR4 RAM modules on the market.
Please explain more.
Best Regards

The waste of memory bandwidth is related to whether the CPU can utilize the bandwidth or not. If CPU performance is low and memory bandwidth is high, then you should consider choosing a faster CPU to utilize the memory bandwidth.

When I was specifying my present cluster, the professional advice was that the bandwidth of 1867 MHz DDR3 RAM matched the E5-2667 v3 (octo-core). Assuming this is correct, a well-balanced system should have roughly the same ratio of CPU performance (fprate) to memory bandwidth (RAM speed) as that combination.

2667 v3, fprate: 590 (dual processor), RAM speed: 1867 MHz, ratio (590/1867): 0.32

2667 V4, fprate: 724, RAM speed: 2400 MHz, ratio: 0.30
2687W v4, fprate: 888, RAM speed: 2400 MHz, ratio: 0.37
2683 V4, fprate: 933, RAM speed: 2400 MHz, ratio: 0.39

(The above assumes that the memory configuration is identical in all cases, except for the RAM frequency.)

As can be seen, the 2667 v4 has a slightly lower ratio (0.30 < 0.32), which means that the memory bandwidth might not be fully utilized. The others have higher ratios, meaning that the CPUs are most likely not fully utilized and could benefit from more memory bandwidth.
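The ratios above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in this post: the fprate and RAM speed figures are the ones quoted above, while the function names and the wording of the verdicts are made up for illustration.

```python
# fprate scores and RAM speeds (MHz) as quoted in the post above.
BASELINE = ("2667 v3", 590, 1867)  # pairing described as well matched
CANDIDATES = [
    ("2667 v4", 724, 2400),
    ("2687W v4", 888, 2400),
    ("2683 v4", 933, 2400),
]


def balance_ratio(fprate, ram_mhz):
    """CPU throughput score divided by RAM frequency."""
    return fprate / ram_mhz


def verdict(ratio, baseline_ratio):
    """Hypothetical labels; the comparison rule is the one used in the post."""
    if ratio < baseline_ratio:
        return "bandwidth may be underused"
    if ratio > baseline_ratio:
        return "CPU may be bandwidth-starved"
    return "balanced"


if __name__ == "__main__":
    base = balance_ratio(BASELINE[1], BASELINE[2])
    print(f"{BASELINE[0]:>9}: {base:.2f} (baseline)")
    for name, fprate, ram_mhz in CANDIDATES:
        r = balance_ratio(fprate, ram_mhz)
        print(f"{name:>9}: {r:.2f} -> {verdict(r, base)}")
```

Like the table, this treats RAM frequency as a stand-in for bandwidth, so it only holds when the number of channels and DIMMs is the same in every case.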

BR
Kim

Yanni August 8, 2016 19:39

Thanks for your help, Robert!
After some research I found the Noctua NH-D15s, which should do the job.
Core affinity: very interesting point. I'll try that once my system is set up.
Thanks!

digitalmg September 2, 2016 11:47

Dear Kim,
I cannot follow your method for relating memory bandwidth to performance. The 2667 v3 should be compared with itself when you change the memory type, not with the 2667 v4, because fprate is affected by CPU performance too.

Another question that arises here: if fprate results are applicable to most CFD cases, then the E5-2699 V4 has the highest rating in the 2600 family, so the memory-bandwidth-per-core argument is completely wrong for this case. It has 22 slower cores sharing the same 76.8 GB/s bandwidth as the rest of its family.
This is in obvious contrast with the usual HPC cluster reasoning.

How do you interpret it?
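For reference, the per-core bandwidth figures behind this question can be worked out as below. This is a rough sketch, not a measurement: it assumes the 76.8 GB/s figure comes from four channels of DDR4-2400 at 8 bytes per transfer (that breakdown is not stated in the thread), and the function names are hypothetical.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s, bytes_per_transfer=8):
    """Theoretical peak memory bandwidth in GB/s."""
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


def bandwidth_per_core(total_gbs, cores):
    """Even share of the socket bandwidth when all cores are busy."""
    return total_gbs / cores


if __name__ == "__main__":
    # Assumption: 4 channels x 2400 MT/s x 8 bytes = 76.8 GB/s per socket.
    total = peak_bandwidth_gbs(4, 2400)
    for name, cores in [("E5-2667 V4", 8), ("E5-2699 V4", 22)]:
        share = bandwidth_per_core(total, cores)
        print(f"{name}: {cores} cores, {share:.2f} GB/s per core")
```

With these assumptions the 8-core part gets 9.6 GB/s per core and the 22-core part about 3.5 GB/s per core, which is the gap the question is pointing at.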

RobertB September 2, 2016 16:29

The 2699 scores 1090 and has 22 cores

A 2690 scores 922 and has 14 cores

In this case ~57% more cores give ~18% more performance.

The 2699 costs twice as much as the 2690 and would cost ~$4,000 more per motherboard.

The issue is that there is a trade-off between absolute performance and performance per dollar. Part of this will depend on the licensing scheme of the code you use, i.e. whether the number of threads is limited or not.

Additionally, the use of fprate as an arbiter of relative performance may need to be examined as you approach the limits of CPU/memory bandwidth, since some of the SPEC applications may be less memory hungry than CFD, which would inflate the relative score.
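The core-count, score and price comparison above can be checked with a short Python sketch. The scores and core counts are the ones quoted in this post, and the price is taken only as the stated "twice as much" ratio; the function name is hypothetical.

```python
def relative_gain(new, old):
    """Fractional increase of new over old, e.g. 0.18 means 18% more."""
    return new / old - 1.0


# Figures quoted in the post: (fprate score, cores, relative price).
CPU_2690 = {"score": 922, "cores": 14, "rel_price": 1.0}
CPU_2699 = {"score": 1090, "cores": 22, "rel_price": 2.0}  # "costs twice as much"

core_gain = relative_gain(CPU_2699["cores"], CPU_2690["cores"])
score_gain = relative_gain(CPU_2699["score"], CPU_2690["score"])

# Score per unit of relative price, normalised so the 2690 equals 1.0.
value_2690 = CPU_2690["score"] / CPU_2690["rel_price"]
value_2699 = CPU_2699["score"] / CPU_2699["rel_price"]
value_ratio = value_2699 / value_2690

if __name__ == "__main__":
    print(f"cores: +{core_gain:.0%}, score: +{score_gain:.0%}")
    print(f"score per price unit, 2699 vs 2690: {value_ratio:.2f}")
```

On these numbers the 2699 delivers roughly 0.6 of the 2690's score per price unit, before any per-core licensing cost is considered.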


All times are GMT -4.