|
2990wx falls far behind the dual way E5-2696V4 system in CFD simulations |
|
December 21, 2018, 21:31 |
|
#1 |
New Member
Guangyu Zhu
Join Date: May 2013
Posts: 12
Rep Power: 13 |
The AMD platform is:
CPU: 2990WX (32 cores, 64 threads)
RAM: 64GB of DDR4-2400 (16GB*4)
SSD: INTEL 960P 512GB (NVMe)
GPU: GTX 1080Ti

The Intel platform is:
CPU: E5-2696V4*2 (22 cores, 44 threads per CPU)
RAM: 128GB of DDR4-2400 RECC (16GB*8)
SSD: SAMSUNG 860evo (SATA)
GPU: GTX 1060Ti

Both platforms run Windows 10 Pro.

The 2990WX has 4 dies in the CPU. Two of them (die 0 and die 2) connect to the RAM directly, and the other two access the RAM through die 0 and die 2 respectively. The review from PCWorld indicated that the per-core bandwidth of the 2990WX is only 2GB/s when all cores are used, so a noticeable memory access delay is to be expected in this situation. The core-to-core bandwidth when using 2 dies (16 cores) is 5GB/s.

In my case, I used 16 cores (32 threads) on both platforms to solve CFD cases of 3M and 5M elements in SimVascular, and the Intel platform is almost 3 times faster than the AMD platform. I also tried running the simulations in Ubuntu 18.04 on the AMD platform, and it still falls far behind the Intel one.

What will help to improve the performance of the 2990WX platform? Will performance improve if I use higher-frequency RAM (DDR4-2666 or 3600)? Or should I populate all 8 DIMM slots?

Suggestions are appreciated!

-----------------
Link to PCWorld's review: https://www.pcworld.com/article/3298...rformance.html
|
December 22, 2018, 09:14 |
|
#2 |
Senior Member
Robert
Join Date: Jun 2010
Posts: 117
Rep Power: 16 |
The Intel system has at least twice the memory bandwidth of the AMD system, and all of its CPUs are directly linked to the memory.

It is unclear why you expect a sub-$2000 processor to be competitive with 2x $4000 processors. For CFD you should be using an Epyc processor, as it has a much higher-bandwidth memory system and a better internal architecture to use that system. Threadrippers are for high-CPU/low-memory tasks, and CFD isn't one of those.

In terms of what you can do:
Turn off hyperthreading, as it is rarely helpful in CFD applications.
Set core affinity for the processes so the code actually uses the cores attached to the memory and the threads use the cache more efficiently.
On the Intel system, if the BIOS has a cache mode, you can try changing that. Anandtech found 'local', iirc, was 20% better on OpenFOAM.
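As a rough check on the bandwidth comparison above, here is a minimal sketch of the theoretical peak DRAM bandwidth of both machines. The channel counts are assumptions that the thread does not state: 4 populated channels on the 2990WX (one per 16GB DIMM) and 4 channels per socket on the dual Xeon, each channel being 64 bits wide. These are theoretical peaks, not measured values.

```python
# Theoretical peak DRAM bandwidth = transfers/s * bytes per transfer * channels.
# Assumptions (not stated in the thread): each DDR4 channel is 64 bits wide,
# the 2990WX runs 4 channels, the dual Xeon runs 4 channels per socket (8 total).

BYTES_PER_TRANSFER = 8  # one 64-bit DDR4 channel moves 8 bytes per transfer


def peak_bandwidth_gbs(mega_transfers_per_s, channels):
    """Return the theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


amd_total = peak_bandwidth_gbs(2400, channels=4)    # DDR4-2400, quad channel
intel_total = peak_bandwidth_gbs(2400, channels=8)  # DDR4-2400, 2 x quad channel

amd_per_core = amd_total / 32      # all 32 cores of the 2990WX active
intel_per_core = intel_total / 44  # all 2 x 22 Xeon cores active

print(f"AMD:   {amd_total:.1f} GB/s total, {amd_per_core:.2f} GB/s per core")
print(f"Intel: {intel_total:.1f} GB/s total, {intel_per_core:.2f} GB/s per core")
```

Under these assumptions the dual Xeon has twice the peak bandwidth of the 2990WX, and the 2.4 GB/s per-core peak on the AMD side is in the same range as the roughly 2GB/s figure quoted from the PCWorld review.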
|
December 22, 2018, 10:16 |
|
#3 | |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,400
Rep Power: 47 |
First things first: even when configured correctly the TR2990WX will be much slower than the dual-Intel system in parallel CFD.
What you need to change to get better results:
It has been mentioned quite a few times on this forum: the TR2990WX is a pretty sub-optimal CPU, especially for CFD and similar workloads. So I kind of hope that you bought it mainly for a different kind of application.
If this machine is mainly for CFD, here is what I would recommend: sell the CPU and RAM and get a TR2950X instead, with some really fast RAM certified to run with TR CPUs. 4x16GB DDR4-3200 at the very least. Or, if 5M cells is a typical problem size for you, 4x8GB might be enough. |
||
December 23, 2018, 10:11 |
|
#4 | |
New Member
Guangyu Zhu
Join Date: May 2013
Posts: 12
Rep Power: 13 |
Thanks for the suggestions! The AMD platform was initially built for some machine learning tasks. Occasionally I use it for CFD analysis, but I didn't realize the role of memory bandwidth until I read the posts in this forum. I traced the core utilization in Windows 10, and the latest update seems to have optimized the task distribution: when using 16 cores, the cores on the dies that have a direct RAM link were utilized. However, as you mentioned, the performance is still very poor compared with the dual Xeon system. |
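For the Ubuntu runs, the same placement can be requested explicitly instead of relying on the scheduler. The following is a minimal, Linux-only sketch: it reads which logical CPUs belong to given NUMA nodes from sysfs and pins the current process to them. The node numbers 0 and 2 are an assumption taken from the die numbering described in the first post; the actual mapping should be checked on the machine. Child processes inherit the affinity, so a solver launched from such a script would stay on those cores.

```python
import os


def parse_cpulist(text):
    """Parse a Linux cpulist string such as "0-7,32-39" into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def cpus_of_nodes(nodes, sysfs="/sys/devices/system/node"):
    """Collect the logical CPUs of the given NUMA nodes from sysfs."""
    cpus = set()
    for node in nodes:
        with open(f"{sysfs}/node{node}/cpulist") as handle:
            cpus |= parse_cpulist(handle.read())
    return cpus


if __name__ == "__main__":
    # Assumption: nodes 0 and 2 are the dies with directly attached memory.
    MEMORY_NODES = [0, 2]
    if hasattr(os, "sched_setaffinity"):  # only available on Linux
        try:
            os.sched_setaffinity(0, cpus_of_nodes(MEMORY_NODES))
            print("Pinned to CPUs:", sorted(os.sched_getaffinity(0)))
        except OSError as exc:  # missing NUMA node, or CPUs not permitted
            print("Could not set affinity:", exc)
    else:
        print("os.sched_setaffinity is not available on this platform")
```

With SMT enabled, the CPU list of a node also contains the sibling threads, so turning SMT off or filtering the list may be needed to get one process per physical core.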
||
December 23, 2018, 10:26 |
|
#5 | |
New Member
Guangyu Zhu
Join Date: May 2013
Posts: 12
Rep Power: 13 |
I turned off SMT and ran some simple benchmarks in SimVascular. The sweet spot of the 2990WX for a 5M-element model is 16 cores; adding more cores doesn't help. The simulation of this model takes around 10GB of RAM. With 16 cores, it took around 450s to finish the simulation on the AMD machine, while the Intel one took only 180s. With 22 cores on the Intel platform, it took only around 100s... As the 2990WX platform is mainly for machine learning tasks, I'm considering a dual Epyc 7301 system. The OpenFOAM benchmarks of this CPU are very impressive. The dual Intel system is too expensive: my current dual 2696v4 machine cost me more than 6000 USD, even though the CPUs are second-hand. Thank you very much! |
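To put the timings above in relative terms, here is a small sketch of the arithmetic. The three run times are the approximate values quoted in the post; the function and variable names are only illustrative.

```python
def speedup(baseline_s, other_s):
    """Return how many times faster `other_s` is than `baseline_s`."""
    return baseline_s / other_s


# Approximate wall-clock times from the post (5M-element model, SMT off).
amd_16_cores_s = 450.0
intel_16_cores_s = 180.0
intel_22_cores_s = 100.0

same_cores = speedup(amd_16_cores_s, intel_16_cores_s)
best_case = speedup(amd_16_cores_s, intel_22_cores_s)

# Scaling of the Intel machine from 16 to 22 cores, relative to ideal scaling.
intel_scaling = speedup(intel_16_cores_s, intel_22_cores_s)
intel_efficiency = intel_scaling / (22 / 16)

print(f"Intel vs AMD at 16 cores:       {same_cores:.1f}x")
print(f"Intel 22 cores vs AMD 16 cores: {best_case:.1f}x")
print(f"Intel 16 -> 22 cores:           {intel_scaling:.1f}x "
      f"(efficiency {intel_efficiency:.2f})")
```

At equal core counts the Intel machine comes out 2.5 times faster, and 4.5 times faster when it uses 22 cores. The efficiency above 1 for the 16 to 22 core step suggests that the quoted timings are rough, so these ratios are indicative only.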
||
March 8, 2019, 23:30 |
|
#6 |
Member
Join Date: Nov 2011
Location: Czech Republic
Posts: 97
Rep Power: 14 |
For Windows, it might be worth trying: https://youtu.be/M2LOMTpCtLA
|
|
Tags |
2990wx, amd, cfd |
|
|