2990wx falls far behind the dual way E5-2696V4 system in CFD simulations

#1   December 21, 2018, 21:31
Guangyu Zhu (bravebear), New Member

The AMD platform is:
CPU: 2990WX (32 cores (64 threads))
RAM: 64GB of DDR4-2400 (16GB*4)
SSD: INTEL 960P 512GB (NVME)
GPU: GTX 1080Ti

The Intel platform is:
CPU: E5-2696V4*2 (22 cores (44 threads) per CPU)
RAM: 128GB of DDR4-2400 RECC (16GB*8)
SSD: Samsung 860 EVO (SATA)
GPU: GTX 1060Ti

Both platforms run Windows 10 Pro.

The 2990WX has 4 dies in the CPU package. Two of them (die 0 and die 2) connect to the RAM directly, and the other two access the RAM through die 0 and die 2 respectively. The PCWorld review indicated that the per-core bandwidth of the 2990WX is only 2 GB/s when all cores are used, so a noticeable memory-access penalty would be expected in this situation. The core-to-core bandwidth when using 2 dies (16 cores) is 5 GB/s.

In my case, I used 16 cores (32 threads) to solve CFD cases of 3M and 5M elements in SimVascular on both platforms. The Intel platform is almost 3 times faster than the AMD platform. I also tried running the simulations under Ubuntu 18.04 on the AMD platform, but it still falls far behind the Intel one.

What would help to improve the performance of the 2990WX platform? Would higher-frequency RAM (DDR4-2666 or DDR4-3600) give a boost? Or should I populate all 8 DIMM slots?

Suggestions are appreciated!

-----------------
Link to PCWorld's review:
https://www.pcworld.com/article/3298...rformance.html

#2   December 22, 2018, 09:14
Robert (RobertB), Senior Member

The Intel system has at least twice the memory bandwidth of the AMD system, and all of its CPUs are directly linked to the memory.
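
As a rough check of the "twice the bandwidth" figure, theoretical peak DRAM bandwidth can be estimated as channels x transfer rate x 8 bytes. This is a minimal sketch, not a measurement. The channel counts are assumptions taken from the public CPU specifications (4 channels on the 2990WX, 4 per socket on the E5-2696 v4), and it assumes the 8 DIMMs in the Intel machine populate all 8 channels.

```python
# Theoretical peak DRAM bandwidth: channels x transfer rate (MT/s) x 8 bytes per transfer.
# Channel counts are assumptions (public specs), not values stated in the thread.

def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Return theoretical peak bandwidth in GB/s (10^9 bytes per second)."""
    return channels * mega_transfers_per_s * bus_bytes / 1000.0

# Both machines run DDR4-2400 according to the first post.
amd_total = peak_bandwidth_gbs(channels=4, mega_transfers_per_s=2400)
intel_total = peak_bandwidth_gbs(channels=8, mega_transfers_per_s=2400)

print(f"2990WX:        {amd_total:.1f} GB/s total, {amd_total / 32:.1f} GB/s per core (32 cores)")
print(f"2x E5-2696 v4: {intel_total:.1f} GB/s total, {intel_total / 44:.1f} GB/s per core (44 cores)")
```

Under these assumptions the dual Xeon has exactly twice the theoretical peak, and the 2990WX ends up at a little over 2 GB/s per core when all 32 cores are busy, which is in line with the per-core figure quoted in the first post. Sustained bandwidth in practice is lower than these peak numbers.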

It is unclear why you expect a sub-$2000 processor to be competitive with 2x $4000 processors.

For CFD you should be using an Epyc processor as it has a much higher bandwidth memory system and a better internal architecture to use that system. Threadrippers are for high CPU/low memory tasks and CFD isn't one of those.

In terms of what you can do:

Turn off hyperthreading (SMT), as it is rarely helpful in CFD applications.

Set core affinity for the processes so that the code actually uses the cores attached to the memory and the threads use the cache more efficiently.

On the Intel system, if the BIOS has a cache mode, you can try changing that. AnandTech found 'local', IIRC, was 20% better on OpenFOAM.
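
The core-affinity suggestion can be sketched for Linux (the original poster also tried Ubuntu 18.04) as follows. This is an illustrative sketch, not something from the thread: it reads the kernel's NUMA information from sysfs to find the nodes that have local memory and restricts the current process to their CPUs. The helper names `parse_cpulist` and `cpus_on_memory_nodes` are made up for this example.

```python
import os

def parse_cpulist(text: str) -> set[int]:
    """Parse a Linux cpulist string such as '0-7,32-39' into a set of ids."""
    ids: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            ids.update(range(int(low), int(high) + 1))
        else:
            ids.add(int(part))
    return ids

def cpus_on_memory_nodes(sysfs_root: str = "/sys/devices/system/node") -> set[int]:
    """Collect the CPUs of every NUMA node that has local memory."""
    with open(os.path.join(sysfs_root, "has_memory")) as f:
        nodes = parse_cpulist(f.read())  # the node list uses the same range syntax
    cpus: set[int] = set()
    for node in nodes:
        with open(os.path.join(sysfs_root, f"node{node}", "cpulist")) as f:
            cpus |= parse_cpulist(f.read())
    return cpus

if __name__ == "__main__":
    node_info = "/sys/devices/system/node/has_memory"
    if hasattr(os, "sched_setaffinity") and os.path.exists(node_info):
        # Only keep CPUs this process is already allowed to run on.
        target = cpus_on_memory_nodes() & os.sched_getaffinity(0)
        if target:
            os.sched_setaffinity(0, target)  # 0 = current process; children inherit the mask
            print("pinned to", sorted(target))
```

A solver started from this process inherits the mask. From a shell, `taskset` or `numactl` serve the same purpose, and MPI launchers usually have their own binding options, which are the better choice when each rank should be pinned individually.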

#3   December 22, 2018, 10:16
Alex (flotus1), Super Moderator

First things first: even when configured correctly the TR2990WX will be much slower than the dual-Intel system in parallel CFD.

What you need to change to get better results:
  • disable SMT
  • disable the 2 dies that have no direct memory path
  • -OR- pin your CFD code to the 16 cores on dies with a memory controller
  • tweak memory speed and timings.
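
The first three points above amount to choosing 16 logical CPUs: one per physical core, and only on the dies with a memory controller. A minimal sketch of that selection is shown below. The CPU numbering used here (8 cores per node, logical CPU n paired with n + 32, memory on nodes 0 and 2) is a hypothetical layout for illustration only; the real numbering should be checked on the machine, for example with `lscpu` on Linux.

```python
def pick_cpus(node_cpus: dict[int, list[int]],
              memory_nodes: set[int],
              siblings: dict[int, tuple[int, ...]]) -> list[int]:
    """Return one logical CPU per physical core, restricted to nodes with local memory."""
    candidates = {cpu for node in memory_nodes for cpu in node_cpus[node]}
    # Keep only the lowest-numbered sibling of each core, so SMT twins are skipped.
    return sorted(cpu for cpu in candidates if cpu == min(siblings[cpu]))

# Hypothetical 4-node layout: 8 cores per node, logical CPU n paired with n + 32.
node_cpus = {
    n: list(range(8 * n, 8 * n + 8)) + list(range(32 + 8 * n, 40 + 8 * n))
    for n in range(4)
}
siblings = {cpu: (cpu % 32, cpu % 32 + 32) for cpu in range(64)}

# Assumption: nodes 0 and 2 are the dies with a memory controller.
selected = pick_cpus(node_cpus, memory_nodes={0, 2}, siblings=siblings)
print(len(selected), selected)
```

The resulting list is what would be passed to an affinity mechanism. Disabling SMT and the two memory-less dies in the BIOS achieves the same effect without any pinning.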

It has been mentioned quite a few times on this forum: The TR2990WX is a pretty sub-optimal CPU especially for CFD and similar workloads. So I kind of hope that you bought it mainly for a different kind of application.

Quote:
Will there be a boost improved performance if I use high-frequency RAM (DDR4-2666 OR 3600)? Or I should insert all the 8 DIMMs with ram?
Faster memory definitely helps. Filling all 8 slots, on the other hand, will only decrease the maximum memory frequency you can reach and thus limit performance.
If this machine is mainly for CFD, here is what I would recommend: sell the CPU and RAM and get a TR2950X instead, with some really fast RAM certified to run with TR CPUs. 4x16GB DDR4-3200 at the very least. Or, if 5M cells is a typical problem size for you, 4x8GB might be enough.

#4   December 23, 2018, 10:11
Guangyu Zhu (bravebear), New Member

Hi, Robert

Thanks for the suggestions! The AMD platform was initially built for some machine learning tasks. Occasionally I use it for CFD analysis, but I didn't realize the memory bandwidth's role until I read the posts in this forum.

I tracked core utilization in Windows 10, and the latest update seems to have optimized the task distribution: when using 16 cores, the cores on the dies that have a direct RAM link were used. However, as you mentioned, the performance is still very poor compared with the dual Xeon system.

#5   December 23, 2018, 10:26
Guangyu Zhu (bravebear), New Member

Hi, Alex

I turned off SMT and ran some simple benchmarks in SimVascular. The sweet spot of the 2990WX for a 5M-element model is 16 cores; adding more cores doesn't help. Simulations of this model take around 10 GB of RAM. With 16 cores, the simulation took around 450 s on the AMD machine, while the Intel one took only 180 s. Using 22 cores on the Intel platform, it took only around 100 s.
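
For reference, the ratios implied by these timings can be worked out as below. This is plain arithmetic on the approximate numbers quoted above; the variable names are made up for this example.

```python
# Approximate wall-clock times in seconds, as quoted in the post above.
amd_16 = 450.0    # 2990WX, 16 cores
intel_16 = 180.0  # dual E5-2696 v4, 16 cores
intel_22 = 100.0  # dual E5-2696 v4, 22 cores

def speedup(slow: float, fast: float) -> float:
    """Return how many times faster the 'fast' run is than the 'slow' run."""
    return slow / fast

print(f"Intel vs AMD, 16 cores each: {speedup(amd_16, intel_16):.1f}x")
print(f"Intel 22 cores vs AMD 16:    {speedup(amd_16, intel_22):.1f}x")
print(f"Intel 16 -> 22 cores:        {speedup(intel_16, intel_22):.1f}x for {22 / 16:.2f}x the cores")
```

That is 2.5x at equal core count and 4.5x when the Intel system uses 22 cores. Since the timings are only approximate ("around"), the step from 16 to 22 cores on the Intel side should not be over-interpreted.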

As the 2990WX platform is mainly for machine learning tasks, I'm considering a dual Epyc 7301 system. The OpenFOAM benchmarks of this CPU are very impressive. The dual Intel system is too expensive: my current dual 2696 v4 machine cost me more than 6000 USD, even though the CPUs are second-hand.

Thank you very much!

#6   March 8, 2019, 23:30
Sixkillers, Member

On Windows, it might be worth trying: https://youtu.be/M2LOMTpCtLA