OpenFOAM benchmarks on various hardware

Old   May 29, 2020, 07:08
Default
  #281
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 427
Rep Power: 12
Simbelmynė is on a distinguished road
Quote:
Originally Posted by blackcatxiii View Post
Hi everyone

I have collected the benchmark data posted in this thread in an Excel spreadsheet and plotted out walltime as well as speed up for comparison.

I hope it is helpful for those who plan to build a new PC for OpenFOAM.

Nice, thank you. It would also be interesting to have a column for memory speed and rank, but I guess that is out of the question right now seeing that the work to do so is quite massive.



What interests me most here is the extremely impressive result of the Ryzen 3900X. It matches the Threadripper 1950X (and similar 4-channel setups) while using only half the memory channels. It is very clear that 3rd-generation Ryzen benefits immensely from tight memory timings.

Old   June 8, 2020, 10:41
Default
  #282
New Member
 
Join Date: Apr 2020
Posts: 2
Rep Power: 0
pred is on a distinguished road
HPE DL385 Gen10 Plus, 2*EPYC 7542 (32 cores each), 16*32GB 3200MHz memory
Nothing optimized - just set the BIOS to HPC and installed CentOS 8 with OpenFOAM 5.0

# cores Wall time (s):
------------------------
1 600.7
2 348.89
4 152.27
6 101.78
8 76.09
12 53.49
16 40.26
20 36.63
24 31.95
28 29.98
32 27
36 26.45
40 25.85
44 25.09
48 23.27
52 23.47
56 23.46
60 22.5
64 21.96
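
A quick way to read a table like this is to turn wall times into speedup and parallel efficiency. The following is a minimal Python sketch, not part of the benchmark itself; the names `wall_time`, `speedup` and `efficiency` are made up for illustration, and only a subset of the rows above is copied in.

```python
# Wall times in seconds, copied from the table above (subset): cores -> time.
wall_time = {1: 600.7, 2: 348.89, 4: 152.27, 8: 76.09, 16: 40.26, 32: 27.0, 64: 21.96}


def speedup(times, cores):
    """Speedup relative to the single-core run."""
    return times[1] / times[cores]


def efficiency(times, cores):
    """Parallel efficiency: speedup divided by the number of cores."""
    return speedup(times, cores) / cores


for n in sorted(wall_time):
    print(f"{n:>3} cores: speedup {speedup(wall_time, n):6.2f}, "
          f"efficiency {efficiency(wall_time, n):6.1%}")
```

With these numbers the 64-core run comes out at roughly 27x the single-core speed, i.e. about 43% parallel efficiency, while 16 cores are still above 90%.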
sida likes this.

Old   June 9, 2020, 03:16
Default AMD Epyc 7542 256gb Ram
  #283
New Member
 
Giovanni Medici
Join Date: Mar 2014
Posts: 25
Rep Power: 8
giovanni.medici is on a distinguished road
Hi there,
we ran the benchmark on a setup similar to @pred's and found very similar results:
  • 2x AMD EPYC 7542
  • PowerEdge R6525
  • 256 GB RAM, 3200 MHz

Ubuntu server

Code:
# cores   Wall time (s):
------------------------
1 724.16
2 346.29
4 165.72
6 107.43
8 82.14
12 55.02
16 41.32
20 37.03
24 33.5
32 26.79
48 22.99
64 21.5

Old   June 9, 2020, 14:06
Default Intel Xeon Gold 6140
  #284
New Member
 
Federico Zabaleta
Join Date: May 2016
Posts: 19
Rep Power: 6
fedez91 is on a distinguished road
2*Intel Xeon Gold 6140
96 GB (12*8 GB) 2666 MHz


Code:
#cores   Wall time (s):
------------------------
1            981.72
2            488.92
4            217.97
6            146.36
8            113.38
12           85.34
16           68.23
20           60.27
24           55.94
28           52.5
32           50.76
36           49.87
Improvement seems to become insignificant after 24 cores... any idea why this may happen?

Last edited by fedez91; June 9, 2020 at 15:35.

Old   June 9, 2020, 16:17
Default
  #285
Senior Member
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 2,503
Rep Power: 35
flotus1 will become famous soon enough
Quote:
Improvement seems to become insignificant after 24 cores... any idea why this may happen?
Same old story: running out of memory bandwidth.
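
The flattening can be made visible by computing how much wall time each step up in core count still saves. Below is a small Python sketch using the Xeon Gold 6140 numbers posted above; `xeon_6140` and `marginal_gains` are illustrative names, not part of any benchmark tooling.

```python
# (cores, wall time in s) for the 2x Xeon Gold 6140 run posted above.
xeon_6140 = [
    (1, 981.72), (2, 488.92), (4, 217.97), (6, 146.36),
    (8, 113.38), (12, 85.34), (16, 68.23), (20, 60.27),
    (24, 55.94), (28, 52.5), (32, 50.76), (36, 49.87),
]


def marginal_gains(rows):
    """Return (cores_from, cores_to, percent wall-time reduction) per step."""
    gains = []
    for (n0, t0), (n1, t1) in zip(rows, rows[1:]):
        gains.append((n0, n1, 100.0 * (t0 - t1) / t0))
    return gains


for n0, n1, pct in marginal_gains(xeon_6140):
    print(f"{n0:>2} -> {n1:>2} cores: {pct:5.1f}% less wall time")
```

Going from 24 to 28 cores still saves about 6%, but going from 32 to 36 cores saves less than 2%, which is consistent with a run that is limited by memory bandwidth rather than by core count.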
__________________
Please do not send me CFD-related questions via PM

Old   June 9, 2020, 18:05
Default
  #286
New Member
 
Federico Zabaleta
Join Date: May 2016
Posts: 19
Rep Power: 6
fedez91 is on a distinguished road
Quote:
Originally Posted by flotus1 View Post
Same old story: running out of memory bandwidth.
Is there any way of improving this, or does it just mean that I have 12 extra cores that are basically useless? Or would this change with bigger meshes? I apologise if my questions are naive, but I do not really understand the details of how a CPU works.

Thank you for your help!

Last edited by fedez91; June 9, 2020 at 19:20.

Old   June 9, 2020, 20:20
Default
  #287
Senior Member
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 2,503
Rep Power: 35
flotus1 will become famous soon enough
There is not much you can do about it once you have purchased the hardware. Higher cell counts won't improve this behavior. It is a function of code balance, which does not change drastically with cell count.
With Intel Xeon CPUs, you can try to enable "cluster on die" mode in the BIOS. It might also be called "sub-NUMA cluster" with this generation. That should improve latency and bandwidth a bit for NUMA-aware software like OpenFOAM. Don't expect huge improvements though. https://en.wikichip.org/wiki/intel/m...UMA_Clustering
It might also be a good idea to check whether the DIMMs are populated correctly, so that each memory channel of each CPU has one DIMM. You should be able to look up the correct layout in your server/workstation/motherboard manual and then compare it to what you have.
At least the additional cores are not entirely wasted; there is still a small speedup. And contrary to commercial CFD solvers, you don't have to buy additional licenses to use them.
fedez91 and sida like this.
__________________
Please do not send me CFD-related questions via PM

Old   June 10, 2020, 10:31
Default help with workstation configuration
  #288
New Member
 
sida
Join Date: Dec 2019
Posts: 6
Rep Power: 2
sida is on a distinguished road
I feel lucky to have read this thread before making my decision.

With a $2500 budget (which I was previously going to spend on a Threadripper 3970X and its cooling system), which alternative CPUs do you suggest?

I can't decide between one AMD EPYC 7452 and 2x EPYC 7302.

Thanks in advance

Last edited by sida; June 10, 2020 at 16:05.

Old   June 10, 2020, 12:36
Default
  #289
New Member
 
Federico Zabaleta
Join Date: May 2016
Posts: 19
Rep Power: 6
fedez91 is on a distinguished road
Thanks flotus1. I will enable 'cluster on die' and re-run the test. I will post the results if I see improvements. Thanks again!

Old   June 15, 2020, 00:25
Default
  #290
Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 36
Rep Power: 8
wkernkamp is on a distinguished road
Quote:
Originally Posted by sida View Post
I feel lucky to have read this thread before making my decision.

With a $2500 budget (which I was previously going to spend on a Threadripper 3970X and its cooling system), which alternative CPUs do you suggest?

I can't decide between one AMD EPYC 7452 and 2x EPYC 7302.

Thanks in advance
The 2x 7302 will be faster, because it has twice the memory channels of the single 7452.
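
As a rough illustration of the memory-channel argument, theoretical peak bandwidth can be estimated as sockets x channels x transfer rate x 8 bytes. The sketch below assumes 8 DDR4 channels per EPYC Rome socket, DDR4-3200 on every channel, and 64-bit (8-byte) channels; sustained bandwidth in practice is lower, and the function and dictionary names are hypothetical.

```python
def peak_bandwidth_gbs(sockets, channels_per_socket=8, mts=3200, bytes_per_transfer=8):
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes)."""
    return sockets * channels_per_socket * mts * bytes_per_transfer / 1000.0


# Both options have 32 cores in total: one 32-core 7452 or two 16-core 7302.
configs = {
    "1x EPYC 7452": {"sockets": 1, "cores": 32},
    "2x EPYC 7302": {"sockets": 2, "cores": 32},
}

for name, cfg in configs.items():
    bw = peak_bandwidth_gbs(cfg["sockets"])
    print(f"{name}: {bw:.1f} GB/s peak, {bw / cfg['cores']:.1f} GB/s per core")
```

Under these assumptions the dual-socket option has twice the peak bandwidth for the same total core count, so each core gets twice the share.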
sida likes this.

Old   June 19, 2020, 08:34
Default
  #291
Senior Member
 
linnemann's Avatar
 
Niels Nielsen
Join Date: Mar 2009
Location: NJ - Denmark
Posts: 523
Rep Power: 23
linnemann will become famous soon enough
Tested on 2x EPYC Rome 7302, 256 GB RAM (32x8 GB @ 2933)
No core binding or trickery.

With AOCC 2.1/GCC 9.2.0 and OpenFOAM 19.12

Code:
# cores   Wall time (s)
          AOCC 2.1.0       GCC 9.2.0
          -march=znver2    -march=znver2    Diff %
1         693.3            692.5             0%
2         470.3            470.88            0%
4         167.2            164.52           -2%
8         78.5             77.16            -2%
12        59.5             60.26             1%
16        42.3             41.79            -1%
20        41.2             41.07             0%
24        33.3             33.59             1%
28        34.0             32.36            -5%
32        28.2             27.95            -1%
So no real benefit going for AOCC/Clang
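
For reference, the Diff % column is consistent with the relative difference of the GCC time against the AOCC time. Here is a short Python check with the values copied from the table above; `rows` and `diff_percent` are names chosen for this sketch.

```python
# (cores, AOCC wall time in s, GCC wall time in s), copied from the table above.
rows = [
    (1, 693.3, 692.5), (2, 470.3, 470.88), (4, 167.2, 164.52),
    (8, 78.5, 77.16), (12, 59.5, 60.26), (16, 42.3, 41.79),
    (20, 41.2, 41.07), (24, 33.3, 33.59), (28, 34.0, 32.36),
    (32, 28.2, 27.95),
]


def diff_percent(aocc, gcc):
    """Relative difference of GCC against AOCC in percent (negative: GCC faster)."""
    return 100.0 * (gcc - aocc) / aocc


for cores, aocc, gcc in rows:
    print(f"{cores:>2} cores: {diff_percent(aocc, gcc):+.1f}%")
```

The largest gap is just under 5% at 28 cores, in favour of GCC; every other row is within about 2%.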
__________________
Linnemann

PS. I do not do personal support, so please post in the forums.

Last edited by linnemann; June 24, 2020 at 05:58.

Old   June 26, 2020, 17:29
Default
  #292
New Member
 
FW
Join Date: Mar 2018
Posts: 5
Rep Power: 4
Fabian2602 is on a distinguished road
Results on a Ryzen 7 3700X with overclocked Memory (3800 - CL 16).

Code:
# cores Wall time (s)
1 857.26
2 380.49
4 253.1
6 219.94
8 212.16

Old   June 26, 2020, 20:11
Default
  #293
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 427
Rep Power: 12
Simbelmynė is on a distinguished road
Quote:
Originally Posted by Fabian2602 View Post
Results on a Ryzen 7 3700X with overclocked Memory (3800 - CL 16).

Code:
# cores Wall time (s)
1 857.26
2 380.49
4 253.1
6 219.94
8 212.16

I think you can do much better. Have you tried Ryzen DRAM calculator? I only managed 3600 CL16 @ 1:1 infinity fabric, but with the secondary and tertiary timings set properly I got around 170 s in this test on my Ryzen 3700X system.
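
For context on what "3600 CL16 @ 1:1" means in numbers: the memory clock is half the DDR data rate, a 1:1 setting runs the Infinity Fabric clock at that same frequency, and the CAS latency in nanoseconds is CL divided by the memory clock. Below is a small Python sketch of that arithmetic; the function names are illustrative, and secondary and tertiary timings are not modelled.

```python
def memory_clock_mhz(data_rate_mts):
    """DDR transfers twice per clock, so the memory clock is half the data rate."""
    return data_rate_mts / 2.0


def cas_latency_ns(data_rate_mts, cl):
    """CAS latency in nanoseconds for a given data rate (MT/s) and CL value."""
    return cl / memory_clock_mhz(data_rate_mts) * 1000.0


for rate in (3600, 3800):
    print(f"DDR4-{rate} CL16: memory clock {memory_clock_mhz(rate):.0f} MHz "
          f"(same fabric clock at 1:1), CAS latency {cas_latency_ns(rate, 16):.2f} ns")
```

The CAS latency differs by less than half a nanosecond between the two settings, so the primary timing alone does not explain the gap between 212 s and around 170 s.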

Old   June 26, 2020, 23:59
Default
  #294
New Member
 
FW
Join Date: Mar 2018
Posts: 5
Rep Power: 4
Fabian2602 is on a distinguished road
Okay, that's a huge difference. I took the values from a thread on the ComputerBase forum. The Infinity Fabric is at 1:1 for me as well. I will check today.

Old   June 27, 2020, 04:15
Default
  #295
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 427
Rep Power: 12
Simbelmynė is on a distinguished road
You can download the dram calculator here.


There are so many settings that it is better if you just try the calculator yourself rather than me posting them all here. Good luck!

Old   July 11, 2020, 09:03
Default
  #296
New Member
 
Francisco
Join Date: Sep 2018
Location: Portugal
Posts: 14
Rep Power: 3
ships26 is on a distinguished road
For anyone considering an older build, I ran this benchmark with 2 x E5645. Mind you, it is OpenFOAM 4.1 on Debian Jessie (kernel 3.16):

Code:
# cores   Wall time (s):
------------------------
1 1546.8
2 859.93
4 414.82
8 309.71
10 297.45
12 295.6
I'm not able to check the memory specs at the moment, but I'll edit this post if I manage to.
My guess is that the triple-channel memory might be holding it back from better scaling at 10-12 threads.

Last edited by ships26; July 19, 2020 at 18:19. Reason: There are two E5645s, not one.

Old   July 11, 2020, 09:06
Default
  #297
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 427
Rep Power: 12
Simbelmynė is on a distinguished road
Quote:
Originally Posted by ships26 View Post
For anyone considering an older build, I ran this benchmark on an E5645. Mind you, it is OpenFOAM 4.1 on Debian Jessie (kernel 3.16):

Code:
# cores   Wall time (s):
------------------------
1 1546.8
2 859.93
4 414.82
8 309.71
10 297.45
12 295.6
I'm not able to check the memory specs at the moment, but I'll edit this post if I manage to.
My guess is that the triple-channel memory might be holding it back from better scaling at 10-12 threads.

If you have sudo rights then you might be able to find out using:



Code:
dmidecode -t 17
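
If the command does run, its output can be summarised with a few lines of Python. This is a sketch only: the `SAMPLE` text below is a made-up excerpt in the usual `dmidecode -t 17` layout (one "Memory Device" block per slot with `Size`, `Locator` and `Speed` fields), and the helper names are hypothetical. Real output has many more fields and varies between dmidecode versions.

```python
# Hypothetical excerpt; real dmidecode output contains many more fields.
SAMPLE = """\
Handle 0x0040, DMI type 17, 34 bytes
Memory Device
    Size: 8192 MB
    Locator: DIMM_A1
    Speed: 1333 MHz

Handle 0x0041, DMI type 17, 34 bytes
Memory Device
    Size: No Module Installed
    Locator: DIMM_A2
    Speed: Unknown
"""


def parse_memory_devices(text):
    """Split the text into one dict of 'Key: value' fields per Memory Device block."""
    devices = []
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "Memory Device":
            current = {}
            devices.append(current)
        elif not stripped:
            current = None  # a blank line ends the current block
        elif current is not None and ": " in stripped:
            key, value = stripped.split(": ", 1)
            current[key] = value
    return devices


def populated(devices):
    """Keep only the slots that report an installed module."""
    return [d for d in devices if d.get("Size", "").lower() != "no module installed"]


for dev in populated(parse_memory_devices(SAMPLE)):
    print(dev["Locator"], dev["Size"], dev["Speed"])
```

In practice the text would come from running the command with root rights and passing its output to `parse_memory_devices`; comparing the populated locators against the board manual shows whether every channel has a DIMM.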

Old   July 11, 2020, 09:11
Default
  #298
New Member
 
Francisco
Join Date: Sep 2018
Location: Portugal
Posts: 14
Rep Power: 3
ships26 is on a distinguished road
Unfortunately I don't.
I tried it without sudo, and it only gives me the info that there's 50GB, nothing about frequencies. I can try to find a way around it, though.

Last edited by ships26; July 11, 2020 at 09:13. Reason: grammar

Old   July 14, 2020, 15:44
Default Doing as well as you can ships26
  #299
Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 36
Rep Power: 8
wkernkamp is on a distinguished road
Quote:
Originally Posted by ships26 View Post
For anyone considering an older build, I ran this benchmark on an E5645. Mind you, it is OpenFOAM 4.1 on Debian Jessie (kernel 3.16):

Code:
# cores   Wall time (s):
------------------------
1 1546.8
2 859.93
4 414.82
8 309.71
10 297.45
12 295.6
I'm not able to check the memory specs at the moment, but I'll edit this post if I manage to.
My guess is that the triple-channel memory might be holding it back from better scaling at 10-12 threads.

My results with the faster X5670 processors:

2x X5675, 3.07 GHz, 6 cores per CPU

Meshing times:
# cores   Wall time (s)
1         1998.08
2         1313.22
4         719.71
6         558.17
8         466.22
12        449.43

Flow calculation:
# cores   Wall time (s)
1         1322.84
2         787.4
4         375.77
6         305.44
8         286.3
12        278.02


Looks to me like you are doing about as well as you can with the setup.
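
One way to put numbers on this is to divide the E5645 wall times by the X5675 flow-calculation times at matching core counts. Below is a minimal Python sketch with the values from the two posts above; it assumes the E5645 figures are flow-calculation times as well, and the variable names are illustrative.

```python
# Wall times in seconds: cores -> time, copied from the two posts above.
e5645 = {1: 1546.8, 2: 859.93, 4: 414.82, 8: 309.71, 10: 297.45, 12: 295.6}
x5675 = {1: 1322.84, 2: 787.4, 4: 375.77, 6: 305.44, 8: 286.3, 12: 278.02}

# Compare only the core counts that both machines were run at.
ratio = {n: e5645[n] / x5675[n] for n in sorted(e5645) if n in x5675}

for n, r in ratio.items():
    print(f"{n:>2} cores: E5645 / X5675 = {r:.2f}")
```

The gap shrinks from about 17% on one core to about 6% on 12 cores, so the higher clock speed matters less and less as the run fills the machine.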
ships26 likes this.

Old   July 15, 2020, 18:25
Default
  #300
New Member
 
Francisco
Join Date: Sep 2018
Location: Portugal
Posts: 14
Rep Power: 3
ships26 is on a distinguished road
Thank you for sharing your results, wkernkamp! It's always great to have a point of comparison.



Did you stick to a single CPU with hyperthreading when running the benchmark with 12 threads?
