OpenFOAM benchmarks on various hardware

Old   March 24, 2020, 12:28
Default
  #261
Member
 
Kailee
Join Date: Dec 2019
Posts: 35
Rep Power: 6
Kailee71 is on a distinguished road
Quote:
Originally Posted by kstuart View Post
Dell R820, 4x E5-4640 2.6 GHz, 16x 4 GB PC3-12800. This was my first run; I'm hoping it can go better with some work. This cost me $520 shipped.

# cores Wall time (s):
------------------------
<SNIP>

16 85.26
18 85.96
20 77.75
22 78.91
24 71.71
26 75.19
28 69.97
32 72.9
Wow, this seems like excellent value for money. Quick question though: according to Wikipedia, the E5-4640 (SR0JK) has 8 cores @ 2.4 GHz and the v2 (SR19R) has 10 cores @ 2.2 GHz. Which are you using?

Cheers,

Kai.

Old   March 27, 2020, 14:24
Default
  #262
Senior Member
 
Josh McCraney
Join Date: Jun 2018
Posts: 220
Rep Power: 8
joshmccraney is on a distinguished road
Ubuntu 18.04, OF6. EPYC 7281, 8 slots of MEM-DR416L-CL07-ER26 (16 GB DDR4-2666 RDIMM server memory).

16 cores: 90.1 seconds, averaged over 3 runs.

Old   March 30, 2020, 03:45
Default
  #263
New Member
 
Kurt Stuart
Join Date: Feb 2020
Location: Southern illinois
Posts: 19
Rep Power: 6
kstuart is on a distinguished road
Quote:
Originally Posted by Kailee71 View Post
Wow, this seems like excellent value for money. Quick question though: according to Wikipedia, the E5-4640 (SR0JK) has 8 cores @ 2.4 GHz and the v2 (SR19R) has 10 cores @ 2.2 GHz. Which are you using?

Cheers,

Kai.



This is the v1, the 2.4 GHz part. It runs in turbo up to 2.6 GHz. I am pretty happy with it. Of course, a week after I got it set up and started running some Forte cases, I found out the school has a 400-core cluster of v3 and v4 parts.

Old   April 2, 2020, 19:06
Default ryzen 9 3950x
  #264
Member
 
Join Date: Sep 2013
Posts: 46
Rep Power: 12
ma-tri-x is on a distinguished road
Hi!


Ryzen 9 3950X, 16 cores (32 threads) x 4.2 GHz, 2x 16 GB DDR4-3200, Ubuntu 18.04 LTS, M.2 SSD with at least 1900 MB/s direct read/write

Cost: ~1800 €

Memory bandwidth seems saturated after ~8 threads.



Code:
# cores   Wall time (s):
------------------------
1 649.56
2 355.39
4 219.56
6 198.86
8 190.2
12 189.75
16 190.12
20 191.28
24 194.41
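
To put a number on where the scaling flattens, here is a minimal Python sketch using the wall times from the table above. The function name `saturation_point` and the 2 % threshold are assumptions chosen for illustration; they are not part of the benchmark script.

```python
# Wall times (s) from the Ryzen 9 3950X table above (OF-7 run).
wall_times = {1: 649.56, 2: 355.39, 4: 219.56, 6: 198.86, 8: 190.2,
              12: 189.75, 16: 190.12, 20: 191.28, 24: 194.41}

def saturation_point(times, min_gain=0.02):
    """Return the first core count after which the next entry in the table
    improves the wall time by less than min_gain (assumed 2 % threshold)."""
    cores = sorted(times)
    for prev, nxt in zip(cores, cores[1:]):
        gain = (times[prev] - times[nxt]) / times[prev]
        if gain < min_gain:
            return prev
    return cores[-1]

for n in sorted(wall_times):
    print(f"{n:>2} cores: speedup {wall_times[1] / wall_times[n]:.2f}")
print("scaling flattens after", saturation_point(wall_times), "cores")
```

With this threshold the sketch picks 8 cores, where the speedup is only about 3.4. A stricter threshold (for example 5 %) would already stop at 6 cores.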
---------- edit:


Hi! Since I struggled a lot to make the sources run on a foam-extend-4.0 build, I created a case that should basically work on every OpenFOAM or foam-extend build:
https://github.com/ma-tri-x/setup_ubuntu


Also, when I ran this case, the values above changed to:
# cores Wall time (s):
#------------------------
1 774.7
2 457.09
4 258.68
6 229.71
8 212.6
12 208.46
16 208.63
20 211.86
24 214.47


The top values were obtained with OF-7 precompiled for Ubuntu.
The bottom values were obtained with self-compiled foam-extend-4.0, g++-5.

Last edited by ma-tri-x; April 3, 2020 at 19:15. Reason: values not true anymore, link for base case

Old   April 4, 2020, 13:40
Default diagram
  #265
Member
 
Join Date: Sep 2013
Posts: 46
Rep Power: 12
ma-tri-x is on a distinguished road
Hi everyone!


CPUs behave weirdly. Here's an example where an 8-core i7-9900 seems to beat a 16-core Ryzen when using 30 threads. What's wrong here?


best,
M
https://owncloud.gwdg.de/index.php/s/784OYnGXClzydtJ


edit :----------------------------


This made me run the test case on the Ryzen with up to 112 threads. It works, with massive speedup and no end in sight.

edit: ---------------------------------------


FATAL: Sorry, I was posting too fast. "ExecutionTime" doesn't match the real (wall-clock) execution time when threads > cores + hyperthreads. My timer confirms it: "ClockTime" is the one to go for. I changed the script, and the behaviour is now as expected. No magic happening.
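
For anyone scripting this, a minimal Python sketch of reading both timers from a solver log follows. It assumes the usual solver log line of the form `ExecutionTime = ... s  ClockTime = ... s`. As far as I know, ExecutionTime is the CPU time of the process and ClockTime is the elapsed wall-clock time. The function name `last_timings` and the log excerpt are made up for illustration.

```python
import re

# Matches the timing line that OpenFOAM solvers print after each step, e.g.
# "ExecutionTime = 12.34 s  ClockTime = 13 s"
TIMING = re.compile(r"ExecutionTime = ([\d.]+) s\s+ClockTime = ([\d.]+) s")

def last_timings(log_text):
    """Return (execution_time, clock_time) from the last timing line, or None."""
    matches = TIMING.findall(log_text)
    if not matches:
        return None
    exec_s, clock_s = matches[-1]
    return float(exec_s), float(clock_s)

# Hypothetical log excerpt, invented for illustration (not from a real run).
sample_log = """\
Time = 99
ExecutionTime = 61.2 s  ClockTime = 118 s
Time = 100
ExecutionTime = 61.9 s  ClockTime = 120 s
End
"""

exec_s, clock_s = last_timings(sample_log)
print(f"ExecutionTime = {exec_s} s, ClockTime = {clock_s} s")
```

When the machine is oversubscribed, each process gets only a share of a core, so the two numbers can drift far apart. For a wall-time benchmark, the second value is the one to report.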
nsf likes this.

Last edited by ma-tri-x; April 4, 2020 at 17:35. Reason: falsified

Old   April 13, 2020, 11:46
Default
  #266
New Member
 
Join Date: Apr 2020
Posts: 2
Rep Power: 0
pred is on a distinguished road
#DL380 Gen9: 2* 12 Core Xeon E5-2687W v4 @ 3GHz, 8*32GB 2400MHz Memory (4-Channel)
# cores Wall time (s):
------------------------
1 880.71
2 476.69
4 222.56
6 156.74
8 122.75
12 96.56
16 83.21
20 77.61
24 74.21


#DL380 Gen10: 2* 12 Core Xeon Gold 6146 @3.2GHz 12*16GB 2666MHz Memory (6-Channel)
# cores Wall time (s):
------------------------
1 889.47
2 431.06
4 191.42
6 128.94
8 101.64
12 77.48
16 64.62
20 58.06
24 54.68


Next month we should get our new EPYC-based (2x 7542) DL385 Gen10 Plus servers.
Looking forward to seeing how fast they are compared to the Intel-based ones.
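
As a rough check on how much of the Gen10 advantage lines up with memory bandwidth, here is a small Python sketch using the two tables above. The bandwidth ratio is a theoretical peak estimate from channel count and transfer rate only, which is a simplifying assumption; the helper name `gen10_advantage` is hypothetical.

```python
# Wall times (s) from the two tables above.
gen9 = {1: 880.71, 2: 476.69, 4: 222.56, 6: 156.74, 8: 122.75,
        12: 96.56, 16: 83.21, 20: 77.61, 24: 74.21}
gen10 = {1: 889.47, 2: 431.06, 4: 191.42, 6: 128.94, 8: 101.64,
         12: 77.48, 16: 64.62, 20: 58.06, 24: 54.68}

# Theoretical peak bandwidth scales with channels x transfer rate
# (the bus width per channel is the same on both and cancels out).
bandwidth_ratio = (6 * 2666) / (4 * 2400)

def gen10_advantage(n):
    """Ratio of Gen9 to Gen10 wall time at n cores (>1 means Gen10 is faster)."""
    return gen9[n] / gen10[n]

for n in sorted(gen9):
    print(f"{n:>2} cores: Gen10 is {gen10_advantage(n):.2f}x as fast")
print(f"theoretical memory bandwidth ratio: {bandwidth_ratio:.2f}")
```

On one core the two machines are practically tied, while at 24 cores the Gen10 is roughly 1.36x as fast. That is still below the theoretical bandwidth ratio of about 1.67.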
wkernkamp and superkelle like this.

Old   April 16, 2020, 11:58
Default
  #267
Member
 
alexander thierfelder
Join Date: Dec 2019
Posts: 71
Rep Power: 6
superkelle is on a distinguished road
I run a Ryzen 2700X eight-core CPU @ stock, 32 GB RAM @ 3600 MHz 17-19-19-39, Ubuntu 19.10, OF1912 (but nearly the same results with OF7)
Code:
# cores   Wall time (s):
------------------------
1 823
2 525.13
4 352.57
6 330.33
8 330.59
The results seem a little slow. Does anyone have a suggestion?

Old   April 16, 2020, 12:49
Default
  #268
Super Moderator
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
flotus1 has a spectacular aura about
That's a pretty hefty overclock on the memory. Since single-core results look great, but scaling doesn't, there are a few things I would check
  • CPU throttling when using many threads
  • memory actually running in dual-channel mode
  • memory overclock is applied correctly. Please don't take this the wrong way, but I have seen plenty of people buy fast memory and then leave it at auto settings.
  • tighten up other timings too, with help from Ryzen DRAM calculator. Especially when running high memory speeds with tight primary timings, the secondary and tertiary timings can suffer when left at auto
  • and as usual, make sure there are no other tasks running that eat up resources
joshmccraney and superkelle like this.

Old   April 16, 2020, 15:30
Default
  #269
Member
 
alexander thierfelder
Join Date: Dec 2019
Posts: 71
Rep Power: 6
superkelle is on a distinguished road
Quote:
Originally Posted by flotus1 View Post
That's a pretty hefty overclock on the memory. Since single-core results look great, but scaling doesn't, there are a few things I would check
  • CPU throttling when using many threads
  • memory actually running in dual-channel mode
  • memory overclock is applied correctly. Please don't take this the wrong way, but I have seen plenty of people buy fast memory and then leave it at auto settings.
  • tighten up other timings too, with help from Ryzen DRAM calculator. Especially when running high memory speeds with tight primary timings, the secondary and tertiary timings can suffer when left at auto
  • and as usual, make sure there are no other tasks running that eat up resources
Thank you. Yes, you were right: the memory settings had been reset, and it ran at 2133 MHz.

New results with Ryzen 2700X eight-core CPU @ stock, 32 GB RAM (4x 8 GB in dual channel) @ 3200 MHz 17-19-19-39, Ubuntu 19.10, OF1912
Code:
# cores   Wall time (s):
------------------------
1 719.94
2 428.66
4 270.53
6 238.9
8 232.88
flotus1 likes this.

Old   April 17, 2020, 10:53
Default
  #270
Member
 
alexander thierfelder
Join Date: Dec 2019
Posts: 71
Rep Power: 6
superkelle is on a distinguished road
I wonder how I can use the "virtual cores" that are made available by SMT. In my case they are not used automatically. I also tried the option "--use-hwthread-cpus" for mpirun:

Code:
mpirun --use-hwthread-cpus -np 12 simpleFoam -parallel
I am aware that the results should be even worse, but I am still curious.

Old   April 17, 2020, 11:37
Default
  #271
Super Moderator
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
flotus1 has a spectacular aura about
What seems to be the problem? As soon as you use -np >8 on your 8-core CPU, mpirun should have no other choice than oversubscribing cores with more than one thread.

Old   April 17, 2020, 12:47
Default
  #272
Member
 
alexander thierfelder
Join Date: Dec 2019
Posts: 71
Rep Power: 6
superkelle is on a distinguished road
Quote:
Originally Posted by flotus1 View Post
What seems to be the problem? As soon as you use -np >8 on your 8-core CPU, mpirun should have no other choice than oversubscribing cores with more than one thread.
I get the following error message:

Code:
mpirun -np 12 simpleFoam -parallel | tee log.simpleFoam
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 12 slots
that were requested by the application:
  simpleFoam

Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------

Old   April 17, 2020, 13:17
Default
  #273
Member
 
alexander thierfelder
Join Date: Dec 2019
Posts: 71
Rep Power: 6
superkelle is on a distinguished road
Quote:
Originally Posted by superkelle View Post
I wonder how I can use the "virtual cores" that are made available by SMT. In my case they are not used automatically. I also tried the option "--use-hwthread-cpus" for mpirun:

Code:
mpirun --use-hwthread-cpus -np 12 simpleFoam -parallel
I am aware that the results should be even worse, but I am still curious.
*EDIT:
Code:
mpirun --use-hwthread-cpus -np 12 simpleFoam -parallel
works fine now. Sorry for the trouble; there was a mistake in my copy of the 0.orig files.

Old   April 18, 2020, 09:33
Post A bit of nostalgia...
  #274
Member
 
Kailee
Join Date: Dec 2019
Posts: 35
Rep Power: 6
Kailee71 is on a distinguished road
Just for kicks, I tried it on an X8DAE with 2x X5670 (2.93 GHz, turbo 3.2 GHz I think), 12x 4 GB 1333 MHz registered DDR3 RAM, Ubuntu 19.10, kernel 5.3.0, OpenFOAM 7 from the openfoam.org repository.

Code:
CPUs  Mesh (s)  Mesh speedup  Runtime (s)  Runtime speedup  It/s
1         2305          1.00      1251.24             1.00  0.08
2         1508          1.53       717.89             1.74  0.14
4          887          2.60       334.84             3.74  0.30
6          625          3.69       274.43             4.56  0.36
8          515          4.48       252.89             4.95  0.40
12         417          5.53       241.46             5.18  0.41

No, this was not a VM...


Cheers,


Kai.
superkelle likes this.

Old   April 19, 2020, 07:05
Default
  #275
Member
 
alexander thierfelder
Join Date: Dec 2019
Posts: 71
Rep Power: 6
superkelle is on a distinguished road
Update with optimised memory timings:

Ryzen 2700X eight-core CPU @ stock, 32 GB RAM (4x 8 GB in dual channel) @ 3200 MHz 16-18-20-36 (+ subtimings optimised with DRAM Calculator for Ryzen v1.7.0 by 1usmus), Ubuntu 19.10, OF1912
Code:
# cores   Wall time (s):
------------------------
1 699.57
2 414.2
4 257.01
6 227.03
8 221.71
********************
12 216.02
16 223.53
More than 6 threads do not seem to be beneficial. Additionally, as assumed, 12 and 16 threads are not really faster, but I included the results for the sake of interest.

Old   April 20, 2020, 11:04
Default
  #276
Member
 
alexander thierfelder
Join Date: Dec 2019
Posts: 71
Rep Power: 6
superkelle is on a distinguished road
Here is a small comparison graph for different memory configurations. Be aware that I did every run only once, so no single run is a reliable quantitative statement. I did not change any of the latency settings, so every run was done on:

Ryzen 2700X eight-core CPU @ stock, RAM @ 16-18-20-36 (+ subtimings optimised with DRAM Calculator for Ryzen v1.7.0 by 1usmus), Ubuntu 19.10, OF1912
Attached Images
File Type: png benchmarks_motorbike_2M_R2700x.png (54.7 KB, 144 views)
flotus1, wkernkamp and Kailee71 like this.

Old   May 23, 2020, 11:57
Default Some more nostalgia...
  #277
Member
 
Kailee
Join Date: Dec 2019
Posts: 35
Rep Power: 6
Kailee71 is on a distinguished road
HP Z840, 2x Xeon E5-2637 v4, 2x 4 cores, 3.5 GHz, 128 GB 2400 MHz in 8x 16 GB, single rank.


Code:
Threads  Mesh (s)  Mesh speedup  Runtime (s)  Runtime speedup  It/s
1            1581          1.00      1048.83             1.00  0.10
2            1042          1.52       525.38             1.99  0.19
4             570          2.77       224.17             4.68  0.45
6             400          3.95       159.36             6.58  0.63
8             332          4.76       133.11             7.88  0.75
Does someone have an explanation for the super-linear speedups on the runtimes?
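
To see exactly where the runtime scaling is super-linear, a minimal Python sketch using the runtimes from the table above follows. The names `efficiency` and `superlinear` are illustrative assumptions. The sketch only quantifies the effect and does not explain it.

```python
# Runtimes (s) from the HP Z840 table above.
runtimes = {1: 1048.83, 2: 525.38, 4: 224.17, 6: 159.36, 8: 133.11}

def efficiency(n):
    """Parallel efficiency: speedup divided by thread count (1.0 = linear)."""
    return (runtimes[1] / runtimes[n]) / n

# Thread counts whose speedup exceeds the thread count itself.
superlinear = [n for n in sorted(runtimes) if n > 1 and efficiency(n) > 1.0]

for n in sorted(runtimes):
    print(f"{n} threads: speedup {runtimes[1] / runtimes[n]:.2f}, "
          f"efficiency {efficiency(n):.2f}")
print("super-linear at:", superlinear)
```

By this measure only the 4- and 6-thread runs are above linear (efficiency of about 1.17 and 1.10). The 2- and 8-thread runs sit just below 1.0.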


Those CPUs still have 4 memory channels each, which usually form the bottleneck but might be underused with 1 thread per memory channel, so just out of curiosity...

Code:
Threads  Mesh (s)  Mesh speedup  Runtime (s)  Runtime speedup  It/s
10            449          3.52       157.40             6.66  0.64
12            376          4.20       146.59             7.15  0.68
14            354          4.47       130.91             8.01  0.76
16            337          4.69       125.14             8.38  0.80
... Interesting ... But still not worth using threads vs. cores.

K.
wkernkamp likes this.

Old   May 26, 2020, 12:20
Default Virtualisation comparison
  #278
Member
 
Kailee
Join Date: Dec 2019
Posts: 35
Rep Power: 6
Kailee71 is on a distinguished road
Hi all,

On the same hardware as in #274, here is a comparison of bare-metal performance against virtualisation with bhyve and ESXi 6.5. In each case the VM runs Ubuntu 20.04 with OpenFOAM 7 from the openfoam.org repository, has 24 GiB RAM, and is installed and run on local SSD storage.

Bhyve:
Code:
CPUs  Mesh (s)  Mesh speedup  Runtime (s)  Runtime speedup  It/s
1         2617          1.00      1676.16             1.00  0.06
2         1708          1.53       866.58             1.93  0.12
4         1020          2.57       502.86             3.33  0.20
6          710          3.69       399.49             4.20  0.25
8          592          4.42       354.14             4.73  0.28
12         481          5.44       306.78             5.46  0.33
ESXi 6.5 (only 8 cores per vm available due to license):
Code:
CPUs  Mesh (s)  Mesh speedup  Runtime (s)  Runtime speedup  It/s
1         2509          1.00      1500.86             1.00  0.07
2         1665          1.51       847.07             1.77  0.12
4          965          2.60       425.78             3.52  0.23
6          683          3.67       365.86             4.10  0.27
8          565          4.44       320.61             4.68  0.31
Therefore, in it/s....
Code:
Cores  Bare Metal  Bhyve   ESXi
1        0.08      0.06    0.07
2        0.14      0.12    0.12
4        0.30      0.20    0.23
6        0.36      0.25    0.27
8        0.40      0.28    0.31
12       0.41      0.33
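
Since the it/s figures are rounded to two digits, the overhead is easier to read from the runtimes. A minimal Python sketch follows, using the bare-metal runtimes from #274 and the two VM tables above. The helper name `overhead_percent` is hypothetical. Note also that the bare-metal run used Ubuntu 19.10 while the VMs used 20.04, so this is not a perfectly controlled comparison.

```python
# Runtimes (s): bare metal from post #274, VMs from the tables above.
bare_metal = {1: 1251.24, 2: 717.89, 4: 334.84, 6: 274.43, 8: 252.89, 12: 241.46}
bhyve = {1: 1676.16, 2: 866.58, 4: 502.86, 6: 399.49, 8: 354.14, 12: 306.78}
esxi = {1: 1500.86, 2: 847.07, 4: 425.78, 6: 365.86, 8: 320.61}

def overhead_percent(vm, n):
    """Extra runtime of the VM relative to bare metal at n cores, in percent."""
    return (vm[n] / bare_metal[n] - 1.0) * 100.0

for n in sorted(bare_metal):
    row = f"{n:>2} cores: bhyve +{overhead_percent(bhyve, n):.0f}%"
    if n in esxi:  # ESXi was limited to 8 cores per VM
        row += f", ESXi +{overhead_percent(esxi, n):.0f}%"
    print(row)
```

On a single core this gives roughly +34 % for bhyve and +20 % for ESXi. At 8 cores ESXi is about 27 % slower than bare metal.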
Any interest in Hyper-V numbers?

Cheers,

Kai.
wkernkamp likes this.

Old   May 27, 2020, 09:43
Smile All benchmarking data from this thread
  #279
New Member
 
Aoo
Join Date: Jun 2015
Posts: 1
Rep Power: 0
blackcatxiii is on a distinguished road
Hi everyone

I have collected the benchmark data posted in this thread in an Excel spreadsheet and plotted wall time as well as speedup for comparison.

I hope it is helpful for those who plan to build a new PC for OpenFOAM.
Attached Files
File Type: xlsx openfoam benchmark.xlsx (54.6 KB, 208 views)
oswald, wkernkamp, ErikAdr and 4 others like this.

Old   May 28, 2020, 14:43
Default Nice Spreadsheet
  #280
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 316
Rep Power: 12
wkernkamp is on a distinguished road
Thanks for the spreadsheet. I liked the idea of coloring the result by column.
