OpenFOAM benchmarks on various hardware

October 27, 2022, 06:08   #581
Yannick (ym92) | New Member | Join Date: May 2018 | Posts: 12
Not much difference for a dual 64-core setup compared to 64 cores on Epyc Rome (as can be expected). I had wanted to post these results for a while already, but better late than never.

Hardware: 2x AMD Epyc 7742, 16x32GB DDR4-3200
Software: Ubuntu 20.04, OpenFOAM v1812

Code:
# cores   Wall time (s)
      1   936.35
      2   521.5
      4   236.56
      6   158.72
      8   120.83
     12    77.94
     16    57.41
     20    46.4
     24    39.23
     48    22.79
     64    19.02
    126    15.46

October 27, 2022, 16:09   #582
Prince Edward Island (hami11) | New Member | Join Date: May 2021 | Posts: 26
So could I expect roughly a 20% speedup going from a 2x 7452 to a 2x 7742 config? That is about the same as the speedup from Rome to Milan.

October 28, 2022, 02:32   #583
Yannick (ym92) | New Member | Join Date: May 2018 | Posts: 12
Quote:
Originally Posted by hami11 View Post
So could I expect roughly a 20% speedup going from a 2x 7452 to a 2x 7742 config? That is about the same as the speedup from Rome to Milan.

Hmm, I don't know if there is a benchmark for 2x 7452 in this thread, but I would expect a 2x 7452 to perform better than a 2x 7742 with only 64 cores in use. The gain would therefore probably be less than 20%. Either way, I would rather discourage you from buying 2x 7742 unless you get them for the same price as 2x 7452, or you have other uses for them: the memory bandwidth is the same for both CPUs. But I think there are other people in this forum who have more experience.

November 2, 2022, 03:30   #584 | 2x EPYC 7573X
oswald | Member | Join Date: Sep 2010 | Location: Leipzig, Germany | Posts: 93
Hardware: 2x EPYC 7573X, 16x 32GB DDR4
Software: Ubuntu 20.04.3, OF7

Code:
cores   Wall time (s)
    1   492.5
    4   113.53
    8    57.91
   12    39.68
   16    31.88
   20    28.08
   24    25.14
   28    24.14
   32    22.34
   40    21.49
   48    17.17
   56    12.53
   64    11.55
I did not use core bindings, which might explain the bad scaling behaviour when using 20 to 40 cores. Compared to my 2xEPYC7543 workstation, this machine is ~33% faster on 64 cores.
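For reference, the scaling in the table above works out to the following speedup and parallel efficiency (a quick Python sketch using wall times from this post; the superlinear efficiency at low core counts is plausible for Milan-X, since more combined L3 becomes available as more CCDs are active):

```python
# Wall times (s) from the 2x EPYC 7573X run above
wall_times = {1: 492.5, 8: 57.91, 32: 22.34, 64: 11.55}

t1 = wall_times[1]
for n, t in wall_times.items():
    speedup = t1 / t           # how much faster than the serial run
    efficiency = speedup / n   # fraction of ideal linear scaling
    print(f"{n:3d} cores: speedup {speedup:5.1f}, efficiency {efficiency:6.1%}")
```

On 64 cores this gives a speedup of about 42.6x, i.e. roughly 67% parallel efficiency relative to the single-core run.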

November 2, 2022, 04:51   #585
Alex (flotus1) | Super Moderator | Join Date: Jun 2012 | Location: Germany | Posts: 3,400
That's awesome!
If you don't mind, could you run the 64-thread case once more? Since you are the first person here with Milan-X, I would really like to know where the limits are.

First clear the caches by running this command with root privileges:
Code:
echo 3 > /proc/sys/vm/drop_caches
Then change to the run directory for 64 threads and run the solver once more with:
Code:
mpirun -np 64 --report-bindings --bind-to core --rank-by core --map-by numa simpleFoam -parallel > log.simpleFoam 2>&1

November 2, 2022, 07:58   #586
oswald | Member | Join Date: Sep 2010 | Location: Leipzig, Germany | Posts: 93
Of course, no problem. With this I get an execution time of 11.3s.


Do you have a recommendation regarding NUMA-setup in the BIOS?

November 2, 2022, 09:15   #587
Alex (flotus1) | Super Moderator | Join Date: Jun 2012 | Location: Germany | Posts: 3,400
Just the usual: NPS=4, which results in a total of 8 NUMA nodes.
Or, if it's available on your motherboard: "ACPI SRAT L3 cache as NUMA domain", which results in 16 NUMA nodes.

The latter is really only good for software like OpenFOAM and other solvers that can run on distributed-memory systems. For most other software, it will do more harm than good.

November 3, 2022, 03:56   #588
oswald | Member | Join Date: Sep 2010 | Location: Leipzig, Germany | Posts: 93
Thank you for the hints regarding the NUMA settings. Here are the results with "ACPI SRAT L3 Cache as NUMA" enabled:
Code:
#cores Wall time (s)
 1      476.71
 4      111.69
 8      53.96
12      38.12
16      28.58
20      26.73
24      23.51
28      20.43
32      18.05
40      17.71
48      15.34
56      12.26
64      11.22
As the workstation is in principle only used for OpenFOAM, I will leave this option turned on for now.

November 3, 2022, 04:56   #589 | 7532 vs 7542 for 20M mesh
mllokloks | New Member | Join Date: Feb 2018 | Posts: 2
I've been looking at the results posted above. It seems that the EPYC 7532 performs slightly better than the EPYC 7542.

I'm planning to build a workstation for OpenFOAM, mostly running simpleFoam/pisoFoam/pimpleFoam on a ~20M-cell mesh. Should I go for the 7532 instead of the 7542?

Are the results posted here still applicable to a ~20M mesh?

Thank you guys


7542:
Cache: 128MB
Base Clock: 2.9GHz

7532:
Cache: 256MB
Base Clock: 2.4GHz

November 4, 2022, 07:00   #590
Alex (flotus1) | Super Moderator | Join Date: Jun 2012 | Location: Germany | Posts: 3,400
Quote:
Are the results posted here still applicable to a ~20M mesh?
Yes, they are. The mesh used here is large enough that the results also apply to much higher cell counts.

Base clock speeds for these CPUs don't really tell us much about actual performance. Even with semi-decent cooling, they will run at higher clock speeds 24/7. And CPU clock speeds don't translate 1:1 into lower solver times, because 32 cores on 8 memory channels are not entirely compute-bound.
The Epyc 7532 is the faster CPU for OpenFOAM/CFD compared to the Epyc 7542, thanks to twice the amount of L3 cache.
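As a rough sanity check on the mesh-size question, one can look at cells per MPI rank (a quick sketch; the often-quoted rule of thumb of roughly 30-50k cells per core as the point where communication starts to dominate is an assumption, not a hard limit):

```python
# Cells per MPI rank for the ~20M-cell mesh asked about above
mesh_cells = 20_000_000

for cores in (16, 32, 64):
    per_core = mesh_cells // cores  # integer cells per rank
    print(f"{cores:3d} cores -> {per_core:>7,} cells/core")
```

Even on 64 cores this stays above 300k cells per rank, far from the communication-dominated regime, which is consistent with the benchmark results carrying over to larger meshes.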

November 4, 2022, 18:06   #591
naffrancois | Senior Member | Join Date: Oct 2011 | Posts: 242
Hello,

The creator of this thread has been gone for a long time now. There is now a factor of 3 between the first-gen EPYC 7501/7601 and this latest CPU, which is quite impressive! Would there be any chance of compiling some of the results into an updatable chart, as was the original plan? I understand this would require a lot of work, but maybe by only keeping the main CPUs from each generation? This would help, I think, to get a clearer picture and to point people asking about a new configuration in the right direction.

November 4, 2022, 18:58   #592
Alex (flotus1) | Super Moderator | Join Date: Jun 2012 | Location: Germany | Posts: 3,400
If *someone* were to compile such a chart, I would be happy to include it in the first post.

November 6, 2022, 07:40   #593
naffrancois | Senior Member | Join Date: Oct 2011 | Posts: 242
OK, if you think it can be useful, I can give it a try. I don't promise anything, though.

November 10, 2022, 15:47   #594 | Epyc Genoa benchmarks released
dab bence (danbence) | Member | Join Date: Mar 2013 | Posts: 47
On OpenFOAM 10 drivaerFastback, the EPYC 9554 2P (192 cores) was 49% faster than the Epyc 7773X 2P (128 cores), so the performance has scaled with the core count. Impressive.

https://www.phoronix.com/review/amd-...-benchmarks/14

November 10, 2022, 16:54   #595
Alex (flotus1) | Super Moderator | Join Date: Jun 2012 | Location: Germany | Posts: 3,400
9554 should be a 64-core part.
It's the 96-core 9654 that has a 49% advantage in this benchmark. It's only a 28% increase for the 64-core comparison.
Or if we want to leave 3D v-cache out of the picture for now and compare 64-core SKUs:
2P 7763: 277s
2P 9554: 130s
Which falls in line with the 2.2x increase in theoretical memory bandwidth.
Epyc Genoa with 3D v-cache will follow in the first half of 2023.

But the usual disclaimer still applies: openbenchmarking.org
I don't trust the OpenFOAM numbers there.
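The bandwidth figure quoted above can be checked with back-of-the-envelope arithmetic (a sketch assuming 8 channels of DDR4-3200 per Milan socket versus 12 channels of DDR5-4800 per Genoa socket, 8 bytes per transfer):

```python
# Theoretical peak memory bandwidth per socket, in GB/s
milan_gbs = 8 * 3200 * 8 / 1000    # 8x DDR4-3200 -> 204.8 GB/s
genoa_gbs = 12 * 4800 * 8 / 1000   # 12x DDR5-4800 -> 460.8 GB/s

print(f"Milan: {milan_gbs:.1f} GB/s, Genoa: {genoa_gbs:.1f} GB/s")
print(f"Bandwidth ratio: {genoa_gbs / milan_gbs:.2f}x")       # ~2.25x
print(f"Observed solver speedup (277 s / 130 s): {277 / 130:.2f}x")
```

The measured 2.13x sits just under the theoretical 2.25x, supporting the point that these solvers are largely memory-bandwidth-bound.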

November 11, 2022, 18:01   #596
naffrancois | Senior Member | Join Date: Oct 2011 | Posts: 242
Quote:
Originally Posted by flotus1 View Post
If *someone* were to compile such a chart, I would be happy to include it in the first post.
I compiled the results published so far, taking the best result, or the one with the most complete description, when duplicate configurations were available. I put them into two charts: max performance and single-core performance. If you have other filters that might be interesting, let me know; it is all stored in an Excel file. Thank you all for the contributions, it is great to gather so many results and get the big picture across all these CPUs.
Attached Images
File Type: jpg maxperf.jpg (39.8 KB, 103 views)
File Type: jpg singlecore.jpg (67.2 KB, 86 views)

November 11, 2022, 18:05   #597
naffrancois | Senior Member | Join Date: Oct 2011 | Posts: 242
Here's the Excel file if anyone wants to have a look or modify it (unfortunately I realized too late that some members had already started a similar database).
Attached Files
File Type: xlsx bdd_cpu_cfdonline.xlsx (39.1 KB, 55 views)

November 12, 2022, 20:59   #598
DS (Crowdion) | New Member | Join Date: Jan 2022 | Posts: 13
Quote:
Originally Posted by naffrancois View Post
here's the excel if anyone wants to have a look or modify (unfortunately I realized too late some members already started a similar data base)

Thanks for taking the time to create the spreadsheet!

Small clarification: my system, a Cisco C460 M4 (message number 544), has 4x E7-8880 v3 (not 4x E7-8880) CPUs.

Installed RAM: 32x 16GB DDR4-2133 (Cisco UCS-MR-1X162RU-A). The RAM runs at 1600 MHz; however, the server's motherboard uses "Jordan Creek 2" scalable memory buffer interfaces, which double the effective memory-channel bandwidth to up to 3200 MT/s (the actual bandwidth measured with Intel MLC 3.9a is 320 GB/s). It is important that at least 32 RAM slots are filled when 4 processors are installed.
It seems that at the moment the Cisco C460 M4 is one of the most cost-effective systems in this thread in terms of "it/s/$".

November 13, 2022, 02:28   #599
naffrancois | Senior Member | Join Date: Oct 2011 | Posts: 242
Quote:
Originally Posted by Crowdion View Post
Thanks for taking the time to create the spreadsheet!

Small clarification: my system, a Cisco C460 M4 (message number 544), has 4x E7-8880 v3 (not 4x E7-8880) CPUs.

Installed RAM: 32x 16GB DDR4-2133 (Cisco UCS-MR-1X162RU-A). The RAM runs at 1600 MHz; however, the server's motherboard uses "Jordan Creek 2" scalable memory buffer interfaces, which double the effective memory-channel bandwidth to up to 3200 MT/s (the actual bandwidth measured with Intel MLC 3.9a is 320 GB/s). It is important that at least 32 RAM slots are filled when 4 processors are installed.
It seems that at the moment the Cisco C460 M4 is one of the most cost-effective systems in this thread in terms of "it/s/$".
Thanks for the explanation, I will add your remarks to the spreadsheet. I would have liked to add cost as another indicator, but unfortunately I have no idea how to manage that, as prices vary a lot over time and between regions.

November 13, 2022, 15:09   #600
Prince Edward Island (hami11) | New Member | Join Date: May 2021 | Posts: 26
Quote:
Originally Posted by flotus1 View Post
9554 should be a 64-core part.
It's the 96-core 9654 that has a 49% advantage in this benchmark. It's only a 28% increase for the 64-core comparison.
Or if we want to leave 3D v-cache out of the picture for now and compare 64-core SKUs:
2P 7763: 277s
2P 9554: 130s
Which falls in line with the 2.2x increase in theoretical memory bandwidth.
Epyc Genoa with 3D v-cache will follow in the first half of 2023.

But the usual disclaimer still applies: openbenchmarking.org
I don't trust the OpenFOAM numbers there.
So it looks like Genoa is 2x faster than Rome on a core-for-core basis?
