OpenFOAM benchmarks on various hardware

Old   March 22, 2021, 12:44
Default
  #381
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,406
Rep Power: 47
flotus1 has a spectacular aura about
I think you went on an unnecessary tangent here. The motherboard in your system should only have 16 DIMM slots. You stated earlier that you have 16x8GB of RAM installed. There is just no way to get an unbalanced memory population this way. Just open the side panel and check if all slots are populated. Then check with the operating system that 128GB of RAM are present. CPU-Z won't help you with dual-socket systems.

I am not sure what kind of virtualization you are running here. But if I had to guess, that's probably what is causing the performance hit.
flotus1 is offline   Reply With Quote

Old   March 22, 2021, 14:12
Default
  #382
New Member
 
Roland Siemons
Join Date: Mar 2021
Posts: 13
Rep Power: 5
RolandS is on a distinguished road
Quote:
Originally Posted by wkernkamp View Post
The Windows program CPU-Z (free) will tell you the memory configuration under the Memory tab. I think there is something seriously wrong in your memory set-up. It is not normal that your best result happens for 30 cores. Did that run complete normally?

Yes, Will, the runs complete normally.


I ran CPU-Z, but see no abnormalities. I attach the report file (zipped *.txt). It is big, but it can be searched for the term "memory". If (and only if) you have time for it, you might skim it.



Thanks for your suggestions!


Roland
Attached Files
File Type: zip DESKTOP-5QM78MA.zip (29.0 KB, 6 views)
RolandS is offline   Reply With Quote

Old   March 22, 2021, 15:19
Default
  #383
New Member
 
George
Join Date: Jul 2020
Location: TU Delft, The Netherlands
Posts: 18
Rep Power: 5
gpouliasis is on a distinguished road
Quote:
Originally Posted by RolandS View Post
Yes, Will, the runs complete normally.


I ran CPU-Z, but see no abnormalities. I attach the report file (zipped *.txt). It is big, but it can be searched for the term "memory". If (and only if) you have time for it, you might skim it.



Thanks for your suggestions!


Roland
Hey Roland,

I went through your hardware log. Everything seems fine. I agree with flotus1: a reasonable explanation is the Windows subsystem that you use. For optimal results you should create a Linux partition and work from there. It takes some time to set up, but not a lot.

I would also like to draw your attention to the results you provided. You are oversubscribing, which means that you assign more than one process to a single core. That is very suboptimal and you should avoid it: it will not provide any benefit, quite the opposite, as you can already see from your results. Keep in mind the difference between a physical core and a hardware thread.
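
To make the physical core versus thread distinction concrete, here is a minimal Python sketch. It assumes a Linux system where /proc/cpuinfo lists a "physical id" and a "core id" for every logical CPU (true on typical x86 machines). The SAMPLE text and both function names are made up for illustration and do not describe the machine discussed in this thread.

```python
# Hypothetical /proc/cpuinfo excerpt: 1 socket, 2 physical cores, 4 hardware threads.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 1

processor : 2
physical id : 0
core id : 0

processor : 3
physical id : 0
core id : 1
"""


def count_physical_cores(cpuinfo_text):
    """Count unique (physical id, core id) pairs in /proc/cpuinfo-style text."""
    cores = set()
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "physical id":
            physical_id = value          # socket the following core belongs to
        elif key == "core id":
            cores.add((physical_id, value))
    return len(cores)


def is_oversubscribed(n_ranks, physical_cores):
    """True if more MPI ranks are requested than there are physical cores."""
    return n_ranks > physical_cores


print(count_physical_cores(SAMPLE))   # physical cores in the sample
print(is_oversubscribed(4, count_physical_cores(SAMPLE)))
```

On a real machine the text would come from open("/proc/cpuinfo").read(). The number passed to mpirun -np should then stay at or below the physical core count.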
RolandS likes this.

Last edited by gpouliasis; March 23, 2021 at 04:54.
gpouliasis is offline   Reply With Quote

Old   March 23, 2021, 04:45
Default
  #384
New Member
 
Roland Siemons
Join Date: Mar 2021
Posts: 13
Rep Power: 5
RolandS is on a distinguished road
Quote:
Originally Posted by flotus1 View Post
I think you went on an unnecessary tangent here. The motherboard in your system should only have 16 DIMM slots. You stated earlier that you have 16x8GB of RAM installed. There is just no way to get an unbalanced memory population this way. Just open the side panel and check if all slots are populated. Then check with the operating system that 128GB of RAM are present. CPU-Z won't help you with dual-socket systems.

I am not sure what kind of virtualization you are running here. But if I had to guess, that's probably what is causing the performance hit.

Hi Flotus,


All the physical hardware is present. I confirmed this not only by visual inspection but also from the system performance information (all graphs are there, and operational).
The only remaining question is how to find the best configuration.
The OpenFOAM-v2006 Windows version that I used was cross-compiled in an openSUSE environment using the MinGW cross-compiler (as prepared for the FreeCAD Windows software).


Well I found some interesting settings. See next message.


Best regards,


Roland

Last edited by RolandS; March 23, 2021 at 04:58. Reason: improving message
RolandS is offline   Reply With Quote

Old   March 23, 2021, 04:57
Default
  #385
New Member
 
Roland Siemons
Join Date: Mar 2021
Posts: 13
Rep Power: 5
RolandS is on a distinguished road
Dear Will, Flotus, George,

Thanks for all your effort and advice.

I turned my computer into a dual boot machine: Win10 + LinuxMint.
The Linux results are drastically improved.

As I said, under Win10 I run the OpenFOAM-v2006 build made with the MinGW cross-compiler (as prepared for the FreeCAD Windows software).


The Linux results are:

# cores Wall time (s):
------------------------
6 - 163.36
10 - 109.74
14 - 89.72
18 - 81.18
22 - 77.7
24 - 76.53

In the attached graph you see the incredible speed-up from running OpenFOAM directly under Linux.
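
One way to read the Linux table is to look at how many seconds each additional core saves. The following is a minimal Python sketch using only the wall times listed in this post; the function name marginal_gain is just an illustrative choice.

```python
# Linux wall times from the table in this post: cores -> seconds
linux_times = {6: 163.36, 10: 109.74, 14: 89.72, 18: 81.18, 22: 77.7, 24: 76.53}


def marginal_gain(times):
    """Seconds saved per additional core between consecutive measurements."""
    cores = sorted(times)
    gains = []
    for lo, hi in zip(cores, cores[1:]):
        saved_per_core = (times[lo] - times[hi]) / (hi - lo)
        gains.append((lo, hi, saved_per_core))
    return gains


for lo, hi, gain in marginal_gain(linux_times):
    print(f"{lo:2d} -> {hi:2d} cores: {gain:6.2f} s saved per added core")
```

The gain per added core drops steadily towards the high core counts, which is the usual pattern when a benchmark becomes limited by something other than core count.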

Unless you see other issues to address, I am happy with this result.

Best regards,

Roland
Attached Images
File Type: png TestResults.png (55.4 KB, 104 views)
flotus1 and wkernkamp like this.
RolandS is offline   Reply With Quote

Old   March 28, 2021, 19:50
Default Congratulations!
  #386
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 343
Rep Power: 13
wkernkamp is on a distinguished road
Quote:
Originally Posted by RolandS View Post
Dear Will, Flotus, George,

Thanks for all your effort and advice.

I turned my computer into a dual boot machine: Win10 + LinuxMint.
The Linux results are drastically improved.


The Linux results are:

# cores Wall time (s):
------------------------
6 - 163.36
10 - 109.74
14 - 89.72
18 - 81.18
22 - 77.7
24 - 76.53


Unless you see other issues to address, I am happy with this result.

Best regards,

Roland

That looks as expected.


Congratulations on your fast and cheap machine.


Will
RolandS likes this.
wkernkamp is offline   Reply With Quote

Old   March 29, 2021, 22:45
Default Dual e5-2630 v3
  #387
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 343
Rep Power: 13
wkernkamp is on a distinguished road
Meshing Times:
# cores Wall time (s):
1 1651.15
2 1099.88
4 612.52
8 404.99
12 326.62
16 329.21
Flow Calculation:
# cores Wall time (s):
1 1113.39
2 589.07
4 281.81
8 163.56
12 132
16 115.62


I am going to try the Turbo Boost unlock next.
wkernkamp is offline   Reply With Quote

Old   April 19, 2021, 20:29
Default Dual E5-4627v3
  #388
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 343
Rep Power: 13
wkernkamp is on a distinguished road
Gigabyte R180-F34, dual E5-4627 v3 with 8x8GB R2x8 2400T (running at 2133 due to the v3 processor)

Flow Calculation:
2 578.12
4 277
8 147.97
12 111
16 92.67
20 85.04
wkernkamp is offline   Reply With Quote

Old   April 30, 2021, 01:32
Default
  #389
New Member
 
Harris Snyder
Join Date: Aug 2018
Posts: 24
Rep Power: 7
hsnyder is on a distinguished road
Full benchmark results to follow, but as a heads-up to anyone running Epyc Rome: try changing the NUMA nodes per socket (NPS) setting in the BIOS. I was able to get the 32-core benchmark time down from around 26.5s to around 24.0s by changing from NPS1 to NPS4. This is on a dual-7302 system.
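
After changing the NPS setting, it can be worth confirming that the operating system actually sees the new layout. The following is a minimal Python sketch, assuming a Linux system that exposes NUMA nodes as nodeN directories under /sys/devices/system/node. The function names are illustrative. The second function only restates the two times quoted above as a relative gain.

```python
import os
import re


def count_numa_nodes(sysfs_node_dir="/sys/devices/system/node"):
    """Return the number of nodeN entries, or None if the directory is absent."""
    if not os.path.isdir(sysfs_node_dir):
        return None
    return sum(
        1 for name in os.listdir(sysfs_node_dir) if re.fullmatch(r"node\d+", name)
    )


def relative_improvement(before_s, after_s):
    """Fraction of wall time saved when going from before_s to after_s."""
    return (before_s - after_s) / before_s


print("NUMA nodes visible to the OS:", count_numa_nodes())
print(f"NPS1 -> NPS4 gain: {relative_improvement(26.5, 24.0):.1%}")
```

On a dual-socket system set to NPS4 one would expect eight nodes to be reported (four per socket); `numactl --hardware` or `lscpu` show the same information from the command line.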
wkernkamp likes this.
hsnyder is offline   Reply With Quote

Old   May 10, 2021, 02:40
Default Ryzen 3800x
  #390
New Member
 
Florian
Join Date: May 2021
Posts: 8
Rep Power: 5
Iuvatix is on a distinguished road
I benchmarked my Ryzen 3800X with 2x16GB 3000MHz CL16-18-18-38 RAM at 2133MHz and 3000MHz on Ubuntu 20.04.2, OpenFOAM v8.

For 2133MHz

# cores Flow Mesh
1 713s 1096s
2 457s 756s
4 330s 461s
6 313s 375s
8 315s 341s


For 3000MHz

# cores Flow Mesh
1 658s 1030s
2 379s 702s
4 261s 419s
6 244s 335s
8 245s 304s


It's comparable to the 5600X already posted, but slower than the 3700X. However, both of those CPUs ran on 3600MHz RAM, and Ryzen can use 3200MHz afaik. Do you think it would be worth upgrading my RAM?
Iuvatix is offline   Reply With Quote

Old   May 10, 2021, 03:31
Default
  #391
Senior Member
 
Simbelmynė's Avatar
 
Join Date: May 2012
Posts: 548
Rep Power: 16
Simbelmynė is on a distinguished road
Quote:
Originally Posted by Iuvatix View Post
I benchmarked my Ryzen 3800X with 2x16GB 3000MHz CL16-18-18-38 RAM at 2133MHz and 3000MHz on Ubuntu 20.04.2, OpenFOAM v8.

For 2133MHz

# cores Flow Mesh
1 713s 1096s
2 457s 756s
4 330s 461s
6 313s 375s
8 315s 341s


For 3000MHz

# cores Flow Mesh
1 658s 1030s
2 379s 702s
4 261s 419s
6 244s 335s
8 245s 304s


It's comparable to the 5600X already posted, but slower than the 3700X. However, both of those CPUs ran on 3600MHz RAM, and Ryzen can use 3200MHz afaik. Do you think it would be worth upgrading my RAM?

Is 3000 MHz the ceiling of your RAM?


Have you tried the Ryzen DRAM Calculator? You may be able to tighten those timings substantially if you are lucky.


Is it worth upgrading your RAM? This should be easy enough to answer for yourself by looking at the results posted so far. You know the price and you know what results you may accomplish. Is this computer used to run CFD 24/7? Then I would say an upgrade may be worth it. On the other hand, you should probably look at other setups as well in this case imho.



The RAM I used for the 3700X results (around 170s for the benchmark) is single-rank Samsung B-die, binned at 4133 MHz @ CL18. It easily does 3600 MHz @ CL15 (I can even push it to a stable 3600 MHz @ CL14 with higher voltage). If you go for 3600 MHz memory, then I would suggest CL16 memory: two sticks with two ranks each, or four sticks of single-rank memory. If you can get that to work, I think it would be the sweet spot for your CPU. Your motherboard vendor will likely have a list of qualified memory. That can give you an indication of the capabilities of your motherboard and how good the PCB layout is. The rest is up to the silicon lottery of your CPU.
Simbelmynė is offline   Reply With Quote

Old   May 10, 2021, 04:16
Default
  #392
New Member
 
Florian
Join Date: May 2021
Posts: 8
Rep Power: 5
Iuvatix is on a distinguished road
Quote:
Originally Posted by Simbelmynė View Post
Is 3000 MHz the ceiling of your RAM?


Have you tried the Ryzen DRAM Calculator? You may be able to tighten those timings substantially if you are lucky.


Is it worth upgrading your RAM? This should be easy enough to answer for yourself by looking at the results posted so far. You know the price and you know what results you may accomplish. Is this computer used to run CFD 24/7? Then I would say an upgrade may be worth it. On the other hand, you should probably look at other setups as well in this case imho.



The RAM I used for the 3700X results (around 170s for the benchmark) is single-rank Samsung B-die, binned at 4133 MHz @ CL18. It easily does 3600 MHz @ CL15 (I can even push it to a stable 3600 MHz @ CL14 with higher voltage). If you go for 3600 MHz memory, then I would suggest CL16 memory: two sticks with two ranks each, or four sticks of single-rank memory. If you can get that to work, I think it would be the sweet spot for your CPU. Your motherboard vendor will likely have a list of qualified memory. That can give you an indication of the capabilities of your motherboard and how good the PCB layout is. The rest is up to the silicon lottery of your CPU.

Thanks for your answer



I use this RAM https://geizhals.de/g-skill-aegis-di...-a1798024.html. So I guess 3000MHz is the ceiling of my RAM, but I am new to this kind of hardware stuff.



I did not use the DRAM calculator, but will try and see if I can improve performance.


If I could gain a substantial performance increase, like 15%, with better RAM compared to this setup, it would probably be worth it, but this machine is only for smaller simulations.
Iuvatix is offline   Reply With Quote

Old   May 10, 2021, 05:46
Default
  #393
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,406
Rep Power: 47
flotus1 has a spectacular aura about
Nothing prevents you from overclocking the memory you currently have. That's an easy way to find out if faster memory is worth it to you. And you will probably get within 10% of what could be achieved with higher binned memory modules. Ryzen DRAM calculator is a handy tool for this, especially if you are overwhelmed by the plethora of timing settings.
You already saw the performance improvements going from 2133MT/s to 3000MT/s. If you only control for memory frequency, extrapolating this trend linearly is a good enough estimate for the performance at even higher transfer rates.
These are your options:
1) Leave everything as-is.
2) Manually tune your current memory for higher transfer rates and optimised timings.
3) Buy expensive memory like 3600 CL16 and apply XMP without any manual tuning. About the same performance as option 2.
4) Same as 3, but further optimise manually.
The best option for you depends on how much time you want to spend manually tuning memory frequency, latency and related voltages. The sweet-spot for Zen2 Ryzen CPUs in terms of transfer rates is at DDR4-3600 with a 1:1 ratio of DRAM and infinity fabric. Most of these CPUs don't achieve higher IF clock speeds without much hassle, and switching to a 2:1 ratio for higher memory frequency is not worth it.
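
The linear extrapolation mentioned above can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: it uses the 8-core flow times Iuvatix posted (315s at 2133MT/s, 245s at 3000MT/s), treats wall time as linear in transfer rate, and ignores timings entirely. The function names are illustrative.

```python
def extrapolate_linear(x0, y0, x1, y1, x):
    """Extend the straight line through (x0, y0) and (x1, y1) to position x."""
    slope = (y1 - y0) / (x1 - x0)
    return y1 + slope * (x - x1)


def memory_clock_mhz(transfer_rate_mt_s):
    """DDR transfers data twice per clock, so the memory clock is half the MT/s rate."""
    return transfer_rate_mt_s / 2


# 8-core flow wall times from the Ryzen 3800X post: 2133 MT/s -> 315 s, 3000 MT/s -> 245 s
estimate_3600 = extrapolate_linear(2133, 315.0, 3000, 245.0, 3600)
print(f"Estimated 8-core flow time at 3600 MT/s: {estimate_3600:.1f} s")

# Infinity Fabric clock needed for a 1:1 ratio at DDR4-3600
print(f"Clock for 1:1 at DDR4-3600: {memory_clock_mhz(3600):.0f} MHz")
```

The result is an estimate, not a measurement; the actual gain also depends on timings and on whether the 1:1 ratio can be held at that speed.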

Last edited by flotus1; May 10, 2021 at 08:25.
flotus1 is offline   Reply With Quote

Old   May 28, 2021, 11:13
Default AMD Epyc 7532
  #394
New Member
 
Josh Dyson
Join Date: Mar 2011
Posts: 21
Rep Power: 15
jd210 is on a distinguished road
Been wanting to add to this benchmark for some time and finally been able to do so.

OpenFOAM-v2012 running on CentOS 7.9. 2x AMD Epyc 7532 with 1TB of 3200 MHz RAM. SMT (AMD's equivalent of Hyper-Threading) switched off.

# cores Wall time (s):
------------------------
1 643.75
4 158.48
8 77.35
16 43.92
32 23.68
48 19.69
64 15.97
128 8.94

The 128 core result comes from a 100Gb InfiniBand connection to an identical node.

Superlinear speed-up up to 8 cores, and 6.25 iterations/s on 64 cores, showing that 2nd-gen Zen is a big step up from the 1st.
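
The superlinear claim and the iterations/s figure can be checked directly from the table. Here is a minimal Python sketch; the iteration count of 100 is an assumption (it is consistent with 6.25 iterations/s at 15.97s, but it is not stated in this post), and the function name is illustrative.

```python
# Wall times from the table in this post: cores -> seconds
epyc_times = {1: 643.75, 4: 158.48, 8: 77.35, 16: 43.92,
              32: 23.68, 48: 19.69, 64: 15.97, 128: 8.94}

N_ITERATIONS = 100  # assumption: number of solver iterations in the benchmark


def scaling_table(times, n_iterations):
    """Return (cores, speedup, parallel efficiency, iterations/s) per entry."""
    t1 = times[1]
    rows = []
    for n in sorted(times):
        speedup = t1 / times[n]
        rows.append((n, speedup, speedup / n, n_iterations / times[n]))
    return rows


for n, speedup, efficiency, rate in scaling_table(epyc_times, N_ITERATIONS):
    flag = " (superlinear)" if efficiency > 1 else ""
    print(f"{n:4d} cores: speedup {speedup:6.2f}, "
          f"efficiency {efficiency:5.2f}, {rate:5.2f} it/s{flag}")
```

An efficiency above 1 means the run is more than n times faster than the single-core run, which is what superlinear refers to here.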
flotus1, zyzycomcn and Crowdion like this.
jd210 is offline   Reply With Quote

Old   May 29, 2021, 00:06
Default Apple M1
  #395
Member
 
Join Date: Jun 2016
Posts: 99
Rep Power: 9
xuegy is on a distinguished road
Apple M1 Mac mini 16GB 4 big cores @ 3.2GHz
OF-v2012 compiled natively for ARM64 (still buggy, but I managed to run this benchmark). No SIMD optimization yet. No GPU acceleration yet.
# cores Wall time (s):
------------------------
1 469.16
2 291.02
3 228.07
4 190.39*
* Crashed at t=100s for some reason, so I extrapolated from the time at 99s: 188.49*100/99

So it seems like the M1 single-core result outperformed all x86 PCs. But the MPI scaling is really bad. Not sure if it's an M1 issue or an Open MPI issue.
xuegy is offline   Reply With Quote

Old   May 29, 2021, 01:55
Unhappy Probably Memory Bandwidth
  #396
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 343
Rep Power: 13
wkernkamp is on a distinguished road
OpenFOAM always ends up limited by the memory channels, which cannot feed all the cores enough data to keep them productive, while the (Open MPI) parallel interface is not a big problem. So, you might look at your installed memory (type, frequency, slots filled).
wkernkamp is offline   Reply With Quote

Old   May 29, 2021, 02:16
Default
  #397
Member
 
Join Date: Jun 2016
Posts: 99
Rep Power: 9
xuegy is on a distinguished road
Quote:
Originally Posted by wkernkamp View Post
OpenFOAM always ends up limited by the memory channels, which cannot feed all the cores enough data to keep them productive, while the (Open MPI) parallel interface is not a big problem. So, you might look at your installed memory (type, frequency, slots filled).
The M1 has 68GB/s of bandwidth from LPDDR4X-4266, so it's already at workstation level. However, the memory is soldered directly onto the CPU package, so I don't have a choice. I would say it's either a design problem on Apple's side, or Open MPI is not optimized for the M1 (because the M1 is very different from regular ARM64).
xuegy is offline   Reply With Quote

Old   May 29, 2021, 05:08
Default
  #398
Senior Member
 
Join Date: Apr 2020
Location: UK
Posts: 672
Rep Power: 14
Tobermory will become famous soon enough
Impressive single-core clock time! 50% faster than my 3GHz Epyc 7302. I wonder how it is managing this? Clearly it is getting more done each clock cycle ...
Tobermory is offline   Reply With Quote

Old   May 29, 2021, 07:39
Default
  #399
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,406
Rep Power: 47
flotus1 has a spectacular aura about
Quote:
Originally Posted by xuegy View Post
So it seems like the M1 single-core result outperformed all x86 PCs. But the MPI scaling is really bad. Not sure if it's an M1 issue or an Open MPI issue.
Not quite. It is pretty much on par with current-gen mainstream CPUs, which is about what I would have guessed. Slightly trailing behind the Ryzen 5 5600x results posted two pages earlier. Not too shabby considering the presumably lower power consumption. But to be fair, one could optimize a desktop CPU for lower power consumption. As far as I am concerned, this is far from a no-brainer for your intended application of using Mac minis as compute nodes in a cluster.
Edit: dual-channel DDR4-3600 also yields memory bandwidth north of 50GB/s, and that bandwidth is the limiting factor for scaling on the 6-core Ryzen CPU.
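
The bandwidth figures in this exchange follow from transfer rate times bus width. Below is a minimal Python sketch of that arithmetic, assuming 64 bits per DDR4 channel and a 128-bit LPDDR4X interface on the M1 (the latter is an assumption based on commonly published specifications, not something stated in this thread). These are theoretical peaks, not measured values.

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s, bus_width_bits):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mt_s * 1e6 * bytes_per_transfer / 1e9


# Dual-channel DDR4-3600: 2 channels x 64 bits each
ddr4_3600_dual = peak_bandwidth_gb_s(3600, 2 * 64)

# Apple M1 with LPDDR4X-4266: assumed 128-bit total bus width
m1_lpddr4x = peak_bandwidth_gb_s(4266, 128)

print(f"Dual-channel DDR4-3600: {ddr4_3600_dual:.1f} GB/s")
print(f"M1 LPDDR4X-4266:        {m1_lpddr4x:.1f} GB/s")
```

Under these assumptions the two platforms land in the same range, which fits the observation that both show similar scaling limits at a handful of cores.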
oswald likes this.
flotus1 is offline   Reply With Quote

Old   May 29, 2021, 08:42
Default
  #400
Member
 
Join Date: Jun 2016
Posts: 99
Rep Power: 9
xuegy is on a distinguished road
Quote:
Originally Posted by flotus1 View Post
As far as I am concerned, this is far from a no-brainer for your intended application of using Mac minis as compute nodes in a cluster.
Honestly, that idea was just for fun. I bought a Mac mini to replace my desktop, so there was no harm in testing it. Obviously I will continue to use my x86 workstation with 128GB RAM. The $180 I paid for the extra 8GB of RAM might make it the most expensive 8GB of RAM on cfd-online.

In the near future we will definitely see more ARM64 servers running not only CFD but also other scientific computing tasks.

Last edited by xuegy; May 29, 2021 at 09:20. Reason: wrong decimal point
xuegy is offline   Reply With Quote
