#841
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,430
Rep Power: 49
A slight over-simplification of why more cache = more better:
When an execution unit needs data to perform a calculation, the cache hierarchy is searched first for that data: L1 -> L2 -> L3 -> RAM, with latency increasing at each step down the chain. Larger caches mean a higher probability that the data already resides in one of the caches and thus doesn't have to come from RAM. This decreases latency and also frees up precious memory bandwidth for other operations and cores.
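The effect can be sketched with the textbook average memory access time (AMAT) model: each level is probed in turn and a miss falls through to the next, slower one. The latencies and hit rates below are illustrative assumptions only, not measurements of any particular CPU:

```python
# Illustrative latencies in CPU cycles (assumed, not measured).
LATENCY = {"L1": 4, "L2": 12, "L3": 40, "RAM": 200}

def amat(hit_rates):
    """Average access latency given per-level hit probabilities (RAM always hits)."""
    time = 0.0
    p_reach = 1.0  # probability an access falls through to this level
    for level in ("L1", "L2", "L3"):
        h = hit_rates[level]
        time += p_reach * h * LATENCY[level]
        p_reach *= (1.0 - h)
    time += p_reach * LATENCY["RAM"]  # whatever misses L3 comes from RAM
    return time

# Same L1/L2 behaviour; only the assumed L3 hit rate changes.
small_l3 = amat({"L1": 0.90, "L2": 0.50, "L3": 0.50})  # ~10.2 cycles
large_l3 = amat({"L1": 0.90, "L2": 0.50, "L3": 0.90})  # ~7.0 cycles
print(small_l3, large_l3)
```

With these made-up numbers, raising the L3 hit rate from 50% to 90% cuts the average latency by roughly 30%, and every avoided RAM access also stops competing for memory bandwidth with the other cores.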
#842
Senior Member
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 566
Rep Power: 21
Quote:
I don't know where your 30% comes from. I went from 128MB to 256MB cache with Epyc2, and there is maybe about a 3% difference for the same core count, but not more.
#843
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 381
Rep Power: 14
Could you give some information on your problem, i.e. number of cells, shape of the domain, and memory configuration? That would be of interest, because flotus is right that we see a clear benefit from large caches, as they reduce the demand on the memory channels.
#844
Senior Member
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 566
Rep Power: 21
Quote:
There is no problem at all. Everything works as expected. I would just like to know where the 30% is mentioned; in any case, I can't remember reading such a number.
#845
Senior Member
andy
Join Date: May 2009
Posts: 347
Rep Power: 18
Quote:
#846
Senior Member
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 566
Rep Power: 21
#847
Member
Quote:
OpenFOAM benchmarks on various hardware
The 7532 finishes the benchmark run as fast as 16 s (my personal test is 18 s), whereas the 7542 (128MB L3 but slightly higher frequency?) is around 22 s or even slower. I do remember that there are other similar models (similar setup but different L3); if you are really interested you could skim this thread.
#848
Member
This is also true for desktop CPUs (at least for Zen and Zen 2): the Ryzen 2200G is much slower than the 1500X.
Quote:
#849
Super Moderator
Philip Cardiff
Join Date: Mar 2009
Location: Dublin, Ireland
Posts: 1,104
Rep Power: 35
FYI:
The 1st OpenFOAM HPC Challenge (OHC-1) at the upcoming OpenFOAM Workshop may be of interest to people here. I expect they will publicly share their results, which will be interesting.
#850
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 381
Rep Power: 14
Quote:
#851
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 381
Rep Power: 14
I ran the benchmark with the normal double precision (DP) compile of OF v2212 and with the mixed precision (SPDP) compilation option (in etc/bashrc). The system is a dual E5-2696v2 server with 128GB of DDR3-1866, with eight memory channels in total.
The flow calculation is much faster, up to 36%, for all core counts run. However, the mesh generation is slower by 20%.

Double Precision (DP) Code:
Meshing Times:
cores   time(s)
 1      1553.65
 2      1011.88
 4       574.82
 8       344.48
12       260.31
16       232.71
20       197.04
24       183.96
Flow Calculation:
 1       981.95
 2       511.16
 4       233.53
 8       130.49
12       103.84
16        92.15
20        87.93
24        87.1

Mixed Precision (SPDP) Code:
Meshing Times:
cores   time(s)
 1      1403.45
 2       998.06
 4       560.6
 8       367.34
12       305.32
16       263.96
20       238.15
24       225.01
Flow Calculation:
 1       640.32
 2       378.16
 4       173.62
 8        97.13
12        71.99
16        61.06
20        56.09
24        55.23
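For anyone wanting to reproduce this: in OpenFOAM.com releases the precision is selected via `WM_PRECISION_OPTION` in `etc/bashrc` (or overridden before sourcing it). A minimal sketch, assuming a release that supports mixed precision (v2212 as used above does); the install path is an assumption for illustration:

```shell
# Switch the OpenFOAM scalar type before sourcing etc/bashrc:
#   DP   = double precision throughout (default)
#   SP   = single precision throughout
#   SPDP = mixed precision: single-precision fields, double-precision linear solve
export WM_PRECISION_OPTION=SPDP

# Then source the environment and rebuild so the setting takes effect,
# e.g. (path is a typical install location, adjust to yours):
# source /usr/lib/openfoam/openfoam2212/etc/bashrc
```

Note that the libraries and solvers must be compiled for the chosen option; simply exporting the variable against prebuilt DP binaries has no effect.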
#852
Senior Member
Join Date: May 2012
Posts: 563
Rep Power: 16
Mac Studio M4 Max, 16c CPU 40c GPU, 64 GB RAM
MacOS Sequoia 15.4, OpenFOAM v2412, Ubuntu 24.04 ARM running under OrbStack.
Using gumersindu's updated version from post 808.

Code:
cores  MeshTime(s)  RunTime(s)
-----------------------------------
12     88.9         34.52

Code:
cores  MeshTime(s)  RunTime(s)
-----------------------------------
12     97.32        37.75

Last edited by Simbelmynė; April 3, 2025 at 01:40.
#853
Senior Member
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 566
Rep Power: 21
Quote:
A short comparison between OF and Wildkatze regarding the time up to the first iteration: https://t.me/wildkatze_cfd/39

Update: some values (iteration time and drag) for the 110 million cell case: https://t.me/wildkatze_cfd/40

Last edited by JBeilke; April 10, 2025 at 08:37.
#854
Senior Member
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 566
Rep Power: 21
Quote:
It would be interesting to see how well the drag coefficients for double precision and mixed precision match up.
#855
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 381
Rep Power: 14
Quote:
Code:
With SPDP:
 8 Cd: 0.415865  0.398147  0.0177189  0
12 Cd: 0.414603  0.396965  0.0176377  0
16 Cd: 0.409216  0.39174   0.0174758  0
20 Cd: 0.406088  0.388682  0.0174062  0
24 Cd: 0.415135  0.397779  0.0173551  0

With DP:
 8 Cd: 0.41088   0.393231  0.0176492  0
12 Cd: 0.409777  0.392361  0.0174158  0
16 Cd: 0.403686  0.386233  0.0174535  0
20 Cd: 0.408543  0.391123  0.0174206  0
24 Cd: 0.413563  0.39623   0.0173339  0
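A quick back-of-the-envelope check in plain Python, with the total Cd values copied from the output above, separates the two effects being discussed: scatter across domain decompositions within one build, versus the DP/SPDP difference at a fixed core count:

```python
# Total Cd per core count, copied from the benchmark output above.
cores = [8, 12, 16, 20, 24]
cd_spdp = [0.415865, 0.414603, 0.409216, 0.406088, 0.415135]
cd_dp   = [0.410880, 0.409777, 0.403686, 0.408543, 0.413563]

def spread_pct(values):
    """Max-to-min spread relative to the mean, in percent."""
    return 100.0 * (max(values) - min(values)) / (sum(values) / len(values))

# Scatter across decompositions within one build:
print(f"DP spread:   {spread_pct(cd_dp):.2f}%")    # ~2.4%
print(f"SPDP spread: {spread_pct(cd_spdp):.2f}%")  # ~2.4%

# Largest DP-vs-SPDP difference at the same core count:
pair_pct = [100.0 * abs(s - d) / d for s, d in zip(cd_spdp, cd_dp)]
print(f"max DP vs SPDP: {max(pair_pct):.2f}%")     # ~1.4%
```

On these numbers, the decomposition-to-decomposition scatter (about 2.4%) is larger than the worst DP-vs-SPDP difference at matched core counts (about 1.4%), which is the point the following replies pick up on.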
#856
Senior Member
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 566
Rep Power: 21
Thank you very much, Will. I had to think quickly about which benchmark you were using -- DrivAer or motorBike. But a cw of 0.4 only fits the motorBike :-)

Maybe one of the moderators can move all posts about DrivAer to a separate thread.

Whether a deviation of one percent is a lot or a little probably depends on the situation. However, I was not really aware that the domain decomposition, or the number of domains, can have such an influence on the result. If I try to optimize a geometry and am already happy about a half-percent improvement, a deviation of one percent as a result of the domain decomposition is a medium-sized disaster.
#857
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 381
Rep Power: 14
Quote: