#841
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,460
Rep Power: 50
A slight over-simplification of why more cache = more better:
When an execution unit needs data to perform a calculation, the cache hierarchy is searched first for that data, in order L1 -> L2 -> L3 -> RAM, with latency increasing at each step down the chain. Larger caches mean a higher probability that the data already resides in one of the caches and thus doesn't have to come from RAM. This decreases latency, and also frees up precious memory bandwidth for other operations and cores.
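To make the bandwidth argument concrete, here is a back-of-the-envelope sketch (all numbers are illustrative assumptions, not measurements; the ~1 kB-per-cell figure is only a rough rule of thumb for OpenFOAM-style solvers):

```python
# Back-of-the-envelope sketch: how much of one core's mesh partition
# fits in its share of L3 cache. All figures are illustrative assumptions.
BYTES_PER_CELL = 1000  # assumed rough solver memory footprint per cell

def cache_fit_fraction(total_cells, n_cores, l3_bytes_per_core):
    """Fraction of one core's working set that fits in its share of L3."""
    working_set = (total_cells / n_cores) * BYTES_PER_CELL
    return min(1.0, l3_bytes_per_core / working_set)

# Example: an 8M-cell case on 64 cores with 256 MB total L3 (4 MB/core):
# only a few percent of the working set is cache-resident, so most solver
# sweeps still stream from RAM -- which is why memory bandwidth dominates.
frac = cache_fit_fraction(8_000_000, 64, 256 * 2**20 / 64)
```

Under these assumptions, doubling L3 roughly doubles that fraction, which is consistent with larger-cache SKUs pulling ahead mainly once the per-core partition starts to fit.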
#842
Senior Member
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 602
Rep Power: 22
Quote:
I don't know where your 30% comes from. I went from 128 MB to 256 MB cache with Epyc 2, and there is maybe about a 3% difference for the same core count, but not more.
#843
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 388
Rep Power: 15
Could you give some information on your problem, i.e. number of cells, shape of the domain, and memory configuration? That would be of interest, because flotus is right that we see a clear benefit from large caches, as they reduce the demand on the memory channels.
#844
Senior Member
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 602
Rep Power: 22
Quote:
There is no problem at all. Everything works as expected. I would just like to know where the 30% is mentioned. In any case, I can't remember reading such a number.
#845
Senior Member
andy
Join Date: May 2009
Posts: 358
Rep Power: 19
Quote:
#846
Senior Member
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 602
Rep Power: 22
#847
Member
Quote:
OpenFOAM benchmarks on various hardware

The 7532 finishes the benchmark run in as little as 16 s (my personal test is 18 s), whereas the 7542 (128 MB L3 but slightly higher frequency?) is around 22 s or even lower. I do remember that there are other similar models (similar setup but different L3); if you are really interested you could skim this thread.
#848
Member
This is also true for desktop CPUs (at least for Zen and Zen 2): the Ryzen 2200G is much slower than the 1500X.
Quote:
#849
Super Moderator
Philip Cardiff
Join Date: Mar 2009
Location: Dublin, Ireland
Posts: 1,113
Rep Power: 35
FYI:
The 1st OpenFOAM HPC Challenge (OHC-1) at the upcoming OpenFOAM Workshop may be of interest to people here. I expect they will share their results publicly, which will be interesting.
#850
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 388
Rep Power: 15
Quote:
#851
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 388
Rep Power: 15
I ran the benchmark with the normal double-precision (DP) compile of OF v2212 and with the mixed-precision (SPDP) compilation option (set in etc/bashrc). The system is a dual E5-2696 v2 server with 128 GB of DDR3-1866, eight memory channels in total.
The flow calculation is much faster, up to 36%, for all core counts run. However, mesh generation is slower by about 20%.

Double precision (DP):
Code:
Meshing times (cores  seconds):
 1   1553.65
 2   1011.88
 4    574.82
 8    344.48
12    260.31
16    232.71
20    197.04
24    183.96
Flow calculation (cores  seconds):
 1    981.95
 2    511.16
 4    233.53
 8    130.49
12    103.84
16     92.15
20     87.93
24     87.1

Mixed precision (SPDP):
Code:
Meshing times (cores  seconds):
 1   1403.45
 2    998.06
 4    560.6
 8    367.34
12    305.32
16    263.96
20    238.15
24    225.01
Flow calculation (cores  seconds):
 1    640.32
 2    378.16
 4    173.62
 8     97.13
12     71.99
16     61.06
20     56.09
24     55.23
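As a sanity check on the "up to 36%" figure, the speedups can be recomputed from the flow-calculation times in the two runs above (values copied verbatim from this post):

```python
# Flow-calculation wall times (s) from the post above, keyed by core count.
dp   = {1: 981.95, 2: 511.16, 4: 233.53, 8: 130.49, 12: 103.84,
        16: 92.15, 20: 87.93, 24: 87.1}
spdp = {1: 640.32, 2: 378.16, 4: 173.62, 8: 97.13, 12: 71.99,
        16: 61.06, 20: 56.09, 24: 55.23}

# Percentage reduction in run time from switching DP -> SPDP.
gain = {n: 100 * (dp[n] - spdp[n]) / dp[n] for n in dp}
# The reduction grows with core count, reaching about 36.6% at 24 cores.
```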
#852
Senior Member
Join Date: May 2012
Posts: 564
Rep Power: 17
Mac Studio M4 Max, 16-core CPU / 40-core GPU, 64 GB RAM
macOS Sequoia 15.4, OpenFOAM v2412, Ubuntu 24.04 ARM running under OrbStack. Using gumersindu's updated version from post 808.
Code:
cores  MeshTime(s)  RunTime(s)
------------------------------
12     88.9         34.52
Code:
cores  MeshTime(s)  RunTime(s)
------------------------------
12     97.32        37.75

Last edited by Simbelmynė; April 3, 2025 at 02:40.
#853
Senior Member
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 602
Rep Power: 22
Quote:
A short comparison between OF and Wildkatze regarding the time up to the first iteration: https://t.me/wildkatze_cfd/39

Update: some values (iteration time and drag) for the 110-million-cell case: https://t.me/wildkatze_cfd/40

Last edited by JBeilke; April 10, 2025 at 09:37.
#854
Senior Member
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 602
Rep Power: 22
Quote:
It would be interesting to see how well the drag coefficients for double precision and mixed precision match up.
#855
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 388
Rep Power: 15
Quote:
Code:
With SPDP:
 8  Cd: 0.415865  0.398147  0.0177189  0
12  Cd: 0.414603  0.396965  0.0176377  0
16  Cd: 0.409216  0.39174   0.0174758  0
20  Cd: 0.406088  0.388682  0.0174062  0
24  Cd: 0.415135  0.397779  0.0173551  0

With DP:
 8  Cd: 0.41088   0.393231  0.0176492  0
12  Cd: 0.409777  0.392361  0.0174158  0
16  Cd: 0.403686  0.386233  0.0174535  0
20  Cd: 0.408543  0.391123  0.0174206  0
24  Cd: 0.413563  0.39623   0.0173339  0
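To put a number on how well the two builds agree, the relative deviation of the total Cd (first column) between SPDP and DP can be computed per core count from the values above:

```python
# Total Cd (first column) from the post above, keyed by core count.
cd_spdp = {8: 0.415865, 12: 0.414603, 16: 0.409216, 20: 0.406088, 24: 0.415135}
cd_dp   = {8: 0.41088,  12: 0.409777, 16: 0.403686, 20: 0.408543, 24: 0.413563}

# Relative SPDP-vs-DP deviation in percent, per core count.
dev = {n: 100 * abs(cd_spdp[n] - cd_dp[n]) / cd_dp[n] for n in cd_dp}
# The largest deviation is at 16 cores, roughly 1.4%; at 20 and 24 cores
# the two builds agree to better than 1%.
```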
#856
Senior Member
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 602
Rep Power: 22
Thank you very much, Will. I had to think for a moment about which benchmark you were using -- DrivAer or motorBike. But a Cd of 0.4 only fits the motorBike :-)
Maybe one of the moderators can move all posts about DrivAer to a separate thread.

Whether a deviation of one percent is a lot or a little probably depends on the situation. However, I was not really aware that the domain decomposition, or the number of domains, could have such an influence on the result. If I am trying to optimize a geometry and am already happy about a half-percent improvement, a deviation of one percent as a result of domain decomposition is a medium-sized disaster.
#857
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 388
Rep Power: 15
Quote:
#858
New Member
Aaron K
Join Date: Mar 2024
Posts: 4
Rep Power: 3
Built a dual Epyc 7532 workstation, 16 × 3200 DDR4, Ubuntu 22.04, running OpenFOAM v2412.
Code:
# cores  Wall time (s)
----------------------
 1       831.22
 2       401.86
 4       176.74
 8        84.68
16        43.11
24        30.13
32        23.49
64        17.61
#859
New Member
Aaron K
Join Date: Mar 2024
Posts: 4
Rep Power: 3
Further to the last post, I set NPS4 and the ACPI SLIT remote relative distance to 'far'.
Dual Epyc 7532, 16 × 3200 DDR4, Ubuntu 22.04, OpenFOAM v2412
Code:
# cores  Wall time (s)
----------------------
 1       799.48
 2       457.75
 4       169.67
 8        82.72
16        42.08
24        29.18
32        22.45
64        16.15
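Comparing these NPS4 times against the default-settings run from the previous post (values copied from the two posts) gives the per-core-count change:

```python
# Wall times (s) from the two posts above: BIOS defaults vs NPS4 + SLIT 'far'.
default = {1: 831.22, 2: 401.86, 4: 176.74, 8: 84.68,
           16: 43.11, 24: 30.13, 32: 23.49, 64: 17.61}
nps4    = {1: 799.48, 2: 457.75, 4: 169.67, 8: 82.72,
           16: 42.08, 24: 29.18, 32: 22.45, 64: 16.15}

# Percentage change in wall time (positive = NPS4 run is faster).
change = {n: 100 * (default[n] - nps4[n]) / default[n] for n in default}
# NPS4 is about 8.3% faster at 64 cores, though slower at 2 cores.
```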
#860
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 388
Rep Power: 15
Quote:
Last edited by wkernkamp; May 2, 2025 at 17:27.