March 2, 2022, 05:59 |
Benchmark fpmem
#1 |
Member
Erik Andresen
Join Date: Feb 2016
Location: Denmark
Posts: 35
Rep Power: 10 |
The STREAM benchmark tests memory bandwidth, even though it performs floating point operations. In STREAM, the number of floating point operations does not exceed the number of loads. This is likely also the case for many CFD programs, but not for higher-order solvers based on Cartesian grids, where the ratio of floating point operations to loads can be much larger.
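To make the ratio concrete, here is a minimal sketch comparing a STREAM-style triad with a compute-heavy loop. The function names (`triad`, `poly_sum`, `flops_per_load`) and the polynomial-style kernel are hypothetical illustrations, not code from STREAM or from fpmem.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// STREAM-style triad: a[i] = b[i] + s * c[i].
// Per element: 2 loads (b[i], c[i]), 1 store (a[i]), 2 flops (mul, add),
// so the flops-per-load ratio is 1.
void triad(std::vector<double>& a, const std::vector<double>& b,
           const std::vector<double>& c, double s) {
    assert(a.size() == b.size() && b.size() == c.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        a[i] = b[i] + s * c[i];
}

// Hypothetical compute-heavy kernel: each element is loaded once and then
// reused from a register for `steps` multiply-add pairs. That is about
// 2 * steps flops per load (ignoring the final accumulation into `total`).
double poly_sum(const std::vector<double>& x, int steps, double c) {
    double total = 0.0;
    for (double v : x) {
        double acc = v;
        for (int k = 0; k < steps; ++k)
            acc = acc * c + v;  // 1 multiply + 1 add
        total += acc;
    }
    return total;
}

// Ratio of floating point operations to loads for a kernel.
constexpr double flops_per_load(double flops, double loads) {
    return flops / loads;
}
```

With `steps = 16`, the second kernel does roughly 32 flops per load, while the triad stays at `flops_per_load(2, 2) == 1` regardless of array size.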
Optimizing in HPC is often about minimizing reads from memory. The work can be split into smaller chunks, where as much work as possible is done on each chunk before the next chunk of memory is processed. The relevant size of such chunks has to be determined.
I have made a benchmark, fpmem, that reports the floating point performance for various combinations of floating point operations per load and size of the array processed. The benchmark doesn't do any real work, but it can be compiled, linked and run in about 5 minutes. The instructions for compiling, linking and using the benchmark are given in the first few lines of the source file. It requires a recent C++ compiler (-std=c++17) and MPI. It uses AVX2 when compiled with -D_USE_INTRINSIC; see the instructions.
I hope some of you will run the benchmark and post your results. The benchmark is made to run on one CPU; if used on a large cluster, the performance will just increase linearly with the number of CPUs. I don't have access to EPYC Milan or newer Xeons on socket LGA4189, so results from these would be very interesting to me. I have attached the benchmark (fpmem.c) and the results for my newly built system with an Intel i5-12600.
Edit: I have uploaded a new version that corrects an error that affected the reported performance values by up to about 10%.
Last edited by ErikAdr; March 3, 2022 at 05:55.
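For readers who want the idea before opening the attachment, the following is a rough sketch of how such a measurement could be structured. It is not the fpmem source: it is single-threaded, uses no MPI or AVX2 intrinsics, and the names `kernel_pass` and `measure_gflops` as well as the constants are assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// One pass over the array: load each element once, apply `fma_per_load`
// multiply-add pairs to it in a register, and store the result back.
// The constants keep values bounded (the iteration converges towards 1.0).
void kernel_pass(std::vector<double>& data, int fma_per_load) {
    const double a = 0.5, b = 0.5;
    for (double& x : data) {
        double v = x;
        for (int k = 0; k < fma_per_load; ++k)
            v = v * a + b;  // 2 flops per iteration
        x = v;
    }
}

// Time `passes` passes over an array of `n` doubles and return GFLOP/s.
// Counted work: 2 flops per multiply-add pair, per element, per pass.
double measure_gflops(std::size_t n, int fma_per_load, int passes) {
    assert(n > 0 && fma_per_load > 0 && passes > 0);
    std::vector<double> data(n, 3.0);
    const auto t0 = std::chrono::steady_clock::now();
    for (int p = 0; p < passes; ++p)
        kernel_pass(data, fma_per_load);
    const auto t1 = std::chrono::steady_clock::now();
    volatile double sink = data[n / 2];  // keep the result observable
    (void)sink;
    const double seconds = std::chrono::duration<double>(t1 - t0).count();
    const double flops = 2.0 * fma_per_load * static_cast<double>(n) * passes;
    return flops / seconds / 1e9;
}
```

Calling `measure_gflops` in a double loop over array sizes (from a few KiB up to well beyond the last-level cache) and over `fma_per_load` values gives a table similar in spirit to what the post describes. Absolute numbers from this sketch will be far lower than from a tuned benchmark, since each element's multiply-add chain is serially dependent and vectorization is left to the compiler.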