March 22, 2022, 13:00
Need help with subpar performance
#1
Member
Ron Burnett
Join Date: Feb 2013
Posts: 42
Rep Power: 13
A description of the recently purchased machine: HP DL560 G8, 4x E5-4610 v2, 16x 8GB 2Rx4 PC3L-12800R-11-12 (I discovered that one module is actually 16GB but otherwise identical), P420i RAID controller, 750 W power supply. To that I added a 240 GB SSD, Ubuntu 20.04, and OF8 (from OpenFOAM.org).

All BIOS settings have been biased toward "performance", cooling adjusted to "enhanced" (loaded CPU temps < 52 C), interleaving enabled, hyperthreading off, and all DIMMs in the correct sockets (on this machine, the white ones).

Using bench_template_v02, it's obvious something is not right when compared with other, older four-socket machines such as Kailee (post #416), wildemam (#339), Morland (#158), and kstuart (#260). The command
Code:
watch -n1 'grep "cpu MHz" /proc/cpuinfo'
which allows monitoring all cores in real time, shows a consistent 2.49 GHz under load. Can anyone shed some light on the problem?
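For context on what "not right" means here, a ballpark figure can help. This is only a back-of-the-envelope sketch with my own assumed numbers (not from the thread): if all 4 memory channels per socket are populated on each of the four E5-4610 v2 CPUs, PC3-12800 memory gives 12.8 GB/s per channel:

```shell
# Rough theoretical peak memory bandwidth for a 4-socket E5-4610 v2 box,
# ASSUMING 4 populated channels per socket and PC3-12800 (12.8 GB/s/channel).
# Real sustained bandwidth (e.g. from STREAM) will be well below this.
awk 'BEGIN { sockets = 4; channels = 4; per_channel_gbs = 12.8
             printf "Theoretical peak: %.1f GB/s\n", sockets * channels * per_channel_gbs }'
```

If a synthetic bandwidth benchmark lands far below the usual fraction of that figure on one socket, that socket's memory configuration is the place to look.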
March 22, 2022, 14:21

#2
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
At first glance, nothing obvious stands out.
I'm not a huge fan of mismatched memory; it can lead to weird performance regressions. So if you could get your hands on another identical 8GB stick, it would rule out one potential issue.

On the software level, another interesting test might be probing each CPU individually, i.e. running the simulation with 8 threads pinned to the hardware threads of each CPU: 4 tests in total. The same can be done with other synthetic benchmarks, like STREAM, or the fpmem benchmark posted here.

Have you tried clearing caches first, and then directly running the benchmark on 32 threads?

On a completely unrelated note: I wish more posts here were as thought out as yours.
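The per-socket probing suggested above can be scripted. This is a dry-run sketch that only prints the four commands rather than running them, assuming cores are numbered contiguously per socket (0-7 on socket 0, 8-15 on socket 1, and so on); "solverCommand" is a placeholder for the actual OpenFOAM solver invocation:

```shell
# Print one pinned 8-rank mpirun invocation per socket (dry run).
# Assumes contiguous core numbering per socket; verify with: numactl --hardware
for s in 0 1 2 3; do
  first=$((s * 8))          # first core of socket s
  last=$((first + 7))       # last core of socket s
  echo "mpirun --cpu-set ${first}-${last} --bind-to core -np 8 solverCommand"
done
```

Remove the `echo` (and substitute the real solver command) to actually run each test; a socket whose run is clearly slower than the others points at its memory or DIMM population.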
March 22, 2022, 20:23

#3
Member
Ron Burnett
Join Date: Feb 2013
Posts: 42
Rep Power: 13
Alex, the memory mismatch was disappointing, especially since I tried to impress upon the tech person the need for uniformity. It's easy enough to correct, and maybe a software test will point the way. My knowledge of MPI commands is limited; what does it take to check each CPU by itself?

Clearing caches: yes, I've tried that, with no change. I appreciate the compliment, and your help.
March 22, 2022, 22:02
The test indicates a problem with your memory speed

#4
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 316
Rep Power: 12
Your technician may have used the 16GB module because one of the 8GB modules he was planning to use turned out to be bad. If one was bad, another could be too.

You were specific about everything but the BIOS settings. Look at all memory-related settings and use "auto" where possible. There is rank interleaving and bank interleaving; with unequal-size DIMMs, you risk hurting performance by forcing bank interleaving. Also try forcing a lower speed for the memory: a lot of the higher-speed units started life at a lower clock but were reprogrammed by resellers for the higher speed. At 1333 MHz, your machine should still benchmark just above 70 seconds, I would think.
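The rated-versus-actual speed question above can be checked without rebooting into the BIOS. The sketch below uses made-up sample values just to show which fields to compare; on the real machine the command is `sudo dmidecode -t memory`, and the interesting comparison is each DIMM's rated "Speed" against its "Configured Clock Speed":

```shell
# Illustrative only: sample dmidecode-style output for one DIMM (values made up).
# On the real machine, run:  sudo dmidecode -t memory
# A "Configured Clock Speed" below the rated "Speed" means the DIMM is
# running downclocked, e.g. because of a mismatch or a forced BIOS setting.
cat <<'EOF' | grep -E "Size|Speed"
Size: 8192 MB
Speed: 1600 MHz
Configured Clock Speed: 1333 MHz
EOF
```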
March 23, 2022, 02:42

#5
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
According to numactl output, your OS mapped threads to cores in order, i.e. cores on the first socket/NUMA node are numbered 0-7, and so on. There are a few ways to bind threads with MPI; one of them is "--cpu-set":
Code:
mpirun --cpu-set 24-31 --bind-to core -np 8 ...
March 29, 2022, 12:46

#6
Member
Ron Burnett
Join Date: Feb 2013
Posts: 42
Rep Power: 13
The mismatched module was indeed the problem; running a benchmark on each CPU indicated as much. It was replaced with one that matches the other 15 and wow, what a difference. New results are posted in the benchmark thread.

At some point in the future I may buy a new 16GB module and rerun everything.