CFD Online Discussion Forums

https://www.cfd-online.com/Forums/openfoam-bugs/232081-possible-bug-unusual-slow-down-using-collated-i-o-format.html

shaoX November 29, 2020 04:39

A possible bug: Unusual slow-down using collated I/O format
 
Hello Foamers,

We know that the collated I/O option in OpenFOAM can significantly reduce the total number of files written in parallel runs. Recently, however, I found that this option can be much slower than uncollated I/O. I'm using v1912.

This happens both on an HPC cluster and on my workstation, and I've checked all the settings (e.g. OptimisationSwitches) according to https://www.openfoam.com/releases/op...2/parallel.php

Here's a simple test:
Test#1
$TUTORIAL/cavity, np=10, 40,000 cells; 0-0.2s, ∆t=1e-5, 20000 steps in total; use icoFoam.
ClockTime: collated: 110s; uncollated: 61s.

We can see there's a big difference in clockTime.

It looks like this is an input-output problem. But the following test designed to stress the I/O workload shows that I/O is not the problem.

/*********************************************************
Lid-driven cavity flow, 6M (200*200*200) cells, ∆t=1e-4, run from 0 to 0.01s, only 100 steps in total, np=128.
Test#2:
Write interval: every 20 steps
ClockTime (collated): 518s
ClockTime (uncollated): 511s
Test#3:
Write interval: every 2 steps
ClockTime (collated): 559s
ClockTime (uncollated): 526s
*********************************************************/

We can see from this test that the difference between collated and uncollated is negligible, even though Test#3 writes very frequently and produces ~60GB of data in total.

The difference only becomes significant when the total number of time steps is high and the simulation is not large.
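
To put the clock times reported above side by side, the relative slow-down of the collated run can be computed directly. The following is only a minimal sketch: the names relativeSlowdown, TimingCase and reportedCases are hypothetical helpers introduced for illustration, and the numbers are simply the ones quoted in Test#1 to Test#3.

```cpp
#include <cassert>
#include <cstdio>

// Relative slow-down of the collated run with respect to the uncollated run.
// A value of 0.10 means the collated run took 10% longer.
double relativeSlowdown(double collatedSeconds, double uncollatedSeconds)
{
    assert(uncollatedSeconds > 0.0);  // guard against division by zero
    return (collatedSeconds - uncollatedSeconds) / uncollatedSeconds;
}

// One reported measurement (hypothetical helper type, not part of OpenFOAM).
struct TimingCase
{
    const char* label;
    double collated;    // clock time with collated I/O, in seconds
    double uncollated;  // clock time with uncollated I/O, in seconds
};

// Clock times as quoted in the post.
const TimingCase reportedCases[] = {
    {"Test#1", 110.0, 61.0},
    {"Test#2", 518.0, 511.0},
    {"Test#3", 559.0, 526.0},
};

// Print the slow-down of every reported case as a percentage.
void printSlowdowns()
{
    for (const TimingCase& c : reportedCases)
    {
        std::printf("%s: %.1f%% slower with collated I/O\n",
                    c.label, 100.0 * relativeSlowdown(c.collated, c.uncollated));
    }
}
```

With the reported numbers this gives roughly 80% for Test#1, 1.4% for Test#2 and 6.3% for Test#3, which is consistent with the observation that the penalty follows the number of time steps rather than the amount of data written.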

I profiled the run to find where the computational time is spent and found that ++runTime, i.e. Foam::Time::operator++(), can be very time-consuming:
Test#4
cavity, np=10, 40,000 cells; 0-0.1s, ∆t=1e-5, 10000 steps
ClockTime=1552s; ++runTime cost: 24%
This percentage can go up to 45% if I run from 0 to 0.2s, i.e. 20000 steps.

I need to stress that this difference can be small if you are running a big simulation without too many steps. But in my situation, I need to put ++runTime inside a subcycling loop, in which the fast-scale equations are solved with a smaller ∆t while the other equations wait.

So the total number of ++runTime calls can be very high (millions of steps), and I find this slow-down to be a serious problem.

Can someone give me any suggestions on this, or point out whether I have simply made a stupid mistake? Thank you very much!

Alishaha2 July 2, 2021 10:47

Hi Xiao Shao,

I recently faced the same problem. I am using an explicit solver based on rhoCentralFoam and I have to use small time steps because of the high-Mach flow. As you mentioned, the ++runTime operation takes more and more time as the number of iterations grows when the collated format is used. I tracked down the issue and it comes from
Time.C, in the setTime() function. When the collated format is used, it accumulates the time step names in a pointer list and sorts them on every ++ operation. You can check masterUncollatedFileOperation.C and the setTime() function.
I could not yet figure out why such a list is required for the collated format, but the size of the pointer list certainly increases with the number of iterations.
Hope it helps.

Ali
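
The pattern described in this post (append one entry per time step, then sort the whole list on every increment) can be illustrated with a small toy model. This is only a sketch under stated assumptions: it is not OpenFOAM code, ToyTimeRegistry and countSortWork are hypothetical names, and a plain std::vector<double> stands in for the pointer list of time names. It counts comparisons rather than measuring wall time, so the result does not depend on the machine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a registry that remembers every time value it has seen.
// NOT OpenFOAM code; all names here are hypothetical.
struct ToyTimeRegistry
{
    std::vector<double> seenTimes;  // grows by one entry per time step
    std::size_t comparisons = 0;    // total comparisons spent in sorting

    // Mimics the described pattern: append the new time, then re-sort
    // the whole list, even though it is already in order.
    void recordAndSort(double t)
    {
        seenTimes.push_back(t);
        std::sort(seenTimes.begin(), seenTimes.end(),
                  [this](double a, double b)
                  {
                      ++comparisons;
                      return a < b;
                  });
    }
};

// Advance nSteps toy time steps and return the total sorting work.
std::size_t countSortWork(std::size_t nSteps, double deltaT)
{
    assert(deltaT > 0.0);  // times must be strictly increasing
    ToyTimeRegistry registry;
    for (std::size_t i = 1; i <= nSteps; ++i)
    {
        registry.recordAndSort(static_cast<double>(i) * deltaT);
    }
    return registry.comparisons;
}
```

Because each call has to touch the whole list, the total work grows faster than linearly with the number of steps: comparing countSortWork(1000, 1e-5) with countSortWork(2000, 1e-5) shows that doubling the step count far more than doubles the work. That matches the reports in this thread that the cost of ++runTime only becomes visible for runs with very many time steps.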

HVonSch May 6, 2022 16:39

Hello everyone,


I stumbled over this too and came to the same conclusion as shaoX.


I think this is a huge killer for LES or DNS on HPC clusters.
The collated file format is often required there because of file system limits and, depending on the number of time steps, the CPU time used can be higher by a factor of two to ten.


I disabled the time step list accumulation and have had no problems since. Does anyone know what unwanted consequences that could have?


/src/OpenFOAM/global/fileOperations/masterUncollatedFileOperation/masterUncollatedFileOperation.C


OpenFOAM-9:
line 2335: if (iter != times_.end()) -> if (false)

OpenFOAM-v2112:
line 2246: if (iter.found()) -> if (false)


Best,
Hendrik

olesen May 7, 2022 05:22

Nicely pinpointed diagnosis - I've created an issue: https://develop.openfoam.com/Develop.../-/issues/2461

Please add yourself there for follow-up. There may well be other aspects to discuss as well.

HVonSch May 7, 2022 12:45

Nice, thank you for the quick feedback and the bug report!
Best,
Hendrik


All times are GMT -4.