CFD Online Discussion Forums


cbcoutinho November 30, 2017 20:13

Unlimited caching on Linux results in long simulations hitting swap
 
Hello,

I am currently running some simulations as part of a parameter sweep on a modest Linux workstation (20 cores, 128 GB RAM, OpenFOAM 4.x on openSUSE Leap 42.2).

I should probably also note that I'm running these simulations through PyFoam, in case that makes a difference.

I've noticed that as one of my simulations progresses, cached memory keeps going up. I've checked the cached memory usage on my machine using
Code:

fincore
on the working directory, and it looks like OpenFOAM is caching almost everything, even old timestep data (see the snipped output below; a full copy is attached).

I used to set the 'purgeWrite' option to 0 and just hold on to everything, but even when I set it to something like 5, the cache still keeps growing. At the time of writing, I have about 4 GB of cache, most of which isn't needed. This isn't ideal: there are cases when I want all of the resulting output data, but only on disk.

When I didn't use the purgeWrite option, a simulation using anywhere between 5 and 10 GB ended up using 60-80 GB of cached memory before I cleared it using the following command:

Code:

sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
Now that I'm using purgeWrite, it seems to stay somewhat under control, but I still think this is undesired behavior, because I can't save long-running time-dependent simulations without having to manually clear the cache. As it stands now, I need to keep coming back to my machine every day or two just to clear the cache, or risk ending up in swap.
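
As an alternative to dropping the whole system cache as root, it may be possible to ask the kernel to evict only the files of a finished time directory. The following is a minimal sketch, not something taken from OpenFOAM or PyFoam: it assumes Linux and Python 3, uses the standard `os.posix_fadvise` call with `POSIX_FADV_DONTNEED`, and the helper names `drop_file_cache` and `drop_tree_cache` are hypothetical. The advice is only a hint, so the kernel is free to keep some pages.

```python
import os
import stat


def drop_file_cache(path):
    """Hint to the kernel that cached pages of one file are no longer needed."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Dirty pages cannot be evicted, so flush them to disk first.
        os.fsync(fd)
        # Offset 0 with length 0 means "the whole file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)


def drop_tree_cache(root):
    """Apply drop_file_cache to every regular file below root.

    Returns the number of files processed. Symlinks and special
    files are skipped.
    """
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if stat.S_ISREG(os.lstat(path).st_mode):
                drop_file_cache(path)
                count += 1
    return count
```

Calling `drop_tree_cache` on an already written time directory (the path is up to the case layout) needs no root privileges, only read access to the files, and leaves the data on disk untouched.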

Has anyone else dealt with this before? And if so, what is the proper way of dealing with this?

Code:

filename                                                                                      size        total_pages    min_cached page      cached_pages        cached_size        cached_perc
--------                                                                                      ----        -----------    ---------------      ------------        -----------        -----------
./0.org/C                                                                                    1,773                  1                  0                  1              4,096            100.00
./0.org/p                                                                                    1,380                  1                  0                  1              4,096            100.00
./constant/polyMesh/.gitignore                                                                  14                  1                  0                  1              4,096            100.00
./constant/polyMesh/points                                                              153,447,808            37,463                  0            37,463        153,448,448            100.00
./constant/polyMesh/neighbour                                                            68,336,322            16,684                  0            16,684        68,337,664            100.00
./constant/polyMesh/faces                                                              306,676,372            74,873                  0            74,873        306,679,808            100.00
./constant/polyMesh/owner                                                                69,687,860            17,014                  0            17,014        69,689,344            100.00
./constant/polyMesh/boundary                                                                  2,641                  1                  0                  1              4,096            100.00
.
.
.
.
./PyFoamHistory                                                                                235                  1                  0                  1              4,096            100.00
./PyFoamServer.info                                                                              25                  1                  0                  1              4,096            100.00
./PyFoamRunner.simpleSaltPeriodicFoam.analyzed/pickledStartData                                476                  1                  0                  1              4,096            100.00
./PyFoamRunner.simpleSaltPeriodicFoam.analyzed/pickledPlots                                210,270                52                  0                52            212,992            100.00
./PyFoamRunner.simpleSaltPeriodicFoam.analyzed/pickledUnfinishedData                          1,561                  1                  0                  1              4,096            100.00
./PyFoamState.LogDir                                                                            47                  1                  0                  1              4,096            100.00
./PyFoamRunner.simpleSaltPeriodicFoam.logfile                                            1,044,300                255                  0                255          1,044,480            100.00
./PyFoamState.StartedAt                                                                          25                  1                  0                  1              4,096            100.00
./PyFoamState.TheState                                                                            8                  1                  0                  1              4,096            100.00
./PyFoamState.LastOutputSeen                                                                    25                  1                  0                  1              4,096            100.00
./PyFoamState.CurrentTime                                                                        6                  1                  0                  1              4,096            100.00
./postProcessing/lowerWallRegion/0/surfaceRegion.dat                                        18,351                  5                  0                  5            20,480            100.00
./postProcessing/upperWallRegion/0/surfaceRegion.dat                                        18,351                  5                  0                  5            20,480            100.00
./postProcessing/profileRegion/0/surfaceRegion.dat                                          18,349                  5                  0                  5            20,480            100.00
./PyFoamRunner.simpleSaltPeriodicFoam.logfile.analyzed/pickledPlots                        138,980                34                  0                34            139,264            100.00
---
total cached size: 4,438,556,672

Attachment 59901
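
To see which files account for most of the cached size in a listing like the one above, the output can be parsed and sorted. This is a rough sketch that assumes the seven-column layout shown in the post (file name first, `cached_size` second to last, numbers with thousands separators); the function names `parse_fincore` and `largest_cached` are made up for illustration.

```python
def parse_fincore(text):
    """Parse fincore-style output into (filename, size, cached_size) tuples.

    Header, separator, elision and summary lines are skipped because
    they do not have six trailing columns with integers where expected.
    """
    rows = []
    for line in text.splitlines():
        # Split from the right so file names containing spaces survive.
        parts = line.rsplit(None, 6)
        if len(parts) != 7:
            continue
        try:
            size = int(parts[1].replace(",", ""))
            cached = int(parts[5].replace(",", ""))
        except ValueError:
            continue
        rows.append((parts[0].strip(), size, cached))
    return rows


def largest_cached(text, n=5):
    """Return the n rows with the largest cached size, biggest first."""
    rows = parse_fincore(text)
    return sorted(rows, key=lambda row: row[2], reverse=True)[:n]
```

Feeding the full attached listing to `largest_cached` should show quickly whether the mesh files or the old time directories dominate the cache.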

wyldckat December 30, 2017 19:23

Quick answer: https://www.cfd-online.com/Forums/op...too-large.html

----


Edit: Sorry, I read your post too fast and mixed up information while I was moving several threads about PyFoam.

I've had this issue in the past, namely that the cache would not flush automatically and would end up spilling into the Linux swap. The problem is that I never figured out what exactly broke to cause this.
The fix that has always worked was to simply do a cold reboot of the machine in question (shut it down normally and turn it back on after a minute or so). After that, I was never able to reproduce the error in order to isolate the problem.

My guess is that there was some bad memory access by some library or application, which caused the cache management system to go nuts and do this.


On the other hand, as long as it does not reach swap, caching files should not hurt anything, since the cache only holds read and/or written files for possible future re-reading. Technically, the files should still be safely written to disk.
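
One way to tell a harmless large cache apart from actual swapping is to watch both numbers side by side. The sketch below assumes the usual Linux `/proc/meminfo` text format with `Cached`, `SwapTotal` and `SwapFree` fields; the function names `parse_meminfo` and `cache_and_swap` are hypothetical.

```python
def parse_meminfo(text):
    """Turn /proc/meminfo-style text into a dict of integer values.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def cache_and_swap(text):
    """Report the page cache size and the amount of swap in use, in kB."""
    info = parse_meminfo(text)
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return {"cached_kb": info.get("Cached", 0), "swap_used_kb": swap_used}
```

On a Linux machine the input would be the contents of `/proc/meminfo`, read with a plain `open("/proc/meminfo").read()`. A growing `cached_kb` with `swap_used_kb` staying near zero would match the harmless case described above.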

