CFD Online Discussion Forums


atul1018 March 10, 2021 13:43

problem in running the case in linux cluster, which works fine on local machine
 
Dear Foamers


I am facing a strange problem that I cannot solve myself. I have installed openfoam-v1912 (with the default OpenMPI) on my local system. The same version, openfoam-v1912, is also installed on our university's Linux cluster, with Intel MPI. I need to solve my own case using DPMFoam, which would take months to simulate on my local system, so I decided to switch to the Linux cluster.



First, I tested the Goldschmidt tutorial case (using DPMFoam) on both my local Linux machine and the Linux cluster. The tutorial case works fine on both systems.
But when I ran my own setup (the original case) on the Linux cluster (with openfoam-v1912), the case got stuck in the very first time step, while evolving the kinematicCloud, and did not proceed any further. It ends with the following error:

Evolving kinematicCloud
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd: error: *** JOB 203821 ON i23r01c03s02 CANCELLED AT 2021-03-09T00:50:12 DUE TO TIME LIMIT ***

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 84 PID 2221 RUNNING AT i23r02c05s06
= KILLED BY SIGNAL: 15 (Terminated)
===================================================================================
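
For reference, the scheduler message above already states a reason for the cancellation. The following is a minimal Python sketch that pulls the job ID, node, timestamp, and stated reason out of that one `slurmstepd` line. The regular expression and the helper name `parse_cancel` are assumptions written for this single message shape; they are not part of SLURM or OpenFOAM, and other SLURM messages may be formatted differently.

```python
import re

# The slurmstepd line copied verbatim from the output above.
line = ("slurmstepd: error: *** JOB 203821 ON i23r01c03s02 CANCELLED AT "
        "2021-03-09T00:50:12 DUE TO TIME LIMIT ***")

# Hypothetical pattern for this one message shape only.
pattern = re.compile(
    r"JOB (?P<job>\d+) ON (?P<node>\S+) CANCELLED AT (?P<when>\S+) "
    r"DUE TO (?P<reason>.+?) \*\*\*"
)


def parse_cancel(text):
    """Return the fields of a 'CANCELLED ... DUE TO ...' line, or None."""
    match = pattern.search(text)
    return match.groupdict() if match else None


info = parse_cancel(line)
print(info)
```

Here the extracted reason is the wall-clock time limit of the job, and the SIGNAL 15 termination reported for rank 84 comes after that cancellation in the output. This only restates what the log says; it does not explain why the first time step took that long.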

Interestingly, the same case (my original case) runs well on my local system, which also has openfoam-v1912 installed, and does not give any such error.


I have no clue why this is happening. Is the different MPI on the local Linux machine and on the Linux cluster causing the problem? If MPI is the problem, then why did the tutorial case work well on both systems?



Looking forward to expert advice. I have been stuck on this issue for a month. Please help.


Best Regards
Atul

atul1018 March 11, 2021 10:49

An update: when the case is run in serial mode (interactively), also on the Linux cluster, the simulation starts and runs well for a few time steps (4-5 time steps), but then again fails with the time limit error. When run in serial, the output printed on screen looks like this (I am showing the output of the last time step reached before the error was reported):


Courant Number mean: 0.0218179 max: 0.497288
deltaT = 8.58341e-06
Time = 3.75918e-05

Evolving kinematicCloud

Solving 3-D cloud kinematicCloud
Cloud: kinematicCloud
Current number of parcels = 0
Current mass in system = 0
Linear momentum = (0 0 0)
|Linear momentum| = 0
Linear kinetic energy = 0
Average particle per parcel = 0
Injector model1:
- parcels added = 0
- mass introduced = 0
Parcel fate: system (number, mass)
- escape = 0, 0
Parcel fate: patch (number, mass) inlet
- escape = 0, 0
- stick = 0, 0
Parcel fate: patch (number, mass) outlet
- escape = 0, 0
- stick = 0, 0
Parcel fate: patch (number, mass) upperWall
- escape = 0, 0
- stick = 0, 0
Parcel fate: patch (number, mass) lowerWall
- escape = 0, 0
- stick = 0, 0
Parcel fate: patch (number, mass) sides_half0
- escape = 0, 0
- stick = 0, 0
Parcel fate: patch (number, mass) sides_half1
- escape = 0, 0
- stick = 0, 0
Rotational kinetic energy = 0

PIMPLE: iteration 1
smoothSolver: Solving for U.airx, Initial residual = 0.000339738, Final residual = 5.88311e-07, No Iterations 1
smoothSolver: Solving for U.airy, Initial residual = 0.000349268, Final residual = 8.97215e-07, No Iterations 1
smoothSolver: Solving for U.airz, Initial residual = 0.830691, Final residual = 7.17131e-08, No Iterations 2
GAMG: Solving for p, Initial residual = 0.13528, Final residual = 0.00945423, No Iterations 1
time step continuity errors : sum local = 7.71415e-08, global = -3.33289e-10, cumulative = -7.72008e-07
salloc: Job 158819 has exceeded its time limit and its allocation has been revoked.
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
Terminated
ge72xes2@i22r07c05s09:~/Robert_case/multi_phase_interactive_session> srun: error: i22r07c05s09: task 0: Killed


If the simulation starts, the case setup should be correct. So why does it get killed on the cluster, immediately in parallel and after a few time steps in serial? Interestingly, the case runs well on my local system without any issue, with the same version of OpenFOAM as on the cluster (openfoam-v1912). Isn't that strange?
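
As a rough sanity check on the numbers in the serial log above, here is a minimal Python sketch of the arithmetic. It uses only the `deltaT` and `Time` values printed in the last time step and assumes, as a simplification, that `deltaT` stays roughly constant (the log prints it per step, so it may change). The helper `wall_hours` and its `seconds_per_step` argument are hypothetical; no per-step wall-clock time was measured in this thread.

```python
# Values copied from the last time step printed in the serial log.
delta_t = 8.58341e-06   # deltaT, in seconds of simulated time
sim_time = 3.75918e-05  # Time, in seconds of simulated time

# Assumption: deltaT is treated as constant for this estimate.
steps_so_far = sim_time / delta_t
steps_per_simulated_second = 1.0 / delta_t


def wall_hours(n_steps, seconds_per_step):
    """Wall-clock hours for n_steps at a given (hypothetical) cost per step."""
    return n_steps * seconds_per_step / 3600.0


print(f"steps completed so far: about {steps_so_far:.1f}")
print(f"steps per simulated second: about {steps_per_simulated_second:.0f}")
```

The first number lands between 4 and 5, which is consistent with the 4-5 time steps reported before the allocation was revoked. Once a real wall-clock time per step is read off the cluster log, passing it to `wall_hours` gives an estimate of the time limit the job would need to request.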



Please help if anyone has encountered such a problem.


Best Regards
Atul

