
Simulation crashing after running for 26 hours on an HPC cluster


Old   August 22, 2020, 14:16
Default Simulation crashing after running for 26 hours on an HPC cluster
  #1
New Member
 
 
Shivam
Join Date: Mar 2019
Location: IN
Posts: 9
Rep Power: 7
shivamswarnakar72 is on a distinguished road
Hi all Foamers,

I am using buoyantBoussinesqPimpleFoam with a Coriolis force added in the UEqn.H file. The simulation runs fine in parallel (using 20 cores) on my lab system, but when I run the same simulation on an HPC cluster (using any number of cores, say 80 for this case), the following error occurs after about 26 hours of simulation run time. The log file written during the run shows no error. The OpenFOAM version is 4.1 on both the lab system and the HPC cluster. I am unable to figure out why this happens, where it comes from, or how to fix it.
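For reference, the Coriolis force enters the momentum equation roughly as in the simplified sketch below (this is not my exact code; Omega stands for the constant rotation vector, defined elsewhere as a dimensionedVector):

PHP Code:
    // UEqn.H (simplified sketch): buoyantBoussinesqPimpleFoam momentum equation
    // with an explicit Coriolis acceleration 2*Omega ^ U added on the left-hand side
    fvVectorMatrix UEqn
    (
        fvm::ddt(U) + fvm::div(phi, U)
      + turbulence->divDevReff(U)
      + (2.0*Omega ^ U)               // Coriolis term; Omega is a dimensionedVector
     ==
        fvOptions(U)
    );

    UEqn.relax();
    fvOptions.constrain(UEqn);

    if (pimple.momentumPredictor())
    {
        solve
        (
            UEqn
         ==
            fvc::reconstruct
            (
                (
                  - ghf*fvc::snGrad(rhok)
                  - fvc::snGrad(p_rgh)
                )*mesh.magSf()
            )
        );
        fvOptions.correct(U);
    }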

This error file is generated by the HPC PBS job system.


PHP Code:
[47] #0  Foam::error::printStack(Foam::Ostream&) addr2line failed
[47] #1  Foam::sigSegv::sigHandler(int) addr2line failed
[47] #2  ? addr2line failed
[47] #3  MPIDI_Cray_shared_mem_coll_bcast addr2line failed
[47] #4  MPIR_CRAY_Allreduce addr2line failed
[47] #5  MPIR_Allreduce_impl addr2line failed
[47] #6  MPI_Allreduce addr2line failed
[47] #7  Foam::reduce(double&, Foam::sumOp<double> const&, int, int) addr2line failed
[47] #8  Foam::PCG::solve(Foam::Field<double>&, Foam::Field<double> const&, unsigned char) const addr2line failed
[47] #9  Foam::fvMatrix<double>::solveSegregated(Foam::dictionary const&) addr2line failed
[47] #10  Foam::fvMatrix<double>::solve(Foam::dictionary const&)
[47] #11  ?
[47] #12  __libc_start_main addr2line failed
_pmiu_daemon(SIGCHLD): [NID 00131] [c0-0c2s0n3] [Thu Aug 20 13:53:23 2020] PE RANK 47 exit signal Segmentation fault
[NID 00131] 2020-08-20 13:53:23 Apid 1186137: initiated application termination

(Only rank 47's trace is shown; the other ranks print the same frames, heavily interleaved in the original file.)

This is the log file generated by OpenFOAM; I have posted only the last few iterations because the whole file is about 900 MB.


PHP Code:
DICPCG:  Solving for p_rgh, Initial residual = 0.00922881, Final residual = 8.92923e-05, No Iterations 92
DICPCG:  Solving for p_rgh, Initial residual = 0.0001718, Final residual = 1.63557e-06, No Iterations 130
time step continuity errors : sum local = 1.68162e-08, global = 1.21262e-19, cumulative = -1.10416e-09
DICPCG:  Solving for p_rgh, Initial residual = 0.000542859, Final residual = 5.28707e-06, No Iterations 127
DICPCG:  Solving for p_rgh, Initial residual = 1.76365e-05, Final residual = 1.72252e-07, No Iterations 260
time step continuity errors : sum local = 1.58935e-09, global = 2.18136e-19, cumulative = -1.10416e-09
DICPCG:  Solving for p_rgh, Initial residual = 2.11471e-05, Final residual = 2.04902e-07, No Iterations 126
DICPCG:  Solving for p_rgh, Initial residual = 6.85424e-07, Final residual = 9.99586e-09, No Iterations 258
time step continuity errors : sum local = 9.22404e-11, global = 2.28274e-19, cumulative = -1.10416e-09
ExecutionTime = 93898.9 s  ClockTime = 93925 s

Courant Number mean: 0.107378 max: 0.496735
deltaT = 0.0358475
Time = 222861

PIMPLE: iteration 1
DILUPBiCGStab:  Solving for Ux, Initial residual = 0.0176624, Final residual = 1.84718e-09, No Iterations 3
DILUPBiCGStab:  Solving for Uy, Initial residual = 0.0263027, Final residual = 8.16145e-09, No Iterations 3
DILUPBiCGStab:  Solving for Uz, Initial residual = 0.0219096, Final residual = 9.51366e-10, No Iterations 3
DILUPBiCGStab:  Solving for T, Initial residual = 0.00450249, Final residual = 1.22569e-09, No Iterations 3
DICPCG:  Solving for p_rgh, Initial residual = 0.00931455, Final residual = 8.85029e-05, No Iterations 92
DICPCG:  Solving for p_rgh, Initial residual = 0.000171408, Final residual = 1.60019e-06, No Iterations 130
time step continuity errors : sum local = 1.65673e-08, global = -1.5034e-20, cumulative = -1.10416e-09
DICPCG:  Solving for p_rgh, Initial residual = 0.000551684, Final residual = 5.50028e-06, No Iterations 127
DICPCG:  Solving for p_rgh, Initial residual = 1.79373e-05, Final residual = 1.75943e-07, No Iterations 259
time step continuity errors : sum local = 1.6435e-09, global = -3.46438e-20, cumulative = -1.10416e-09
Application 1186137 exit codes: 139
Application 1186137 exit signals: Killed
Application 1186137 resources: utime ~87770s, stime ~6170s, Rss ~52120, inblocks ~506, outblocks ~36088


Your help is much appreciated.
Thanks in advance

Old   August 22, 2020, 16:17
Default
  #2
HPE
Senior Member
 
 
Herpes Free Engineer
Join Date: Sep 2019
Location: The Home Under The Ground with the Lost Boys
Posts: 932
Rep Power: 12
HPE is on a distinguished road
A segmentation fault seems to occur, i.e. a signal for a memory-access violation: the process tried to read from or write to a memory area that it does not have access to. These are not C or C++ exceptions, and you can't catch such signals to produce very good error messages.
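Just to illustrate the mechanism, a minimal, non-OpenFOAM C++ sketch (the handler name and message are made up; OpenFOAM's Foam::sigSegv handler does something similar, printing the stack before terminating):

PHP Code:
// segv_demo.cpp -- a SIGSEGV arrives as a POSIX signal, not a C++ exception,
// so the most a handler can do is print diagnostics and terminate.
#include <csignal>
#include <cstdlib>
#include <unistd.h>

extern "C" void segvHandler(int)
{
    // Only async-signal-safe calls are allowed inside a signal handler.
    const char msg[] = "SIGSEGV: no stack unwinding, no catch block will run\n";
    ::write(STDERR_FILENO, msg, sizeof(msg) - 1);
    std::_Exit(EXIT_FAILURE);
}

int main()
{
    std::signal(SIGSEGV, segvHandler);   // analogous in spirit to Foam::sigSegv

    volatile int* p = nullptr;
    return *p;   // invalid memory access -> SIGSEGV; try/catch cannot intercept it
}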

- might be just an HPC fault plus the MPI combination during some operation, or maybe a write-out. Might be sheer luck, or some bug that has already been resolved in OF, MPI or the compiler. Please note that OFv4.1 is no longer supported; maybe the MPI or compiler is newer than the OF version.
- were you able to reproduce the error with 20 procs on the HPC for the exact setup and the exact compiled version? You can check the HEAD commit ID to ensure both OF versions are the same.
- were the compilers of both systems the same?
- were the versions and types of MPI the same?
- were you able to reproduce the same error on the HPC at the same time step by resubmitting the case?
- were you able to restart the simulation from the previous time step?
- any chance to use a newer version, since the Coriolis force is now available as an fvOption?

Old   August 22, 2020, 23:21
Default
  #3
New Member
 
 
Shivam
Join Date: Mar 2019
Location: IN
Posts: 9
Rep Power: 7
shivamswarnakar72 is on a distinguished road
Quote:
Originally Posted by HPE
A segmentation fault seems to occur, i.e. a signal for a memory-access violation: the process tried to read from or write to a memory area that it does not have access to. These are not C or C++ exceptions, and you can't catch such signals to produce very good error messages.

- might be just an HPC fault plus the MPI combination during some operation, or maybe a write-out. Might be sheer luck, or some bug that has already been resolved in OF, MPI or the compiler. Please note that OFv4.1 is no longer supported; maybe the MPI or compiler is newer than the OF version.
- were you able to reproduce the error with 20 procs on the HPC for the exact setup and the exact compiled version? You can check the HEAD commit ID to ensure both OF versions are the same.
- were the compilers of both systems the same?
- were the versions and types of MPI the same?
- were you able to reproduce the same error on the HPC at the same time step by resubmitting the case?
- were you able to restart the simulation from the previous time step?
- any chance to use a newer version, since the Coriolis force is now available as an fvOption?
Hi HPE,

Thanks for your reply.

1. Yes, I was able to reproduce the error with 20 procs on the HPC for the exact same setup and the same compiled version, i.e. OF 4.1.

2. The compilers were different: GCC 7.2.0 on the HPC and GCC 5.4.0 on the lab system.

3. GCC is used on both systems, but the versions differ.

4. The MPI is also different: CRAY_MPICH on the HPC and SYSTEMOPENMPI on the lab system.

5. No, not at the same time step, but the simulation crashes eventually.

6. Yes, I was able to restart the simulation from the previous time step, but it fails again after about 20 hours.

7. I can try a newer version of OpenFOAM. Which version should I use? I want to use buoyantBoussinesqPimpleFoam, and I think it has been removed from newer versions. If it is still available, I can switch to a newer version.

Thanks again,
Shivam

Old   August 23, 2020, 00:03
Default
  #4
HPE
Senior Member
 
 
Herpes Free Engineer
Join Date: Sep 2019
Location: The Home Under The Ground with the Lost Boys
Posts: 932
Rep Power: 12
HPE is on a distinguished road
Hi,

I think the culprit might be item 4. If you are able to switch to OpenMPI on the HPC, I would expect the error to go away. Item 2 might be another reason, but that seems less likely to me.

buoyantBoussinesqPimpleFoam is still available in OFv2006, alongside the atmCoriolisUSource fvOption. Have a look through the content in the GitLab link below, if you are interested.
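For instance, a minimal fvOptions entry might look roughly like the sketch below; please double-check the keyword names against the atmCoriolisUSource header of the release you install:

PHP Code:
// constant/fvOptions (sketch only -- verify keywords against the v2006 sources)
atmCoriolisUSource1
{
    type            atmCoriolisUSource;

    atmCoriolisUSourceCoeffs
    {
        selectionMode   all;

        // Option 1: give the planetary rotation vector directly [rad/s]
        Omega           (0 0 5.65e-5);

        // Option 2: or give a latitude [deg] and rotation period [h] instead
        //latitude                  51.97;
        //planetaryRotationPeriod   23.9344694;
    }
}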
shivamswarnakar72 likes this.

Old   August 23, 2020, 00:11
Default
  #5
New Member
 
 
Shivam
Join Date: Mar 2019
Location: IN
Posts: 9
Rep Power: 7
shivamswarnakar72 is on a distinguished road
Quote:
Originally Posted by HPE
Hi,

I think the culprit might be item 4. If you are able to switch to OpenMPI on the HPC, I would expect the error to go away. Item 2 might be another reason, but that seems less likely to me.

buoyantBoussinesqPimpleFoam is still available in OFv2006, alongside the atmCoriolisUSource fvOption. Have a look through the content in the GitLab link below, if you are interested.
Hi,

I also wanted to change the MPI, but the HPC administrator says that we can only use CRAY_MPICH.

Sure, I will check OFv2006 and the fvOption.

Thanks again for your suggestions.


Tags
addr2line failed, hpc cluster, parallel computing, segmentation fault, sigfpe





