SnappyHexmesh crashes with many processes

#1   June 14, 2012, 19:29
flami (Jos Ewert), New Member
Hi,

I am currently trying the tutorials on our cluster, but on the motorBike example in incompressible/pisoFoam/les/, generating the mesh seems to fail with 208 threads.
I increased the blockMesh resolution in blockMeshDict by doubling the 3 values.
If I run snappyHexMesh with 208 threads, I get a large stack trace after iteration 5, probably right before going into iteration 6.
Strangely enough, it runs perfectly fine single-threaded and with only 104 processes.
snappyHexMesh does not seem to run with fewer processes than subdomains, unless it is only 1 process.

Does anyone have an idea what could go wrong?

Here's the stack trace:
Code:
    6   18370624
[137] #0  Foam::error::printStack(Foam::Ostream&) in "/home/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/lib
OpenFOAM.so"
[137] #1  Foam::sigSegv::sigHandler(int) in "/home/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/libOpenFOAM.
so"
[137] #2  [135] #0  Foam::error::printStack(Foam::Ostream&) in "/lib64/libc.so.6"
[137] #3  _SCOTCHdgraphMatchSyncColl in "/home/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/libOpenFOAM.so"
[135] #1  Foam::sigSegv::sigHandler(int) in "/home/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libp
tscotch.so"
[137] #4  _SCOTCHdgraphCoarsen in "/home/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotch.so
"
[137] #5   in "/home/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/libOpenFOAM.so"
[135] #2   at bdgraph_bipart_ml.c:0
[137] #6   in "/lib64/libc.so.6"
[135] #3  _SCOTCHdgraphMatchSyncColl at bdgraph_bipart_ml.c:0
[137] #7  _SCOTCHbdgraphBipartMl in "/home/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotch.
so"
[135] #4  _SCOTCHdgraphCoarsen in "/home/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotch.so
"
[137] #8  _SCOTCHbdgraphBipartSt in "/home/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotch.
so"
[135] #5   in "/home/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotch.so"
[137] #9   at bdgraph_bipart_ml.c:0
[135] #6   at kdgraph_map_rb_part.c:0
[137] #10   at bdgraph_bipart_ml.c:0
[135] #7  _SCOTCHbdgraphBipartMl at kdgraph_map_rb_part.c:0
[137] #11  _SCOTCHkdgraphMapRbPart in "/home/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotc
h.so"
[135] #8  _SCOTCHbdgraphBipartSt in "/home/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotch.
so"
[137] #12  _SCOTCHkdgraphMapSt in "/home/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotch.so
"
[135] #9   in "/home/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotch.so"
[137] #13  SCOTCH_dgraphMapCompute at kdgraph_map_rb_part.c:0
[135] #10   in "/home/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotch.so"
[137] #14  SCOTCH_dgraphMap at kdgraph_map_rb_part.c:0
[135] #11  _SCOTCHkdgraphMapRbPart in "/home/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotc
h.so"
[137] #15  Foam::ptscotchDecomp::decompose(Foam::fileName const&, Foam::List<int> const&, Foam::List<int> const&, Foam::Field<double> const&, F
oam::List<int>&) const in "/home/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotchDecomp.so"
[137] #16  Foam::ptscotchDecomp::decomposeZeroDomains(Foam::fileName const&, Foam::List<int> const&, Foam::List<int> const&, Foam::Field<double
> const&, Foam::List<int>&) const in "/home/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotch
.so"
[135] #12  _SCOTCHkdgraphMapSt in "/home/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotchDecom
p.so"
[137] #17  Foam::ptscotchDecomp::decompose(Foam::polyMesh const&, Foam::Field<Foam::Vector<double> > const&, Foam::Field<double> const&) in "/h
ome/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotch.so"
[135] #13  SCOTCH_dgraphMapCompute in "/home/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotchD
ecomp.so"
[137] #18  Foam::meshRefinement::balance(bool, bool, Foam::Field<double> const&, Foam::decompositionMethod&, Foam::fvMeshDistribute&) in "/home
/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotch.so"
[135] #14  SCOTCH_dgraphMap in "/home/kluster/openfoam/IntelMPI-gcc46//ThirdParty-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotch.so"
 in "/home/kluster/openfoam/Inte[135] #15  lMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/libautoMesh.so"Foam::ptscotchDecomp::deco
mpose(Foam::fileName const&, Foam::List<int> const&, Foam::List<int> const&, Foam::Field<double> const&, Foam::List<int>&) const
[137] #19  Foam::meshRefinement::refineAndBalance(Foam::string const&, Foam::decompositionMethod&, Foam::fvMeshDistribute&, Foam::List<int> con
st&, double) in "/home/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotchDecomp.so"
[135] #16  Foam::ptscotchDecomp::decomposeZeroDomains(Foam::fileName const&, Foam::List<int> const&, Foam::List<int> const&, Foam::Field<double
> const&, Foam::List<int>&) const in "/home/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/libautoMesh.so"
[137] #20  Foam::autoRefineDriver::surfaceOnlyRefine(Foam::refinementParameters const&, int) in "/home/kluster/openfoam/IntelMPI-gcc46//OpenFOA
M-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotchDecomp.so"
[135] #17  Foam::ptscotchDecomp::decompose(Foam::polyMesh const&, Foam::Field<Foam::Vector<double> > const&, Foam::Field<double> const&) in "/h
ome/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/libautoMesh.so"
[137] #21  Foam::autoRefineDriver::doRefine(Foam::dictionary const&, Foam::refinementParameters const&, bool, Foam::dictionary const&) in "/hom
e/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/4.0.3/libptscotchDecomp.so"
[135] #18  Foam::meshRefinement::balance(bool, bool, Foam::Field<double> const&, Foam::decompositionMethod&, Foam::fvMeshDistribute&) in "/home
/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/libautoMesh.so"
[137] #22   in "/home/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/libautoMesh.so"
[135] #19  Foam::meshRefinement::refineAndBalance(Foam::string const&, Foam::decompositionMethod&, Foam::fvMeshDistribute&, Foam::List<int> con
st&, double)
[137]  in "/home/kluster/openfoam/IntelMPI-gcc46/OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/bin/snappyHexMesh"
[137] #23  __libc_start_main in "/home/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/libautoMesh.so"
[135] #20  Foam::autoRefineDriver::surfaceOnlyRefine(Foam::refinementParameters const&, int) in "/lib64/libc.so.6"
[137] #24   in "/home/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/libautoMesh.so"
[135] #21  Foam::autoRefineDriver::doRefine(Foam::dictionary const&, Foam::refinementParameters const&, bool, Foam::dictionary const&)
 in "/home/kluster/openfoam/IntelMPI-gcc46//OpenFOAM-2.1.0/platforms/linux64Gcc46DPOpt/lib/libautoMesh.so"
[135] #22  [137]  at /usr/src/packages/BUILD/glibc-2.11.3/csu/../sysdeps/x86_64/elf/start.S:116

#2   June 14, 2012, 19:51
gschaider (Bernhard Gschaider), Assistant Moderator
Quote:
Originally Posted by flami View Post
Hi,

I am currently trying the tutorials on our cluster, but on the motorBike example in incompressible/pisoFoam/les/, generating the mesh seems to fail with 208 threads.
I increased the blockMesh resolution in blockMeshDict by doubling the 3 values.
If I run snappyHexMesh with 208 threads, I get a large stack trace after iteration 5, probably right before going into iteration 6.
Strangely enough, it runs perfectly fine single-threaded and with only 104 processes.
snappyHexMesh does not seem to run with fewer processes than subdomains, unless it is only 1 process.

Does anyone have an idea what could go wrong?
No idea at all. Just one hint: try to do it with 209 processes (no joke). If that doesn't fail then the number of processes is not your problem (which I don't think it is), but you stumbled upon a bug in the algorithm. Good luck

Quote:
Originally Posted by flami View Post
Here's the stack trace: [...]
Next step would be to compile a debug-version. That would allow you to see the exact line in the source code where the problem occurred.

#3   June 16, 2012, 05:47
wyldckat (Bruno Santos), Retired Super Moderator
Greetings to all!

@flami - Here's what I know:
Quote:
Originally Posted by flami View Post
I am currently trying the tutorials on our cluster, but on the motorbike example in incompressible/pisoFoam/les/ generating the mesh seems to fail with 208 threads.
Threads? Hold it! Let's get some details straightened out first:
  1. OpenFOAM uses MPI for running multiple processes in parallel. These processes can be anywhere reachable over the network, so they can communicate and collaborate in solving the problem.
  2. The official version of OpenFOAM does not yet (AFAIK) support OpenMP or explicit multi-threading, i.e., one process that has more than one thread.
  3. Many of today's Intel CPUs have Hyper-Threading, namely each CPU core has the ability to do a certain limited set of instructions in 2 parallel threads.
    As a very abstract example (i.e., not a real one): HT might be able to fetch data from memory and add 1+1 with 2 threads at the same time; but it can't do 2x2 with the 2 threads in parallel, so it will have to schedule one thread at a time. For more specific details: http://en.wikipedia.org/wiki/Hyper-threading
  4. Therefore, again as an example, if the CPU on your machine has 4 cores with HT, then it has 8 HThreads. This means that you could run up to 8 OpenFOAM processes in parallel, occupying the full processing power.
    But with OpenFOAM, this can actually be slower, because OpenFOAM mostly needs to do the instructions that have to be done one HThread (per core) at a time.
  5. I very vaguely remember reading about IntelMPI having the ability to automagically run applications with simultaneous multi-threading and MPI, but I think it implies OpenMP for multi-threading...
    If by any chance IntelMPI actually loads the two processes as if they were a single process, running each process as a thread, this will mean that it will be sharing the same libraries on the same process HEAP/STACK/Thread space... which OpenFOAM isn't prepared to handle.

Quote:
Originally Posted by flami View Post
Strangely enough it runs perfectly fine singlethreaded and with only 104 processes.
Again, do you mean:
  • Code:
    mpirun -np 104 snappyHexMesh
  • Or
    Code:
    mpirun -np 104 snappyHexMesh -parallel
Best regards,
Bruno

#4   June 16, 2012, 08:13
flami (Jos Ewert), New Member
Sorry, I made a typo: instead of threads I meant processes. I don't use any kind of OpenMP or other thread parallelisation. It's all MPI. I use IntelMPI.

I run this from the allrun script:
runParallel snappyHexMesh 104 -overwrite -parallel

which then becomes:
mpirun -np 104 -ppn 8 -binding "map=scatter" snappyHexMesh 104 -overwrite -parallel

Anything above 104 processes (130, 156, 182, 208) doesn't seem to crash anymore but gets stuck in an iteration. It seems to be the same iteration each time for a given process count, e.g. 208 always gets stuck in iteration 4 of the first step (sorry, I forgot what it was called).
The processes get stuck at 100% core usage and don't progress. The first time, e.g., the 208 processes were stuck for 4 hours until I canceled the job.
That stack trace suddenly appeared after I changed the mesh in blockMeshDict. Changing it again made snappyHexMesh get stuck again.

It doesn't matter if I use gcc or the Intel compilers. It is always the same error.

I might try Open MPI too if I get the time. Maybe I'll get lucky.

#5   June 16, 2012, 08:27
wyldckat (Bruno Santos), Retired Super Moderator
Ah, OK, now we're getting closer to the problem

Do the CPUs you're using have Hyper-Threading? If so, does this mean that anything above 104 processes will strictly require using HT?

And... does your mpirun command use the "machines" or "hosts" file? Or do you have this globally configured? Or does your machine indeed have 104 or 208 cores?

#6   June 16, 2012, 14:05
flami (Jos Ewert), New Member
Yes, we have 208 physical cores. We use Slurm to distribute the MPI processes (it sets some environment variables that IntelMPI reads).
With 104 processes I explicitly run 8 processes per node, pinned so that each CPU gets 4 processes. So I only use 1/2 of the cores per CPU.
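
The node layout implied by these numbers can be worked out with a little arithmetic. The following is a minimal sketch in Python, assuming all nodes are identical. The node count, cores per node, and CPUs per node are derived values, not figures stated in the post, and the function name `node_layout` is made up for illustration.

```python
# Figures taken from the posts: 208 physical cores, 104 MPI processes,
# 8 processes per node, 4 processes pinned to each CPU.
# Assumption: every node has the same number of CPUs and cores.

def node_layout(total_cores, n_procs, procs_per_node, procs_per_cpu):
    """Derive the per-node layout from the job parameters."""
    nodes = n_procs // procs_per_node                 # nodes used by the job
    cores_per_node = total_cores // nodes             # physical cores per node
    cpus_per_node = procs_per_node // procs_per_cpu   # CPU sockets per node
    cores_per_cpu = cores_per_node // cpus_per_node   # physical cores per CPU
    return {
        "nodes": nodes,
        "cores_per_node": cores_per_node,
        "cpus_per_node": cpus_per_node,
        "cores_per_cpu": cores_per_cpu,
        "used_fraction": procs_per_cpu / cores_per_cpu,
    }

layout = node_layout(total_cores=208, n_procs=104,
                     procs_per_node=8, procs_per_cpu=4)
for key, value in layout.items():
    print(f"{key}: {value}")
```

Under that assumption the job spans 13 nodes with 16 cores each, and the used fraction comes out as one half, which agrees with the "1/2 of the cores per CPU" remark.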

#7   June 16, 2012, 17:26
wyldckat (Bruno Santos), Retired Super Moderator
Mmm... OK, then there are a few possibilities left:
  • Hardware issue: network or RAM might be creating an unexpected bottleneck... but I doubt this is the issue.
  • Wait, you've doubled the original resolution of the base mesh... then that's 40x16x16 = 10240 cells / 208 processors ~= 49 cells/processor? My guess is that there are processors that are only left with 1 or 2 cells, which I think snappy+scotch aren't expecting!
  • If it still fails even after substantially increasing the base resolution (don't forget to then reduce the resolution in "snappyHexMeshDict" a bit), then it could be a limitation in Scotch: snappy uses the ptscotch library to (somehow) keep the mesh evenly distributed between processors. There could be a hard-coded limit of perhaps 128 sub-domains in ptscotch.
There should be a few more possibilities, but currently I can't think of them...
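
The cell-count estimate in the second bullet can be reproduced with a short calculation. This is a rough sketch in Python: it only gives the average number of base-mesh cells per process, and the helper name `cells_per_process` is hypothetical. The actual distribution produced by the decomposition can be much less even than the average suggests.

```python
from math import prod

def cells_per_process(blocks, n_procs):
    """Average number of base-mesh cells per MPI process."""
    return prod(blocks) / n_procs

base_mesh = (40, 16, 16)        # doubled base resolution quoted above
total_cells = prod(base_mesh)   # 40 * 16 * 16

for n_procs in (104, 208):      # process counts mentioned in the thread
    avg = cells_per_process(base_mesh, n_procs)
    print(f"{n_procs} processes: about {avg:.0f} cells each")
```

With so few cells per process on average, it is plausible that some processes end up with almost none after the first rebalancing step.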

#8   June 23, 2012, 14:52
wyldckat (Bruno Santos), Retired Super Moderator
Greetings to all!

Yesterday I was writing some stuff about "decomposeParDict" and went to look at the main one present in the folder "OpenFOAM-2.1.x/applications/utilities/parallelProcessing/decomposePar", which I hadn't looked at for several months now... and I saw this:
Quote:
Code:
// method          multiLevel;
// method          structured;  // does 2D decomposition of structured mesh

multiLevelCoeffs
{
    // Decomposition methods to apply in turn. This is like hierarchical but
    // fully general - every method can be used at every level.

    level0
    {
        numberOfSubdomains  64;
        //method simple;
        //simpleCoeffs
        //{
        //    n           (2 1 1);
        //    delta       0.001;
        //}
        method scotch;
    }
    level1
    {
        numberOfSubdomains  4;
        method scotch;
    }
}
multiLevel ... there's this new multiLevel method!!!!

@flami: I haven't tested this yet, but I suggest that you give this method a try, because this might very well be an innocent way of showing people that when Scotch crashes due to a partition limit, we have to resort to the multiLevel method! It apparently provides the ability to use Scotch in a multi-level partition graph, instead of a single-level partition graph!

Oh, since you were using it with snappyHexMesh, the method will probably have to be "ptscotch" instead of "scotch". The other possibility is to use "ptscotch" in one level and "simple" or "hierarchical" in the other.
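
As a sanity check before running, the per-level counts can be compared against the total number of processes. A minimal sketch in Python, under the assumption (not confirmed in this thread) that each level subdivides every domain of the previous level, so the counts multiply. `check_multilevel` is a hypothetical helper, not part of OpenFOAM.

```python
from math import prod

def check_multilevel(total_subdomains, level_counts):
    """Return True when the per-level counts multiply to the total.

    Assumption: level N splits each domain produced by level N-1,
    so the overall number of domains is the product of all levels.
    """
    return prod(level_counts) == total_subdomains

# level0 = 64 and level1 = 4, as in the multiLevelCoeffs example above.
levels = [64, 4]
print(prod(levels))                    # total domains implied by the levels
print(check_multilevel(256, levels))   # matches a 256-process run
print(check_multilevel(208, levels))   # would not match a 208-process run
```

For a different process count, the level counts would need to be refactored so that their product matches the total again.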

Best regards,
Bruno

#9   July 2, 2012, 20:30
flami (Jos Ewert), New Member
Sadly I do not have the hardware anymore (it was torn apart and sent to who knows where).
I might be able to test it on some other hardware, but sadly I cannot make any promises.

Anyway, I found out that with ptscotch, 128 processes is the limit at which the program will still run. Anything above that (e.g. 130) makes it crash, no matter the problem size. With the default mesh I had the issue that it was too small for 128 processes, which is why I doubled every side.

But yes, multiLevelCoeffs might help.
For lazy people that don't understand how the problem is actually structured (i.e. me), something like this:
Code:
 
    level0
    {
       [...]
        method ptscotch;
    }
    level1
    {
        [...]
        method ptscotch;
    }
might help for snappyHexMesh, and the same with the regular "scotch" for decomposePar.

Thanks for the help.

#10   July 18, 2012, 14:26
flami (Jos Ewert), New Member
Hi, I have access to a larger machine again and tested the multiLevel suggestion.
Sadly it still crashes, but seemingly at a different place:

Code:
Surface refinement iteration 0
------------------------------

Marked for refinement due to surface intersection : 630 cells.
Marked for refinement due to curvature/regions    : 0 cells.
Determined cells to refine in = 0.18 s
Selected for refinement : 630 cells (out of 655360)
Edge intersection testing:
    Number of edges             : 2005059
    Number of edges to retest   : 17979
    Number of intersected edges : 3239
Refined mesh in = 0.56 s
After refinement surface refinement iteration 0 : cells:659770  faces:2005059  points:685831
Cells per refinement level:
    0   654730
    1   5040
[0] Decomposition at level 0 :
[0] 
[0] 
[0] --> FOAM FATAL ERROR: 
[0] bad set size -4
[0] 
[0]     From function List<T>::setSize(const label)
[0]     in file /home/ws/nm46/openfoam/SystemOMPI-gcc46/OpenFOAM-2.1.0/src/OpenFOAM/lnInclude/List.C at line [0]     Domain 0
[0]         Number of cells = 40976
[0]         Number of inter-domain patches = 0
[0]         Number of inter-domain faces = 0
[0] 
322.
[0] 
FOAM parallel run aborting
[0] 
[0] #0  Foam::error::printStack(Foam::Ostream&)--------------------------------------------------------------------------
An MPI process has executed an operation involving a call to the
"fork()" system call to create a child process.  Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your MPI job may hang, crash, or produce silent
data corruption.  The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.  

The process that invoked fork was:

  Local host:          ic1n045 (PID 3374)
  MPI_COMM_WORLD rank: 0

If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
--------------------------------------------------------------------------
 in "/home/ws/nm46/openfoam/SystemOMPI-gcc46/OpenFOAM-2.1.0/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[0] #1  Foam::error::abort() in "/home/ws/nm46/openfoam/SystemOMPI-gcc46/OpenFOAM-2.1.0/platforms/linux64GccDPOpt/lib/libOpenFOAM.so"
[0] #2  Foam::List<int>::setSize(int) in "/home/ws/nm46/openfoam/SystemOMPI-gcc46/OpenFOAM-2.1.0/platforms/linux64GccDPOpt/bin/snappyHexMesh"
[0] #3  Foam::ptscotchDecomp::decomposeZeroDomains(Foam::fileName const&, Foam::List<int> const&, Foam::List<int> const&, Foam::Field<double> 
const&, Foam::List<int>&) const in "/home/ws/nm46/openfoam/SystemOMPI-gcc46/OpenFOAM-2.1.0/platforms/linux64GccDPOpt/lib/openmpi-system/libpts
cotchDecomp.so"
[0] #4  Foam::ptscotchDecomp::decompose(Foam::List<Foam::List<int> > const&, Foam::Field<Foam::Vector<double> > const&, Foam::Field<double> co
nst&) in "/home/ws/nm46/openfoam/SystemOMPI-gcc46/OpenFOAM-2.1.0/platforms/linux64GccDPOpt/lib/openmpi-system/libptscotchDecomp.so"
[0] #5  Foam::multiLevelDecomp::decompose(Foam::List<Foam::List<int> > const&, Foam::Field<Foam::Vector<double> > const&, Foam::Field<double> 
const&, Foam::List<int> const&, int, Foam::Field<int>&) in "/home/ws/nm46/openfoam/SystemOMPI-gcc46/OpenFOAM-2.1.0/platforms/linux64GccDPOpt/l
ib/libdecompositionMethods.so"
[0] #6  Foam::multiLevelDecomp::decompose(Foam::List<Foam::List<int> > const&, Foam::Field<Foam::Vector<double> > const&, Foam::Field<double> 
const&, Foam::List<int> const&, int, Foam::Field<int>&) in "/home/ws/nm46/openfoam/SystemOMPI-gcc46/OpenFOAM-2.1.0/platforms/linux64GccDPOpt/l
ib/libdecompositionMethods.so"
[0] #7  Foam::multiLevelDecomp::decompose(Foam::polyMesh const&, Foam::Field<Foam::Vector<double> > const&, Foam::Field<double> const&) in "/h
ome/ws/nm46/openfoam/SystemOMPI-gcc46/OpenFOAM-2.1.0/platforms/linux64GccDPOpt/lib/libdecompositionMethods.so"
[0] #8  Foam::meshRefinement::balance(bool, bool, Foam::Field<double> const&, Foam::decompositionMethod&, Foam::fvMeshDistribute&) in "/home/w
s/nm46/openfoam/SystemOMPI-gcc46/OpenFOAM-2.1.0/platforms/linux64GccDPOpt/lib/libautoMesh.so"
[0] #9  Foam::meshRefinement::refineAndBalance(Foam::string const&, Foam::decompositionMethod&, Foam::fvMeshDistribute&, Foam::List<int> const
&, double) in "/home/ws/nm46/openfoam/SystemOMPI-gcc46/OpenFOAM-2.1.0/platforms/linux64GccDPOpt/lib/libautoMesh.so"
[0] #10  Foam::autoRefineDriver::surfaceOnlyRefine(Foam::refinementParameters const&, int) in "/home/ws/nm46/openfoam/SystemOMPI-gcc46/OpenFOA
M-2.1.0/platforms/linux64GccDPOpt/lib/libautoMesh.so"
[0] #11  Foam::autoRefineDriver::doRefine(Foam::dictionary const&, Foam::refinementParameters const&, bool, Foam::dictionary const&) in "/home
/ws/nm46/openfoam/SystemOMPI-gcc46/OpenFOAM-2.1.0/platforms/linux64GccDPOpt/lib/libautoMesh.so"
[0] #12  
[0]  in "/home/ws/nm46/openfoam/SystemOMPI-gcc46/OpenFOAM-2.1.0/platforms/linux64GccDPOpt/bin/snappyHexMesh"
[0] #13  __libc_start_main in "/lib64/libc.so.6"
[0] #14
I tested it with 256 and 208 processes. The output above is for 208 processes.

Both crash at the same place, except that with 256 the error says "bad set size -41".

This is the decomposeParDict for 208 (it's a bit off, as I still had the core counts of the old system in mind; I guess 26*8 would have been better):

Code:
numberOfSubdomains  208;
method multiLevel;


multiLevelCoeffs
{

    level0
    {
        numberOfSubdomains  16;
        method ptscotch;
    }
    level1
    {
        numberOfSubdomains 13;
        method ptscotch;
    }
}
And here it is for 256:

Code:
numberOfSubdomains  256;
method multiLevel;


multiLevelCoeffs
{
    level0
    {
        numberOfSubdomains  64;
        method ptscotch;
    }
    level1
    {
        numberOfSubdomains  4;
        method ptscotch;
    }
}
I increased the blockMesh resolution to:
(160 64 64)
which leaves about 2560 cells on each process for 256 processes.

I do not really know what to do about that set size error, as it happens long before the maximum number of cells for the hex mesh is reached (IIRC it's 7 million), so I guess it is not related to that limit being too small. Maybe it is related to "maxLocalCells 100000;", as this now isn't reached that easily anymore (or at all, I'd guess).

#11   July 18, 2012, 16:29
wyldckat (Bruno Santos), Retired Super Moderator
Hi flami,

Mmm... too bad. Here I was thinking that the multiLevel thinga-ma-bob was the life saver...

What about the "maxGlobalCells"? Do you have it set to a very high value?

My guess is that the best next step would be to file a bug report with this information: http://www.openfoam.org/mantisbt/


I also did a quick search in Scotch's code for any hard-coded values of 64, 128 or 256, but didn't find any suspicious-looking ones.
Trying to upgrade to a more recent Scotch library would also be a possibility, but it might be a serious pain in the neck to do, if the library interfaces changed too much with the upgrade.

Best regards,
Bruno