Hi All,
I am having a problem running a case of mine in parallel, whilst the serial version runs fine (so far).
The mesh is imported from Fluent (with the new fluent3DMeshToFoam utility) and has an internal wall. As I said, this doesn't seem to bother the serial run much, but after decomposing, the parallel run invariably finishes with an MPI error message like:
Create mesh for time = 0
[oct11:07921] *** An error occurred in MPI_Recv
[oct11:07921] *** on communicator MPI_COMM_WORLD
[oct11:07921] *** MPI_ERR_TRUNCATE: message truncated
[oct11:07921] *** MPI_ERRORS_ARE_FATAL (goodbye)
 --> FOAM FATAL IO ERROR : Expected a ')' or a '}' while reading List, found on line 0 an error
 file: IOstream at line 0.
 From function Istream::readEndList(const char*)
 in file db/IOstreams/IOstreams/Istream.C at line 159.
FOAM parallel run exiting
?? in "/lib/libc.so.6"
 #3 ?? at pml_ob1_recvfrag.c:0
 #4 mca_btl_sm_component_progress in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/openmpi/mca_btl_sm.so"
 #5 mca_bml_r2_progress in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/openmpi/mca_bml_r2.so"
 #6 opal_progress in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/libopen-pal.so.0"
 #7 mca_pml_ob1_probe in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/openmpi/mca_pml_ob1.so"
 #8 MPI_Probe in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/libmpi.so.0"
 #9 Foam::IPstream::IPstream(int, int, Foam::IOstream::streamFormat, Foam::IOstream::versionNumber) in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/openmpi-1.2.3/libPstream.so"
 #10 Foam::globalPoints::receivePatchPoints(Foam::HashSet<int, Foam::Hash<int> >&) in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so"
 #11 Foam::globalPoints::globalPoints(Foam::polyMesh const&) in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so"
 #12 Foam::globalMeshData::updateMesh() in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so"
 #13 Foam::globalMeshData::globalMeshData(Foam::polyMesh const&) in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so"
 #14 Foam::polyMesh::globalData() const in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so"
 #15 Foam::polyMesh::polyMesh(Foam::IOobject const&) in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so"
 #16 Foam::fvMesh::fvMesh(Foam::IOobject const&) in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libfiniteVolume.so"
 #17 main in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/applications/bin/linux64GccDPOpt/icoFoam"
 #18 __libc_start_main in "/lib/libc.so.6"
 #19 Foam::regIOobject::readIfModified() in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/applications/bin/linux64GccDPOpt/icoFoam"
[oct11:07971] *** Process received signal ***
[oct11:07971] Signal: Segmentation fault (11)
[oct11:07971] Signal code: (-6)
[oct11:07971] Failing at address: 0x47300001f23
[oct11:07971] [ 0] /lib/libc.so.6 [0x2aaaac61c110]
[oct11:07971] [ 1] /lib/libc.so.6(gsignal+0x3b) [0x2aaaac61c07b]
[oct11:07971] [ 2] /lib/libc.so.6 [0x2aaaac61c110]
[oct11:07971] [ 3] /home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/openmpi/mca_pml_ob1.so [0x2aaab26b8c17]
[oct11:07971] [ 4] /home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/openmpi/mca_btl_sm.so(mca_btl_sm_component_progress+0x1db) [0x2aaab2cd07cb]
[oct11:07971] [ 5] /home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/openmpi/mca_bml_r2.so(mca_bml_r2_progress+0x2a) [0x2aaab28c426a]
[oct11:07971] [ 6] /home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/libopen-pal.so.0(opal_progress+0x4a) [0x2aaaad93495a]
[oct11:07971] [ 7] /home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_probe+0x3c5) [0x2aaab26b61a5]
[oct11:07971] [ 8] /home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/libmpi.so.0(MPI_Probe+0xf6) [0x2aaaad28fda6]
[oct11:07971] [ 9] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/openmpi-1.2.3/libPstream.so(_ZN4Foam8IPstreamC1EiiNS_8IOstream12streamFormatENS1_13versionNumberE+0xee) [0x2aaaac82f24e]
[oct11:07971] [10] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZN4Foam12globalPoints18receivePatchPointsERNS_7HashSetIiNS_4HashIiEEEE+0x22c) [0x2aaaababc50c]
[oct11:07971] [11] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZN4Foam12globalPointsC1ERKNS_8polyMeshE+0x24f) [0x2aaaababccaf]
[oct11:07971] [12] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZN4Foam14globalMeshData10updateMeshEv+0x110) [0x2aaaabaae890]
[oct11:07971] [13] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZN4Foam14globalMeshDataC1ERKNS_8polyMeshE+0xe4) [0x2aaaabaaff64]
[oct11:07971] [14] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZNK4Foam8polyMesh10globalDataEv+0x55) [0x2aaaabad07f5]
[oct11:07971] [15] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZN4Foam8polyMeshC2ERKNS_8IOobjectE+0x1c02) [0x2aaaabad6f12]
[oct11:07971] [16] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libfiniteVolume.so(_ZN4Foam6fvMeshC1ERKNS_8IOobjectE+0x19) [0x2aaaaae3cae9]
[oct11:07971] [17] /home/radu/OpenFOAM/OpenFOAM-1.4.1/applications/bin/linux64GccDPOpt/icoFoam [0x412e07]
[oct11:07971] [18] /lib/libc.so.6(__libc_start_main+0xda) [0x2aaaac6094ca]
[oct11:07971] [19] /home/radu/OpenFOAM/OpenFOAM-1.4.1/applications/bin/linux64GccDPOpt/icoFoam(_ZN4Foam11regIOobject14readIfModifiedEv+0x1a9) [0x412979]
[oct11:07971] *** End of error message ***
mpirun noticed that job rank 0 with PID 7969 on node oct11 exited on signal 15 (Terminated).
3 additional processes aborted (not shown)
checkMesh does say:
internal faces: 11788349
boundary patches: 4
point zones: 0
face zones: 0
cell zones: 3
Number of cells of each type:
tet wedges: 0
Boundary definition OK.
Point usage OK.
Upper triangular ordering OK.
Topological cell zip-up check OK.
Face vertices OK.
Number of identical duplicate faces (baffle faces): 77004
Face-face connectivity OK.
Number of regions: 1 (OK).
Checking patch topology for multiply connected surfaces ...
Patch Faces Points Surface
pared 182407 182695 ok (not multiply connected)
inflow_top_lid 1836 1963 ok (not multiply connected)
outflow_top_lid 2587 2750 ok (not multiply connected)
pared_interior 154008 77742 multiply connected surface (shared edge)
<<Writing 77718 conflicting points to set nonManifoldPoints
Domain bounding box: (-0.04 -0.04 -1.42109e-17) (0.04 0.04 0.08)
Boundary openness (-2.89631e-16 -8.26677e-16 -8.0705e-16) OK.
Max cell openness = 8.55581e-16 OK.
Max aspect ratio = 323.357 OK.
Minumum face area = 8.39926e-10. Maximum face area = 8.16213e-06. Face area magnitudes OK.
Min volume = 6.33172e-14. Max volume = 1.06746e-08. Total volume = 0.000402107. Cell volumes OK.
Mesh non-orthogonality Max: 32.6604 average: 5.2825
Non-orthogonality check OK.
Face pyramids OK.
Max skewness = 0.594768 OK.
Min/max edge length = 2.04497e-05 0.00509539 OK.
All angles in faces OK.
Face flatness (1 = flat, 0 = butterfly) : average = 1 min = 0.999999
All face flatness OK.
Is it that multiply connected surface (the internal wall) that is causing the trouble? Or should I look elsewhere? I ask because I also did the import with the old "fluentMeshToFoam" and used the procedure described by Bernhard for "mesh with internal walls". Having two patches instead of one did get rid of the "multiply connected surface" label, but the parallel run failed again.
Sorry for the long post...
How did you decompose the mesh?
Are you using the same OF version for decomposing and running?
Try checkMesh in parallel, but I guess it returns the same error...
I am running decomposePar with the "simple" method, as the 3D mesh is just a 2D one replicated in the third direction a certain number of times.
And yes, I used the same OF 1.4.1 for decomposing and running, and previously did an ./Allwmake in ~/OpenFOAM/OpenFOAM-1.4.1/applications/utilities/parallelProcessing, just in case.
However, in the meantime I wiped the part of the mesh on one side of that internal wall patch, so it became a normal boundary patch of type wall, and redid the whole process of importing the mesh etc, etc...and IT WORKED!...both serial and parallel.
So my guess is that the multiply connected face, or the two faces of zero "depth" that were created following Bernhard's procedure, make the difference at some stage of the parallel run.
Did someone encounter the same problem, or am I just being rubbish/sluggish somewhere along the way?
Can you check that both sides of all processor patches have the same number of points? This is a requirement for valid meshes.
Just run checkMesh on all the domains and compare the patch statistics for procBoundary_xxToyy vs. procBoundary_yyToxx.
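For a case with many processor domains, the comparison can be scripted. The following is only a rough sketch: it assumes you have saved the checkMesh output of each processor directory as text, that the patch table has the "Patch Faces Points Surface" layout shown earlier in this thread, and that processor patches are named like procBoundary0to1. The function names are made up for illustration.

```python
import re

# A row of the checkMesh patch table: name, faces, points, then a remark.
ROW = re.compile(r"^\s*(\S+)\s+(\d+)\s+(\d+)\s+\S")
# Assumed processor patch naming, e.g. procBoundary0to1.
PROC = re.compile(r"^procBoundary_?(\d+)to_?(\d+)$", re.IGNORECASE)


def patch_stats(check_mesh_output):
    """Return {patch name: (faces, points)} parsed from one checkMesh log."""
    stats = {}
    for line in check_mesh_output.splitlines():
        m = ROW.match(line)
        if m:
            stats[m.group(1)] = (int(m.group(2)), int(m.group(3)))
    return stats


def mismatched_processor_patches(outputs):
    """outputs maps processor index -> checkMesh text for that domain.

    Returns (patch name, (faces, points), counts on the neighbour or None)
    for every processor patch whose counterpart differs or is missing.
    """
    stats = {proc: patch_stats(text) for proc, text in outputs.items()}
    problems = []
    for proc, patches in sorted(stats.items()):
        for name, counts in patches.items():
            m = PROC.match(name)
            if not m or int(m.group(1)) != proc:
                continue
            other = int(m.group(2))
            reverse = None
            for other_name, other_counts in stats.get(other, {}).items():
                om = PROC.match(other_name)
                if om and (int(om.group(1)), int(om.group(2))) == (other, proc):
                    reverse = other_counts
            if reverse != counts:
                problems.append((name, counts, reverse))
    return problems
```

An empty list means every processor patch found has a counterpart with the same face and point counts; a mismatch is reported once from each side.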
Hi Mattijs,
I did that check and forgot to mention it in the post. And yes, the stats say that both sides of each processor boundary have the same number of points and faces.
Furthermore, the patch with multiply connected faces lies inside one of the processor domains, some cells away from the processor boundary.
LAST MINUTE: ...I changed the decomposition method from simple to metis, with the same weight on all processes, and to my surprise it works fine, as in it runs with no MPI failure.
So I guess that I will stick with this to get some results. In the meantime I will try to understand what went wrong before (truth is that I simply don't think I will, because I don't see anything wrong, as far as I know).
Can you post the case or send it to me? (m.janssens)
How can I upload a case? Never done that....mesh file from Gambit is large (~800M)...
try to load the 0 and constant dirs...system should be like the one of e.g. icoFoam/cavity and run with icoFoam...
Well... I now know how to do it, but of course it complains about the size...and it fails...
There's a 50K limit on this forum.
Have any smaller case that has the problem?
Or cut out the bits that give problems perhaps?
- set your startTime to latestTime
- for all domains pick up the cells using any point on the boundary:
setSet . processorXXX
faceSet f0 new boundaryToFace
pointSet p0 new faceToPoint f0 all
cellSet c0 new pointToCell p0 any
- subset the c0 part of the mesh:
subsetMesh <root> <case> c0
- pack up the subsetted meshes (there will be new time directories with a polyMesh inside)
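With several processor directories it may help to write the command list out once rather than retype it. This is only a sketch that builds the command strings from the steps above; it does not run anything. It assumes the processor directories are named processor0, processor1, ..., and that `<root>` and `<case>` for subsetMesh are the same `.` and processor directory used on the setSet line. The function name is hypothetical.

```python
def subset_commands(n_processors, root="."):
    """Build, per processor directory, the commands from the steps above.

    Only returns strings; nothing is executed. The set-creation lines are
    meant to be typed at the setSet prompt.
    """
    set_lines = [
        "faceSet f0 new boundaryToFace",
        "pointSet p0 new faceToPoint f0 all",
        "cellSet c0 new pointToCell p0 any",
    ]
    plan = []
    for i in range(n_processors):
        case = f"processor{i}"  # assumed directory naming
        plan.append({
            "setSet": f"setSet {root} {case}",
            "setSet_input": list(set_lines),
            "subsetMesh": f"subsetMesh {root} {case} c0",
        })
    return plan


if __name__ == "__main__":
    # Print the commands for a hypothetical 4-way decomposition.
    for step in subset_commands(4):
        print(step["setSet"])
        for line in step["setSet_input"]:
            print("  " + line)
        print(step["subsetMesh"])
```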
Did what you said, but the tgz of one of those polyMesh directories still "weighs" some 9M. I will do a smaller case and check.
Thank you for your effort; I will let you know as soon as I get something. Probably Monday...
Well, well..I did a smaller case, but then everything was fine so no luck in catching the fault. However, on the big case (the one with metis decomp.), the run failed at some time when a dump of data had to be done..and gave me some errors like:
[oct11:08555] *** Process received signal ***
[oct11:08555] Signal: Bus error (7)
[oct11:08555] Signal code: (2)
[oct11:08555] Failing at address: 0x2aaaab04be10
[oct11:08555] [ 0] /lib/libc.so.6 [0x2aaaac98b110]
[oct11:08555] [ 1] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libfiniteVolume.so(_ZNK4Foam20coupledFvsPatchFieldIdE5writeERNS_7OstreamE+0) [0x2aaaab04be10]
[oct11:08555] [ 2] /home/radu/OpenFOAM/radu-1.4.1/applications/bin/linux64GccDPOpt/porosoSteadyFoam(_ZNK4Foam14GeometricFieldIdNS_13fvsPatchFieldENS_11surfaceMeshEE22GeometricBoundaryField10writeEntryERKNS_4wordERNS_7OstreamE+0x12b) [0x42573b]
[oct11:08555] [ 3] /home/radu/OpenFOAM/radu-1.4.1/applications/bin/linux64GccDPOpt/porosoSteadyFoam(_ZN4FoamlsIdNS_13fvsPatchFieldENS_11surfaceMeshEEERNS_7OstreamES4_RKNS_14GeometricFieldIT_T0_T1_EE+0x1d4) [0x43a344]
[oct11:08555] [ 4] /home/radu/OpenFOAM/radu-1.4.1/applications/bin/linux64GccDPOpt/porosoSteadyFoam(_ZNK4Foam14GeometricFieldIdNS_13fvsPatchFieldENS_11surfaceMeshEE9writeDataERNS_7OstreamE+0xf) [0x43a3ef]
[oct11:08555] [ 5] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZNK4Foam11regIOobject11writeObjectENS_8IOstream12streamFormatENS1_13versionNumberENS1_15compressionTypeE+0x263) [0x2aaaabd69e03]
[oct11:08555] [ 6] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZNK4Foam14objectRegistry11writeObjectENS_8IOstream12streamFormatENS1_13versionNumberENS1_15compressionTypeE+0x93) [0x2aaaabd6dc63]
[oct11:08555] [ 7] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZNK4Foam14objectRegistry11writeObjectENS_8IOstream12streamFormatENS1_13versionNumberENS1_15compressionTypeE+0x93) [0x2aaaabd6dc63]
[oct11:08555] [ 8] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZNK4Foam4Time11writeObjectENS_8IOstream12streamFormatENS1_13versionNumberENS1_15compressionTypeE+0x3ab) [0x2aaaabd7ffdb]
[oct11:08555] [ 9] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZNK4Foam11regIOobject5writeEv+0x4f) [0x2aaaabd69b7f]
[oct11:08555] [10] /home/radu/OpenFOAM/radu-1.4.1/applications/bin/linux64GccDPOpt/porosoSteadyFoam [0x416b0e]
[oct11:08555] [11] /lib/libc.so.6(__libc_start_main+0xda) [0x2aaaac9784ca]
[oct11:08555] [12] /home/radu/OpenFOAM/radu-1.4.1/applications/bin/linux64GccDPOpt/porosoSteadyFoam(__gxx_personality_v0+0xda) [0x412a4a]
Rings a bell to anyone?
So...I guess that something's wrong in the cluster setup, right? I'll have to contact the admin.
Something seems to be still wrong on your processor patches ...
Try running with the following environment variables set:
# Initialise blocks of memory to NaN
# Abort instead of exit
# Exit if NaN encountered
and possibly under valgrind.
Mattijs,
I could not make any progress on the problem so far, because I had so many admin things to do, but I will, and I will let you know.
Hi all,
Finally I gave up searching for the problem in decomposePar and friends. And, surprisingly enough, now it seems to work fine. That is after I removed some exports from my bashrc...probably something was tampering with my OF install.
That's that then.
I believe that increasing the environment variable: