Finite area method (fac::div) fails in parallel

October 24, 2012, 08:01
  #1
New Member
 
Join Date: Oct 2012
Posts: 17
Rep Power: 3
cuba is on a distinguished road
Hi everyone,
I am a new OpenFOAM user working on a modified version of pimpleDyMFoam, combined with a k-omega model, in OpenFOAM 1.6-ext.
The problem is that the code works fine in serial runs but stops working in parallel when it reaches line 3:

Code:
1- var0.boundaryField()[PAtchID] = U.boundaryField()[PAtchID];
2- var1.internalField() = vsm.mapToSurface(var0.boundaryField());
3- areaScalarField var2 = - fac::div(var1);
The lines above are only executed when a certain physical condition holds. If the condition is false, the program skips them and also runs fine in parallel.

The parallel run was started with
export FOAM_ABORT=1
mpirun --mca orte_base_help_aggregate 0 -d -np 4 pimpleDyMFoam -parallel > log
and produced the following error message:


Code:
[n-62-24-13:13854] *** An error occurred in MPI_Recv
[n-62-24-13:13854] *** on communicator MPI_COMM_WORLD
[n-62-24-13:13854] *** MPI_ERR_TRUNCATE: message truncated
[n-62-24-13:13854] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[n-62-24-13:13854] sess_dir_finalize: proc session dir not empty - leaving
[n-62-24-13:13853] sess_dir_finalize: proc session dir not empty - leaving
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 13854 on
node n-62-24-13 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[n-62-24-13:13853] sess_dir_finalize: job session dir not empty - leaving
[n-62-24-13:13853] sess_dir_finalize: proc session dir not empty - leaving
orterun: exiting with status 15

In the meantime, I have tried different div schemes (in faSchemes file) changing the line

Code:
default		Gauss linear;
to other Gauss interpolation schemes, yet I could not get rid of the error.

What might be the problem? How can I debug further and find the error?
Any comments or advice?
Thanks in advance

October 25, 2012, 11:25
  #2
Senior Member
 
Kyle Mooney
Join Date: Jul 2009
Location: Amherst, MA USA - San Francisco, CA USA
Posts: 225
Rep Power: 8
kmooney is on a distinguished road
Have you tried compiling and running in debug? I've had pretty good luck with mpirunDebug when it comes to parallel debugging.

I use the finite area library in parallel but unfortunately I do not use that operator.

October 25, 2012, 11:32
  #3
New Member
 
Join Date: Oct 2012
Posts: 17
Rep Power: 3
cuba is on a distinguished road
Thanks for the reply
I thought that adding "-d" to the mpirun command would be enough to debug, but apparently it did not help at all.
Should I first compile the program in a way that allows it to be debugged in parallel? And how can I do that?

October 25, 2012, 11:46
  #4
Senior Member
 
Kyle Mooney
Join Date: Jul 2009
Location: Amherst, MA USA - San Francisco, CA USA
Posts: 225
Rep Power: 8
kmooney is on a distinguished road
Yep, recompile with the debug compiler option set. Set it by running this in the shell:
Code:
export WM_COMPILE_OPTION=Debug
Then clean (wclean) and run wmake to recompile. As it recompiles, you should see a flag in the compiler output that looks like -DFULLDEBUG or something similar.

You might need to install mpirunDebug from your linux software repository. I don't think it comes with the standard MPI package.

October 25, 2012, 11:58
  #5
New Member
 
Join Date: Oct 2012
Posts: 17
Rep Power: 3
cuba is on a distinguished road
Thanks for the replies, I will be working on that

October 29, 2012, 10:09
  #6
New Member
 
Join Date: Oct 2012
Posts: 17
Rep Power: 3
cuba is on a distinguished road
I recompiled my code after first entering "export WM_COMPILE_OPTION=Debug" in the terminal. It gave me the following output:

--------------------------
g++ -m64 -Dlinux64 -DWM_DP -Wall -Wextra -Wno-unused-parameter -Wold-style-cast -Wnon-virtual-dtor -O0 -fdefault-inline -ggdb3 -DFULLDEBUG -DNoRepository -ftemplate-depth-40 -I/appl/OpenFOAM/OpenFOAM-1.6-ext/src/dynamicMesh/dynamicFvMesh/lnInclude -I/appl/OpenFOAM/OpenFOAM-1.6-ext/src/dynamicMesh/dynamicMesh/lnInclude -I/appl/OpenFOAM/OpenFOAM-1.6-ext/src/meshTools/lnInclude -I/appl/OpenFOAM/OpenFOAM-1.6-ext/src/turbulenceModels -I/appl/OpenFOAM/OpenFOAM-1.6-ext/src/transportModels -I/appl/OpenFOAM/OpenFOAM-1.6-ext/src/transportModels/incompressible/singlePhaseTransportModel -I/appl/OpenFOAM/OpenFOAM-1.6-ext/src/finiteArea/lnInclude -DFACE_DECOMP -I/appl/OpenFOAM/OpenFOAM-1.6-ext/src/tetDecompositionFiniteElement/lnInclude -I/appl/OpenFOAM/OpenFOAM-1.6-ext/src/tetDecompositionMotionSolver/lnInclude -I/appl/OpenFOAM/OpenFOAM-1.6-ext/src/finiteVolume/lnInclude -IlnInclude -I. -I/appl/OpenFOAM/OpenFOAM-1.6-ext/src/OpenFOAM/lnInclude -I/appl/OpenFOAM/OpenFOAM-1.6-ext/src/OSspecific/POSIX/lnInclude -fPIC -Xlinker --add-needed Make/linux64GccDPOpt/pimpleDyMFoam.o -L/appl/OpenFOAM/OpenFOAM-1.6-ext/lib/linux64GccDPOpt \
-ldynamicFvMesh -ltopoChangerFvMesh -ldynamicMesh -lmeshTools -lincompressibleTransportModels -lincompressibleTurbulenceModel -lincompressibleRASModels -lincompressibleLESModels -lfiniteArea -lfiniteVolume -llduSolvers -lOpenFOAM -liberty -ldl -ggdb3 -DFULLDEBUG -lm -o /zhome/83/d/74221/OpenFOAM/cuba-1.6-ext/applications/bin/linux64GccDPOpt/pimpleDyMFoam
-------------------------------

Then I decomposed my domain and ran

mpirunDebug -np 4 pimpleDyMFoam -parallel

and selected the following options:

Choose running method: 0)normal 1)gdb+xterm 2)gdb 3)log 4)log+xterm 5)xterm+valgrind 6)nemiver: 1
Run all processes local or distributed? 1)local 2)remote: 2


It produced gdbCommands, mpirun.schema, processor0.sh, processor1.sh, processor2.sh and processor3.sh files.

How can I start the run on each processor?

If I just type processor0.sh in the terminal, I get the following in the processor0.log file and in the terminal:


-----------------------
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-48.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /zhome/83/d/74221/OpenFOAM/cuba-1.6-ext/applications/bin/linux64GccDPOpt/pimpleDyMFoam...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Detaching after fork from child process 22079.
[New Thread 0x7fffee345700 (LWP 22083)]
[Thread 0x7fffee345700 (LWP 22083) exited]

--> FOAM FATAL ERROR:
bool Pstream::init(int& argc, char**& argv) : attempt to run parallel on 1 processor

From function Pstream::init(int& argc, char**& argv)
in file Pstream.C at line 74.

FOAM aborting

Program received signal SIGABRT, Aborted.
0x00000030f8232885 in raise () from /lib64/libc.so.6
#0 0x00000030f8232885 in raise () from /lib64/libc.so.6
#1 0x00000030f8234065 in abort () from /lib64/libc.so.6
#2 0x00007ffff448d28b in Foam::error::abort() () from /appl/OpenFOAM/OpenFOAM-1.6-ext/lib/linux64GccDPOpt/libOpenFOAM.so
#3 0x00007ffff3b781da in Foam::Pstream::init(int&, char**&) () from /appl/OpenFOAM/OpenFOAM-1.6-ext/lib/linux64GccDPOpt/openmpi-system/libPstream.so
#4 0x00007ffff449a655 in Foam::argList::argList(int&, char**&, bool, bool) () from /appl/OpenFOAM/OpenFOAM-1.6-ext/lib/linux64GccDPOpt/libOpenFOAM.so
#5 0x00000000004252b3 in main ()
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.47.el6_2.5.x86_64 libgcc-4.4.5-6.el6.x86_64 libibverbs-1.1.4-2.el6.x86_64 librdmacm-1.0.10-2.el6.x86_64 libstdc++-4.4.5-6.el6.x86_64 zlib-1.2.3-25.el6.x86_64
(gdb)
--------------------------------



Could anyone give me more information on how to use mpirunDebug?

November 1, 2012, 04:40
  #7
New Member
 
Join Date: Oct 2012
Posts: 17
Rep Power: 3
cuba is on a distinguished road
Hi everyone,

I have not made any progress with the mpirun debugging yet, but I have another question.

How can I synchronize the processors before evaluating this line?

Code:
3- areaScalarField var2 = - fac::div(var1);

November 8, 2012, 10:14
  #8
New Member
 
Join Date: Oct 2012
Posts: 17
Rep Power: 3
cuba is on a distinguished road
Does anyone know how to make the processors wait for each other before evaluating a piece of code such as the one above?

November 10, 2012, 04:36
  #9
Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 7,088
Blog Entries: 32
Rep Power: 70
wyldckat is a jewel in the rough
Greetings Cuba,

In reply to your last post and PM you sent me:
  • An old thread on this subject: http://www.cfd-online.com/Forums/ope...-openfoam.html
  • Simpler code can be found in "applications/test/parallel/Test-parallel.C", more specifically the part that starts at:
    Code:
    Perr<< "\nStarting transfers\n" << endl;
    In that example, the slaves can be stalled with the block of code that starts with:
    Code:
    Perr<< "slave receiving from master "
    While the master can hold them back until the "for" loop that has this line is executed:
    Code:
    Perr << "master sending to slave " << slave << endl;
    The only problem with this is that the master will be the last one back on the job, since it has to first communicate with all other slaves, telling them to get back to work ...
Best regards,
Bruno

November 15, 2012, 07:03
  #10
New Member
 
Join Date: Oct 2012
Posts: 17
Rep Power: 3
cuba is on a distinguished road
Thanks for the reply, wyldckat.
It helped me get a better understanding of the Pstream commands.

I finally solved my problem (or at least worked around it).
Briefly, the problem was that one of the subdomains was entering the routine while the others were not, because the condition for entering the routine was not true for them.
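
The mismatch described above can be illustrated with a small standalone model. This is only a sketch and not OpenFOAM code: the ranks are simulated as entries of a vector, and the names globalOr and ranksEntering are hypothetical. The point is that a routine which communicates across processors has to be entered by all ranks or by none. If only some ranks enter, their messages end up matched against unrelated communication on the other ranks, which is one plausible way to get a "message truncated" error. Combining the local condition with a logical OR over all ranks makes every rank take the same branch. In OpenFOAM this combination would typically be written as reduce(flag, orOp<bool>()); whether the workaround used here was written that way is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Standalone model: each vector entry stands for one MPI rank (subdomain).

// Hypothetical helper: logical OR of the local flags over all "ranks".
// In a real parallel run this would be a collective reduction.
bool globalOr(const std::vector<bool>& localFlags)
{
    bool result = false;
    for (bool flag : localFlags)
    {
        result = result || flag;
    }
    return result;
}

// Count how many ranks would enter the communicating routine.
// useGlobalFlag == false: every rank decides from its own local condition.
// useGlobalFlag == true : every rank decides from the OR over all ranks.
std::size_t ranksEntering(const std::vector<bool>& localFlags, bool useGlobalFlag)
{
    assert(!localFlags.empty()); // the model needs at least one rank

    const bool global = globalOr(localFlags);
    std::size_t count = 0;
    for (bool local : localFlags)
    {
        if (useGlobalFlag ? global : local)
        {
            ++count;
        }
    }
    return count;
}
```

With the local flags {false, false, true, false}, ranksEntering(flags, false) gives 1, so one rank enters the routine alone, while ranksEntering(flags, true) gives 4, so all ranks enter together. A rank whose own condition is false then simply contributes its (possibly empty or zero) data to the operation.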

In the meantime, to find the maximum value of a variable defined on a patch over all the subdomains, I tried both gMax(var) and max(reduce(var, maxOp<scalarField>)). However, the value found by gMax was not the maximum; it was actually smaller than the one found with the reduce command. Has anyone noticed such a thing before?
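
As a point of reference for comparing the two results, the global maximum over a decomposed patch is the maximum of the per-processor maxima. The standalone sketch below models that definition with plain vectors. It is not OpenFOAM code, and localMax and globalMax are hypothetical names. As far as I know, gMax is meant to do the same thing by taking the local max and then reducing it with maxOp<scalar>, whereas a reduce with maxOp<scalarField> combines whole fields element by element, which is a different operation. One assumption worth stating: a processor that holds no faces of the patch must contribute a value that can never win, not zero, otherwise an all-negative field would report a wrong maximum.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Maximum of the values held by one "rank".
// An empty rank returns the lowest representable double so that it
// never wins the comparison against a rank that does hold values.
double localMax(const std::vector<double>& values)
{
    double result = std::numeric_limits<double>::lowest();
    for (double v : values)
    {
        result = std::max(result, v);
    }
    return result;
}

// Global maximum: the maximum of the local maxima over all "ranks".
// In a real parallel run the outer loop would be a collective reduction.
double globalMax(const std::vector<std::vector<double>>& valuesPerRank)
{
    assert(!valuesPerRank.empty()); // the model needs at least one rank

    double result = std::numeric_limits<double>::lowest();
    for (const std::vector<double>& values : valuesPerRank)
    {
        result = std::max(result, localMax(values));
    }
    return result;
}
```

For three ranks holding {-3.0, -1.5}, {} and {-2.0}, globalMax returns -1.5. If the empty rank contributed zero instead, the reported maximum would be 0, which is not a value of the field at all.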

Thanks again for the replies

November 20, 2012, 07:03
  #11
Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 7,088
Blog Entries: 32
Rep Power: 70
wyldckat is a jewel in the rough
Hi Cuba,

Quote:
Originally Posted by cuba View Post
In the meantime, to find the maximum value of a variable defined on a patch over all the subdomains, I tried both gMax(var) and max(reduce(var, maxOp<scalarField>)). However, the value found by gMax was not the maximum; it was actually smaller than the one found with the reduce command. Has anyone noticed such a thing before?
I haven't had the time to check on this yet... it sort-of looks like a bug, but without a test case that replicates the issue, it's hard to do any checking myself.
By the way, which of the two reported maximum values was actually correct?
Because it's also possible that one of them was actually picking up an outdated value!
For example, one of them might be picking up a value that was communicated between processes at the beginning of the iterations, but the maximum value was calculated only after those iterations, and since it was located between processes, the value is no longer up-to-date...

Best regards,
Bruno

All times are GMT -4.