CFD Online Discussion Forums

Problems with mpirun (http://www.cfd-online.com/Forums/openfoam/60973-problems-mpirun.html)

duderino April 7, 2005 04:51

Hi

I am running locally on a 2-processor machine. With other mpirun works fine. But now I get the following error message, and I don't know what "semop lock failed" means:


43 -> mpirun -np 2 Start_Par.sh
/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.0.2 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

Exec : interFoam . testnozzleinter3 -parallel
Date : Apr 07 2005
Time : 10:32:13
Host : cci00150
PID : 3134
Date : Apr 07 2005
Time : 10:32:13
Host : cci00150
PID : 3135
[1] Root : /usr2/tmp/ccgrueni/OpenFOAM/kloster-1.0.2/run/tutorials/interFoam
[0] Root : /usr2/tmp/ccgrueni/OpenFOAM/kloster-1.0.2/run/tutorials/interFoam
[0] Case : testnozzleinter3
[0] Nprocs : 2
[0] Slaves :
1
(
cci00150.3135
)

Create database

[1] Case : testnozzleinter3
[1] Nprocs : 2
Create mesh

Selecting movingFvMesh staticFvMesh

Reading environmentalProperties
Reading field pd

Reading field gamma

Reading field U

Reading/calculating face flux field phi

Reading transportProperties

Selecting incompressible transport model Newtonian
Selecting incompressible transport model Newtonian
Calculating field g.h

ICCG: Solving for pcorr, Initial residual = 1, Final residual = 9.88347e-11, No Iterations 292
time step continuity errors : sum local = 1.26951e-15, global = -5.15336e-18, cumulative = -5.15336e-18
Building global boundary list

Starting time loop

1 - MPI_RECV : Message truncated
[1] Aborting program !
[1] Aborting program!
OOPS: semop lock failed
425990



Any help is appreciated!!

mattijs April 7, 2005 05:52

Hi Duderino,

semops are used by mpi when you run on a shared memory machine. Things to check:
- are you picking up the correct (i.e. OpenFOAM) mpi libraries or are you using the system ones (use ldd to find out)
- You said with other it runs fine. Other machines or other mpi version?

It seems to be a system problem. We run on shared memory machines over here without problems.
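
A minimal sketch of the ldd check suggested above, written in Python. It assumes the solver (here interFoam) is on the PATH and that the relevant libraries have "mpi" or "lam" in their names; both the function names and that naming filter are illustrative assumptions, not part of OpenFOAM.

```python
import shutil
import subprocess


def mpi_libraries(ldd_output):
    """Return (library name, resolved path) for ldd lines that mention MPI or LAM."""
    hits = []
    for line in ldd_output.splitlines():
        line = line.strip()
        if "=>" not in line:
            continue
        name, _, target = line.partition("=>")
        name = name.strip()
        # Assumption: MPI libraries can be recognised by "mpi" or "lam" in the name.
        if "mpi" in name.lower() or "lam" in name.lower():
            # target looks like "/path/libmpich.so (0x...)" or "not found"
            path = target.strip().split(" (")[0]
            hits.append((name, path))
    return hits


def check_solver(solver="interFoam"):
    """Run ldd on the solver binary and report the MPI libraries it resolves to."""
    exe = shutil.which(solver)
    if exe is None:
        raise FileNotFoundError(f"{solver} is not on the PATH")
    result = subprocess.run(
        ["ldd", exe], capture_output=True, text=True, check=True
    )
    return mpi_libraries(result.stdout)
```

If the paths returned by check_solver() point into the system library directories rather than the OpenFOAM installation, the solver is probably picking up the system MPI.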

duderino April 7, 2005 07:15

Hi Mattijs

What I mean by other is actually another case (on the same machine with the same mpi version).

duderino April 7, 2005 07:22

So it seems to be a problem with the case, but the same case runs on a single processor?

mattijs April 7, 2005 07:26

Is this fully standard interFoam? Does it work with fully standard interFoam? What is this "Building global boundary list"?

I attach a simple script which starts up parallel jobs in different windows. Quite nice for debugging parallel cases. Call it like mpirun, e.g.:
lamrun -np 2 `which interFoam` root case -parallel
Attachment: lamrun

duderino April 7, 2005 09:00

Hi Mattijs

We couldn't get your script to run because I am using mpich instead of lam (which somehow doesn't work on our machine).

It actually was not fully standard interFoam. I modified it to calculate the mass flux over all boundaries, as explained in:

http://www.cfd-online.com/cgi-bin/Op...=1473#POST1473
I used the version for OpenFOAM 1.0.2: By Jarrod Sinclair on Wednesday, March 09, 2005 - 03:02 am


And this causes the problem somehow, because when I set it back to fully standard interFoam it works.
But I am still wondering why the modified interFoam works with the other case.

Thank you for your help!

mattijs April 7, 2005 09:11

Maybe because some of the domains do not have all of the original patches and the other case does (or something similar). Check your processor domains for which patches they have and compare to the case that runs.

Just out of interest: what machine are you on and do you know why lam doesn't work?
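
A rough sketch of the patch comparison suggested above. It assumes each processor domain keeps its patch list in processorN/constant/polyMesh/boundary and that every patch appears there as a name followed by a brace-delimited block; the function names are made up for illustration, and the regular expression is a guess at the file layout rather than a full parser.

```python
import re
from pathlib import Path

# Assumption: a dictionary entry is an identifier followed by "{",
# either on the same line or on the next one.
ENTRY = re.compile(r"^\s*([A-Za-z_]\w*)\s*\{", re.MULTILINE)


def patch_names(boundary_text):
    """Return the patch names found in the text of a polyMesh/boundary file."""
    names = ENTRY.findall(boundary_text)
    # The file header block is not a patch.
    return [name for name in names if name != "FoamFile"]


def patches_per_processor(case_dir):
    """Map each processorN directory of a decomposed case to its patch names."""
    result = {}
    for proc_dir in sorted(Path(case_dir).glob("processor*")):
        boundary = proc_dir / "constant" / "polyMesh" / "boundary"
        if boundary.is_file():
            result[proc_dir.name] = patch_names(boundary.read_text())
    return result
```

Running patches_per_processor() on both the failing case and the working case and comparing the dictionaries would show whether some domain is missing one of the original patches.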

duderino April 7, 2005 10:41

The machine: IA32 linux
The problem with lam: no idea. Somehow lam seems to start two separate processes, but they are not communicating with each other.

Sorry for not being of great help!

mattijs April 7, 2005 14:34

lamboot starts up one process ("lamd") per processor if I remember correctly, so you will see two processes. I don't see why they should not communicate with one another. We run shared memory machines over here without problems.

hartinger November 9, 2005 13:15

Hey,

I've got a low-tech approach for that: a Python script which reads in a log file and splits it up according to processor number (you have to use 'Pout'), plus a file for the rest without a processor number.
Attachment: splitParallel.py

markus
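
The attached script is not reproduced in this thread. The following is only a sketch of the same idea, not the original splitParallel.py: it assumes per-processor lines start with a "[N]" prefix, as in the logs quoted above, and the output file naming is an arbitrary choice.

```python
import re
from collections import defaultdict

# Assumption: per-processor output is prefixed with "[N]", as Pout produces.
PREFIX = re.compile(r"^\[(\d+)\]\s?(.*)$")


def split_by_processor(lines):
    """Split log lines into {processor: [lines]} and a list of unprefixed lines."""
    per_proc = defaultdict(list)
    rest = []
    for line in lines:
        line = line.rstrip("\n")
        match = PREFIX.match(line)
        if match:
            per_proc[int(match.group(1))].append(match.group(2))
        else:
            rest.append(line)
    return dict(per_proc), rest


def write_split(log_path):
    """Write <log>.procN for each processor and <log>.rest for everything else."""
    with open(log_path) as handle:
        per_proc, rest = split_by_processor(handle)
    for proc, content in per_proc.items():
        with open(f"{log_path}.proc{proc}", "w") as out:
            out.write("\n".join(content) + "\n")
    with open(f"{log_path}.rest", "w") as out:
        out.write("\n".join(rest) + "\n")
```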

melanie June 13, 2006 04:23

Hello,

As I had problems running in parallel, I wanted to use the lamrun script posted here by Mattijs with the solver oodles on 2 processors. However, all it does is open 4 xterms, each showing the following error message:

gdbCommands:1: Error in sourced command file:
No executable file specified.
Use the "file" or "exec-file" command.
(gdb)


Since that does not work, let me describe my original error: the decomposition using metis works fine, but the mpirun command stops while reading the mesh:

Create mesh, no clear-out for time = 0.0992

[1]
[1]
[1] --> FOAM FATAL ERROR : Cannot find patch edge with vertices (6 242) on patch procBoundary1to0
Can only find edges
3
(
(6 514)
(2 6)
(6 7)
)
connected to first vertex
[1]
[1] From function processorPolyPatch::updateMesh()
[1] in file meshes/polyMesh/polyPatches/derivedPolyPatches/processorPolyPatch/processorPolyPatch.C at line 351.
[1]
FOAM parallel run aborting
[1]
[0]
[0]
[0] --> FOAM FATAL ERROR : Cannot find patch edge with vertices (257 8) on patch procBoundary0to1
Can only find edges
3
(
(257 523)
(256 257)
(257 258)
)
connected to first vertex
[0]
[0] From function processorPolyPatch::updateMesh()
[0] in file meshes/polyMesh/polyPatches/derivedPolyPatches/processorPolyPatch/processorPolyPatch.C at line 351.
[0]
FOAM parallel run aborting
[0]
[1] Foam::error::printStack(Foam::Ostream&)
[1] Foam::error::abort()
[1] Foam::processorPolyPatch::updateMesh()
[1] Foam::polyBoundaryMesh::updateMesh()
[1] Foam::polyMesh::polyMesh(Foam::IOobject const&)
[1] Foam::fvMesh::fvMesh(Foam::IOobject const&)
[1] Foam::regIOobject::write(Foam::IOstream::streamFormat, Foam::IOstream::versionNumber, Foam::IOstream::compressionType) const
[1] __libc_start_main
[1] __gxx_personality_v0
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 32447 failed on node n0 (204.104.5.156) with exit status 1.
-----------------------------------------------------------------------------

I have never had this before...
Could anyone give me a hint, please?
Thanks!
melanie

mattijs June 13, 2006 05:50

Do you have cyclics? Try and make them non-cyclic or change the decomposition so all cyclics are within a single domain.

Or just change the FatalErrorIn..abort(FatalError) into a WarningIn..endl;

(This is to do with there being no one-to-one mapping between coupled edges if part of a cyclic is included in a processor patch.)

melanie June 13, 2006 08:28

Yes, I have cyclics that I must keep cyclic.
For the decomposition, I guess I have to do it by hand (manual), because the cyclics are parallel domains with a small gap in between?
About the third solution: do you think it would affect the results if some points are not linked to their corresponding point on the other boundary? Should I recompile only the function concerned, or more?

Thanks !

mattijs June 13, 2006 12:41

- manual decomposition indeed. It reads a labelIOList (which is formatted like e.g. faceProcAddressing in a decomposed case)

- the edge inconsistency might affect interpolation to edges which is used in postprocessing and not in oodles. No need to recompile anything but that code.
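
As a sketch of what such a manual decomposition file could look like, the Python below writes one processor index per cell in the count-then-parenthesised-list layout seen in the logs above. The FoamFile header fields, the object name cellDecomposition, and the slab rule used to pick processors are all assumptions made for illustration; compare the result against faceProcAddressing from a decomposed case, and replace the slab rule with whatever rule keeps the cyclic cells together in your geometry.

```python
def format_label_list(proc_ids, object_name="cellDecomposition"):
    """Render one processor index per cell as an ASCII label list.

    The header layout and the default object name are assumptions.
    """
    header = (
        "FoamFile\n"
        "{\n"
        "    version 2.0;\n"
        "    format ascii;\n"
        "    class labelList;\n"
        f"    object {object_name};\n"
        "}\n\n"
    )
    body = "\n".join(str(int(p)) for p in proc_ids)
    return f"{header}{len(proc_ids)}\n(\n{body}\n)\n"


def slice_along_axis(cell_centres, n_procs, axis=1):
    """Assign each cell to one of n_procs equal-width slabs along one axis.

    cell_centres is a hypothetical list of (x, y, z) tuples obtained elsewhere.
    """
    values = [centre[axis] for centre in cell_centres]
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid division by zero for a flat mesh
    return [
        min(int((value - low) / span * n_procs), n_procs - 1)
        for value in values
    ]
```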

melanie June 14, 2006 02:16

OK, I will then try the second; I have already made the change, but I cannot compile with wmake, as there is no Make folder...

melanie June 19, 2006 02:30

Hi,

Could someone just tell me how to compile this, as wmake is not suited?

Thanks !
melanie

maka August 14, 2009 12:43

I have the same problem, but it only shows up when using metis. Using simple as the decomposition method works even if the cyclic patches are divided among the processors. This is version 1.3. Here is the error I get when I run the solver on the case.

-----------------------------------------------------------

[3]
[3]
[3] --> FOAM FATAL ERROR : Cannot find patch edge with vertices (45 81) on patch procBoundary3to1
Can only find edges
3
(
(45 46)
(0 45)
(45 92)
)
connected to first vertex
[3]
[3] From function processorPolyPatch::updateMesh()
[3] in file meshes/polyMesh/polyPatches/derivedPolyPatches/processorPolyPatch/processorPolyPatch.C at line 351.
[3]
FOAM parallel run aborting
[3]
[3] Foam::error::printStack(Foam::Ostream&)
[3] Foam::error::abort()
[3] Foam::processorPolyPatch::updateMesh()
[3] Foam::polyBoundaryMesh::updateMesh()
[3] Foam::polyMesh::polyMesh(Foam::IOobject const&)
[3] Foam::fvMesh::fvMesh(Foam::IOobject const&)
[3] rotatingChannelOodles.08 [0x415efd]
[3] __libc_start_main
[3] __gxx_personality_v0
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 32670 failed on node n0 (172.20.253.241) with exit status 1.
-----------------------------------------------------------------------------


decomposePar log for cyclic in x and z of a channel:

Processor 0
Number of cells = 16082
Number of faces shared with processor 1 = 1696
Number of faces shared with processor 3 = 149
Number of faces shared with processor 2 = 889
Number of boundary faces = 1682

Processor 1
Number of cells = 15541
Number of faces shared with processor 3 = 960
Number of faces shared with processor 0 = 1696
Number of faces shared with processor 2 = 302
Number of boundary faces = 1348

Processor 2
Number of cells = 16123
Number of faces shared with processor 3 = 1894
Number of faces shared with processor 1 = 302
Number of faces shared with processor 0 = 889
Number of boundary faces = 1253

Processor 3
Number of cells = 16254
Number of faces shared with processor 2 = 1894
Number of faces shared with processor 1 = 960
Number of faces shared with processor 0 = 149
Number of boundary faces = 1541

Best regards,
Maka.
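
A small sketch for cross-checking a decomposePar log like the one above: it collects the "faces shared with processor" counts and reports any pair whose two sides disagree. It assumes the log uses exactly the line wording shown in this post; the function names are illustrative.

```python
import re

PROC = re.compile(r"^Processor (\d+)$")
SHARED = re.compile(r"^Number of faces shared with processor (\d+) = (\d+)$")


def shared_faces(log_text):
    """Map (processor, neighbour) to the shared face count found in the log."""
    counts = {}
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        match = PROC.match(line)
        if match:
            current = int(match.group(1))
            continue
        match = SHARED.match(line)
        if match and current is not None:
            counts[(current, int(match.group(1)))] = int(match.group(2))
    return counts


def asymmetric_pairs(counts):
    """Return pairs whose count differs from, or is missing on, the other side."""
    return sorted(
        (a, b) for (a, b), n in counts.items() if counts.get((b, a)) != n
    )
```

An empty result from asymmetric_pairs() only says that the interface sizes are consistent; it does not say anything about how cyclic faces were distributed.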

mabinty February 5, 2010 14:00

Dear all,

I'm observing similar problems when using cyclic BCs in a parallel run. The case, run with "chtMultiRegionFoam", consists of a quarter of a cylindrical domain. It runs on a single processor as well as in parallel when "metis" is applied as the decomposition method. With "scotch" as the decomposition method, decomposePar outputs the following:

Code:

Decomposing mesh air

Create time

Time = 0.001
Create mesh

Calculating distribution of cells
Selecting decompositionMethod scotch
ERROR: graphCheck: loops not allowed

and no decomposition of the region "air" is done. Any idea what the reason for that is? I greatly appreciate your comments,
Aram

