CFD Online Discussion Forums > OpenFOAM > Running in parallel
https://www.cfd-online.com/Forums/openfoam/60686-running-parallel.html

Rasmus Gjesing (Gjesing) February 24, 2005 10:41

Hi,

I am just testing the parallel feature on two nodes before setting up a (hopefully) bigger cluster.

But I ran into problems...

I use ssh with LAM and that works fine. tping returns correctly from my local and remote node.

But when I run mpirun,
mpirun -np 2 icoFoam $FOAM_RUN/tutorials/icoFoam cavityPar -parallel < /dev/null >& log &
as written in the guide, it returns the following...

P.S. I have decomposed the case, and the processor directories and files are present.

Any suggestions?

/Rasmus

/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  1.0.2                                 |
|   \\  /    A nd           | Web:      http://www.openfoam.org               |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/

Exec : icoFoam /home/rg/OpenFOAM/rg-1.0.2/run/tutorials/icoFoam cavityPar -parallel
/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  1.0.2                                 |
|   \\  /    A nd           | Web:      http://www.openfoam.org               |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/

Exec : icoFoam /home/rg/OpenFOAM/rg-1.0.2/run/tutorials/icoFoam cavityPar -parallel
Date : Feb 24 2005
Time : 16:28:24
Host : serie020-lease-041.intern.ipl
PID : 7119
Date : Feb 24 2005
Time : 16:28:24
Host : foamcalculator.intern.ipl
PID : 7331
[1] Root : /home/rg/OpenFOAM/rg-1.0.2/run/tutorials/icoFoam
[1] Case : cavityPar
[1] Nprocs : 2
[0] Root : /home/rg/OpenFOAM/rg-1.0.2/run/tutorials/icoFoam
[0] Case : cavityPar
[0] Nprocs : 2
[0] Slaves :
1
(
foamcalculator.intern.ipl.7331
)

Create database


--> FOAM FATAL ERROR : icoFoam: Cannot open case directory "/home/rg/OpenFOAM/rg-1.0.2/run/tutorials/icoFoam/cavityPar/processor1"


Function: argList::checkRootCase() const
in file: global/argList/argList.C at line: 511.

Create mesh for time = 0


FOAM parallel run exiting

Fabian Peng Kärrholm (Kärrholm) February 25, 2005 03:49

This is from someone who just started his first parallel case a few minutes ago, but from the error message it sounds like you haven't decomposed your case properly.
Have you checked that you have selected the same number of subdomains, processors, etc.? And that you have a polyMesh and a 0 directory in the processor0 and processor1 directories? And that they are all readable?
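For example, a quick way to check that from the case directory (just a sketch; the layout below is the standard one decomposePar produces):

ls -ld processor0 processor1                    # both directories must exist and be readable
ls processor0/0 processor0/constant/polyMesh    # decomposed fields and mesh should be in here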

Rasmus Gjesing (Gjesing) February 25, 2005 04:09

I am pretty sure my decomposition is OK, since I can run the decomposed case on one local processor. I think my problem is the connection to the remote node, even though I can tping it and boot LAM successfully.

Are you also using ssh or rsh?

/Rasmus

P.S. I am using Red Hat 9, if that adds extra knowledge.

Mattijs Janssens (Mattijs) February 25, 2005 04:14

Hi Rasmus,

Can your remote computer read your files? Is NFS working OK, or are the permission bits perhaps causing problems?
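A quick sanity check along those lines, using the hostname and case path from the log above, would be something like:

ssh foamcalculator.intern.ipl ls -l /home/rg/OpenFOAM/rg-1.0.2/run/tutorials/icoFoam/cavityPar

If that fails or shows nothing, the remote node cannot see the case files (an NFS or permissions problem).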

Mattijs

Rasmus Gjesing (Gjesing) February 25, 2005 05:34

Hi Mattijs,

I think you are on the track of something, because my NFS service was disabled. It is now enabled, but there are still problems...

I can run the decomposed case on my local node, but the remote causes problems. I have just tried running the case ONLY on the remote and now this is the error I get.


--> FOAM FATAL IO ERROR : Istream not OK for reading dictionary

file: /home/rg/OpenFOAM/rg-1.0.2/run/tutorials/icoFoam/cavityPar/system/decomposeParDict at line 1.

Function: dictionary::read(Istream&, const word&)
in file: db/dictionary/dictionaryIO.C at line: 44.

FOAM parallel run exiting

So, it is the permissions that are teasing me.

How can I fix this?

Regards,
Rasmus


BTW. my decomposeParDict is like this...

numberOfSubdomains  2;

method              simple;

simpleCoeffs
{
    n               (2 1 1);
    delta           0.001;
}

hierarchicalCoeffs
{
    n               (2 1 1);
    delta           0.001;
    order           xyz;
}

metisCoeffs
{
    processorWeights
    (
        1
        1
    );
}

manualCoeffs
{
    dataFile        "";
}

distributed         no;

roots
(
);

Rasmus Gjesing (Gjesing) March 2, 2005 11:36

Hi,

I can now run on two PCs, but I am not satisfied with my solution so far.

I have two identical users on the server and the client, let us say they are both named myUser. So both on the server and the client there exists a directory /home/myUser.

To get mpirun to run I have mounted server:/home/myUser as /home/myUser on the client. This is of course not so ideal, I think.

How should I set up the mounts so the application can run without installing OpenFOAM on all the nodes? Any suggestions?

/Rasmus

P.S. Got a 1.3 speedup for 2 nodes with normal network and one of the nodes being a laptop?!

Mattijs Janssens (Mattijs) March 2, 2005 12:59

Hi Rasmus

about your 1.3 speedup: that does not seem very surprising. We once tested the comms speed on my laptop and it was nowhere near 100Mb/s. For decent network speed and large enough cases the speedup will be much higher.

Mattijs

Eugene de Villiers (Eugene) March 2, 2005 18:47

Running on a cluster or any other parallel environment, you would generally use the following setup:

1. On ALL nodes, mount the partition with the user home directory via NFS. This does mean you must have the same user account on all nodes, but this can be easily accomplished via NIS if you have many nodes, or via a GUI or by editing the /etc/passwd file if you have two or three and don't feel comfortable with NIS. I don't know of any parallel code that uses a different method; this is the standard cluster setup.

2. On ALL nodes, mount the partition with the OpenFOAM installation via NFS. The mount must have the same name on all nodes, otherwise your startup script won't be able to find the OpenFOAM installation. Best practice is to use the automounter for the NFS mounts, since it will automatically make softlinks for local partition mounts (see "man automount"), which might otherwise cause problems. (Of course, if OpenFOAM is installed IN your user directory or on the same partition, you will only need one NFS mount per node; the rest still applies.)

3. Make sure you have passwordless ssh access to all nodes, including the master node. Passwordless ssh can be set up via ssh-keygen (see "man ssh-keygen"); a minimal sketch follows below.

Provided all machines meet the specs, the lot should work.
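For step 3, a rough sketch on a two-node setup (the hostname and username below are placeholders):

# on the master node: generate a key pair and copy the public key to every node
ssh-keygen -t rsa                # accept the defaults and leave the passphrase empty
ssh-copy-id myUser@node2         # repeat for each node, including the master itself
ssh myUser@node2 hostname        # should now log in and run the command without a password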

kärrholm March 9, 2005 10:31

I used to be able to run cases in parallel using SuSE, but when switching to Debian I got the same error as Rasmus Gjesing, namely:

--> FOAM FATAL IO ERROR : Istream not OK for reading dictionary

file: ../LesSmallCavityFine.p/system/decomposeParDict at line 1.

Function: dictionary::read(Istream&, const word&)
in file: db/dictionary/dictionaryIO.C at line: 44.


Did you ever find out what caused this? I have my files mounted using NIS, so they are the same on all three computers.

/Fabian

gjesing March 9, 2005 10:49

Hi Fabian,

Yes, I solved my problem. First of all, my NFS service wasn't running (minor detail ;-) ). Then I also created the same user on all the nodes and on the server, and mounted my home directory from the server as the home directory on each node. Then the nodes have access to all the files they need.
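For reference, a minimal sketch of that kind of export/mount setup (hostnames are placeholders and the service commands are distribution dependent):

# on the server: export the home directory (line in /etc/exports), then activate the export
#   /home/myUser    node1(rw,sync) node2(rw,sync)
exportfs -ra
service nfs start                 # make sure the NFS service is actually running

# on each node: mount it under the same path (line in /etc/fstab), then mount it
#   server:/home/myUser   /home/myUser   nfs   defaults   0 0
mount /home/myUser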

/Rasmus

rolando September 14, 2005 07:32

Hello everyone,
I've got some problems running OpenFOAM in parallel. Maybe someone can give me a hint.
I've written some routines which work quite well on a single processor but cause some problems when I try running them in parallel.
I'm working with a GeometricField and I want to determine its size and do some operations with each of its elements. For that I use GeometricField.size() and forAll(GeometricField, ele){ ... } (which also uses the size method, if I'm right).
The problem is that the size determined in parallel is much too small (about the size one would expect on just one processor).
Am I doing something wrong?

Rolando

hjasak September 14, 2005 07:49

You should rewrite your code for parallel:

In the domain decomposition mode, each processor only sees its own bit of the field or the mesh. Also, you cannot rely on data numbering (points, faces, cells, etc) because there is no mapping available between the local numbers on the processor and global numbering (unless you really really know what you're doing).

Hrv

rolando September 14, 2005 08:07

Thanks for the hint Hrvoje,
I think I've got it now. I had to remember my little MPI knowledge. I used the reduce( ... ) operation for determining the total field size. Now it works.
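For reference, a minimal sketch of that kind of size reduction (the names are made up; it assumes a GeometricField such as a volScalarField called field):

label nLocal = field.size();        // number of cells held by this processor only
label nGlobal = nLocal;
reduce(nGlobal, sumOp<label>());    // sum the local sizes over all processors

Info<< "local size = " << nLocal
    << ", global size = " << nGlobal << endl;

forAll(field, celli)
{
    // operations here see only this processor's part of the field
}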

Rolando

quinnr August 8, 2006 08:06

Quote:

    about your 1.3 speedup: that does not seem very surprising. We once tested the comms speed on my laptop and it was nowhere near 100Mb/s. For decent network speed and large enough cases the speedup will be much higher.

Hi, we're busy testing OpenFOAM 1.3 for use in high temperature metallurgy applications.

We have an existing cluster of four P4 2.2GHz nodes on a 100Mbps Ethernet network. My initial testing suggests that we're only going to start seeing a speed gain by using multiple nodes on very large problems - trying the default tutorial cases icoFoam/cavity and interFoam/damBreak in parallel on two nodes results in dismally slow performance, many times slower than solving locally using a single node.

Is this pretty typical for a slow interconnect like Fast Ethernet?

mattijs August 9, 2006 02:24

Yes, we found the same.
- Laptops have really bad networking. They do support the standard (e.g. 100 Mb/s), but the obtainable throughput is nowhere near that number.
- Small cases require low-latency interconnects.

You can play around with the scheduledTransfer, floatTransfer settings.
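For reference, these switches sit in the OptimisationSwitches section of OpenFOAM's global controlDict (under $WM_PROJECT_DIR/etc); the exact names and defaults vary between versions, so treat this only as an illustration:

OptimisationSwitches
{
    floatTransfer   0;              // 1 = transfer fields in single precision to save bandwidth
    commsType       nonBlocking;    // alternatives: blocking, scheduled
}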

Better is to speed up your interconnect. Gigabit Ethernet is cheap. Then you can look at some of the higher-performance (than LAM) MPI implementations. The lowest-latency public-domain one is MPI/GAMMA (very intrusive though). A commercial low-latency implementation I know is Scali-MPI. OpenMPI can do channel bonding (send through multiple ports at the same time), which helps bandwidth.

Beyond this there are the specialised interconnects. Very good but expensive.

quinnr August 10, 2006 00:19

Thank you for the feedback Mattijs.

I suspected that the latency would be an issue for small cases, I just wasn't sure whether the impact was that severe.

I think as a temporary measure we'll stick to running smaller cases in series or on the dual processor node, and rope in the cluster for very large cases that will need to run for days+.

I'll start budgeting for some better interconnects too :-)

Kind regards,
Quinn

fra76 July 5, 2007 02:30

The message says that you have an error on line 1, i.e. before the part that you posted here.
Your decomposeParDict file should start with:


/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  1.3                                   |
|   \\  /    A nd           | Web:      http://www.openfoam.org               |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/

FoamFile
{
    version         2.0;
    format          ascii;

    root            "";
    case            "";
    instance        "";
    local           "";

    class           dictionary;
    object          decomposeParDict;
}

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //


Check line 1, and post the whole file if you cannot find the error!

Francesco

hani July 5, 2007 03:02

Do you include the following line in your PBS script:

cd $PBS_O_WORKDIR

If not, you will not run your job from where you submit it.

You can also try specifying the whole path to your case instead of just '.' or '..'
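For reference, a minimal sketch of such a script (the job name and resource line are placeholders; it assumes the job is submitted from inside the case directory and an OpenFOAM version that only needs the -parallel flag):

#!/bin/bash
#PBS -N cavityPar
#PBS -l nodes=2:ppn=1

cd $PBS_O_WORKDIR                          # run from the directory the job was submitted from
mpirun -np 2 icoFoam -parallel > log 2>&1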

Håkan.

cedric_duprat July 16, 2007 06:25

Hi all,
I hope I am not "cutting" into the previous discussion here ...
I have a problem with my parallel runs.
I save the results every 10 time steps (for example) and it works correctly up to there (except continuity errors of 10e-13). Then OF writes the result fields (I think) and I get this message:
FOAM FATAL ERROR : NO_READ specified for read-constructor of object R of class IOobject
If I change the write interval to 15, the error arrives after 15 iterations ...
Did I forget something? Does someone have an idea?
Thanks for your help.

Cedric

ville September 29, 2007 10:49

Hi,
I'm struggling with the following problem: on a parallel computing system, a parallel simulation has to be started from a "login node". However, I would like to put my case directory on a partition that is a "project directory", which is directly visible to the login node.

1) If I run icoFoam on a single processor, this works.
2) However, if I try to run the same case with mpirun ..., it does not work and the following error occurs:

Cannot read "/home/u2/..../cavity/system/decomposeParDict"

The path in the declaration is right, but how should I set up the environment in order to make OF see the other partition?

Regards,
Ville

jonmec October 10, 2007 23:51

Hello ....

I'm a new user and I'm trying to run my case in parallel.

I've already decomposed my case, but I had problems when I tried to start LAM ...

Please, someone help me!

What is the command to start LAM?

cedric_duprat October 11, 2007 02:38

Hi Jonathas

OpenFOAM User Guide
3.4 Running applications in parallel
3.4.2 Running a decomposed case
3.4.2.1 Starting a LAM multicomputer (U-82)
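For reference, starting (and later stopping) a LAM multicomputer typically looks like this sketch (hostnames are placeholders):

# list the machines that will take part in the run
cat > machines <<EOF
node1
node2
EOF

lamboot -v machines     # start the LAM daemons on all listed hosts
lamnodes                # check that every node has joined
# ... run the decomposed case with mpirun ... -parallel ...
lamhalt                 # shut the LAM multicomputer down when finished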

regards,
cedric

gwierink November 5, 2008 05:27

Hi everyone,

I want to run my case in parallel, first with just two nodes on my laptop. When I try to decompose the case with decomposePar I get the error

/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  1.4.1                                 |
|   \\  /    A nd           | Web:      http://www.openfoam.org               |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/

Exec : decomposePar . bubbleCellpar
Date : Nov 05 2008
Time : 12:10:41
Host :
PID : 15597
Root : /home/gijsbert/OpenFOAM/run
Case : bubbleCellpar
Nprocs : 1
Create time

Time = 0
Create mesh


Calculating distribution of cells
Selecting decompositionMethod simple


--> FOAM FATAL ERROR : Wrong number of processor divisions in geomDecomp:
Number of domains : 2
Wanted decomposition : (2 2 1)

From function geomDecomp::geomDecomp(const dictionary& decompositionDict)
in file geomDecomp/geomDecomp.C at line 53.

FOAM exiting


My decomposeParDict looks like this:

numberOfSubdomains  2;

method              simple;

simpleCoeffs
{
    n               (1 2 1);
    delta           0.001;
}

hierarchicalCoeffs
{
    n               (1 1 1);
    delta           0.001;
    order           xyz;
}

metisCoeffs
{
    processorWeights
    (
        1
        1
    );
}

manualCoeffs
{
    dataFile        "";
}

distributed         no;

roots
(
);


From the error output it looks like I only use 1 processor while I try to divide the case over two. Can anyone help me with this problem?

Thank you in advance

sivakumar November 5, 2008 05:42

Hi,
can you tell me how many processors you are going to use?

sivakumar November 5, 2008 05:55

Hey man,
go to your decomposeParDict in the system directory, and in there set:

simpleCoeffs
{
    n       (2 1 1);
    delta   0.001;
}
...

then

metisCoeffs
{
    processorWeights
    (
        1
        1
    );
}

Just make these changes and then try to decompose; I hope it will work.

by
siva

dmoroian November 5, 2008 09:51

Hello Gijsbert,
It looks to me that you've chosen to have 2 partitions, and the splitting algorithm is simple. However, you specified that the algorithm should split in 2 along x, 2 along y, and 1 along z, which makes 4 partitions.
Although you show the correct setting:
Quote:

    simpleCoeffs
    {
        n       (1 2 1);
        delta   0.001;
    }

decomposePar sees it differently:
Quote:

    Wanted decomposition : (2 2 1)

so you probably modified the wrong file...

I hope this is helpful,
Dragos

gwierink November 6, 2008 06:56

Hi guys,

Many thanks for your replies.

@ siva:
I am trying to decompose my case to use the two cores of my dual core laptop, just to see if it runs. Soon I will get a quad core, so I want to be able to do parallel runs. For that I copied the /system/decomposeParDict from the interFoam/damBreak tutorial into my case directory and modified it to have 2 processors and a decomposition of (2 2 1). But somehow during the actual decomposition it does not work.

@ Dragos:
As described above, I have modified the /system/decomposeParDict file in the case directory and, as you write, during the decomposition that file is apparently not read, or read from elsewhere.

Any ideas? Does decomposePar perhaps look for decomposeParDict in a different place than I think?

Rgds, Gijsbert

sivakumar November 6, 2008 07:38

Hi,
you have 2 processors, but you are splitting your geometry into 4, so the command will not work. Use this instead:

simpleCoeffs
{
    n       (1 2 1);
    delta   0.001;
}

This is the right way. If you have tried it and you are still getting the problem, attach your problem here (I mean what the computer says).

Maybe we can have a look.

by siva

gwierink November 6, 2008 09:50

Hi siva,

Thank you for your reply. I did edit the decomposeParDict as you suggested, but apparently it did not save although I did press Ctrl+S. Perhaps it is most foolproof to actually close decomposeParDict before decomposing the case, so that it is saved for sure and no weird things happen. When I tried again today, everything worked fine! So I just ran my first parallel case successfully. Thanks for your quick replies.

Cheers, Gijs

eduardasilva March 11, 2009 08:38

Hello all,

I have been running some cases with simpleFoam in a cylinder using parallel implementations on a quad core processor. I would like to better understand when the parallel communication occurs between the 4 subdomains. Are the governing equations being solved in subdomain 1 and then sent to subdomain 2? How significant is it to use an implicit coupling method rather than an explicit one?

Thanks in advance,
Eduarda

prapanj March 11, 2009 08:59

Eduarda,

The communication occurs at the end of each time step. All 4 processors have a copy of the program (binary). The binary can identify the processor number on which it is running. Using this, the processor understands what part of the domain it has to solve. It uses the flow field values along the boundaries of the adjacent subdomains (which are on adjacent processors) as its own boundary conditions. So after each time step, the flow field values along the edges of the subdomains are exchanged. At the end of the simulation, the subdomains are composed back together to form the full domain.

I have mentioned just one way of doing it. For better efficiency, cells may also be distributed among processors in a round-robin fashion. But I hope you get a gist of what happens.

Refer to this book: Parallel Computing, by Barry Wilkinson and Michael Allen.
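For reference, this boundary exchange goes through the "processor" patches that decomposePar adds to each subdomain; a typical entry in processorN/constant/polyMesh/boundary looks roughly like this (the numbers are illustrative only):

procBoundary0to1
{
    type            processor;
    nFaces          20;
    startFace       1960;
    myProcNo        0;
    neighbProcNo    1;
}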

hellorishi March 12, 2009 08:19

Hello All,

Does somebody have an example of a decomposeParDict file using the "distributed yes;" and "roots" options?

I would like to use two nodes of a cluster to run OpenFOAM-1.5.x in parallel. I would like to use /tmp or /scratch of local cluster disks, instead of using the mounted ~/ to store the data.
I have enabled passwordless ssh.
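For reference, those entries typically look something like the sketch below (the paths are placeholders; the number of roots entries, one per slave processor, and the exact behaviour are version dependent, so check the decomposePar documentation for 1.5.x):

distributed     yes;

roots
(
    "/scratch/rishi"        // case root as seen by processor 1
    "/scratch/rishi"        // case root as seen by processor 2
    "/scratch/rishi"        // case root as seen by processor 3
);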

Thanks in advance,
Rishi

tomislav_maric April 13, 2009 15:39

I'm trying to run damBreak tutorial in parallel on a HP 6820s dual core laptop. I've found this thread:

http://www.cfd-online.com/Forums/ope...-core-cpu.html

that states it's worth the trouble. I changed the number of subdomains in "decomposeParDict" to 2. I'm using simple decomposition and set the coefficient "n" to (2 1 1). decomposePar runs fine, telling me the number of processors is 1 (nProc: 1, with a dual core?).

As a result I have two new directories: "processor0" and "processor1" with "0" and "constant" as their subdirectories. checkMesh tells me I have 2268 cells split in two on each "processor". The problem happens when I run "paraFoam -case processor0" (as read from page 64 in OF U-guide). Paraview starts fine, but when I try to import mesh data and click the Apply button, it shuts down and I get this error in console:

*** An error occurred in MPI_Bsend
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
[icarus:12344] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!


I have NO clue about parallel runs (or CPU architecture and its inner workings) and I'm running most of my cases on this laptop, so I wanted to speed things up at least a bit. What does this error mean? Also, if I try

"mpirun -np 1 interFoam -parallel > log &"

it won't work, but

"mpirun -np 2 interFoam -parallel > log &"

runs fine (the results are written in directories "processor0" and "processor1"). Now, my question is: why does decomposePar tell me that I have nProc: 1 (number of processors 1) and creates processor0 and processor1 directories, while mpirun works only with the argument -np 2? Am I doing something wrong?

mattijs April 14, 2009 17:23

The MPI_BSend message sounds like a bug. Probably some boundary condition that does an extraneous parallel communication even when not running parallel. Try reconstructPar + postprocessing on the undecomposed case instead.

2) the "nProc:1" message I assume comes from the header. decomposePar runs on one processor only. It can decompose for any number of processors as given in the decomposeParDict.

tomislav_maric April 14, 2009 18:09

Quote:

Originally Posted by mattijs (Post 212889)
The MPI_BSend message sounds like a bug. Probably some boundary condition that does an extraneous parallel communication even when not running parallel. Try reconstructPar + postprocessing on the undecomposed case instead.

Thank You, I've tried it already and I've seen that it works fine on the damBreak case. I was worried because I have a case that's pretty expensive and I wanted to try a parallel run on my laptop first.

Quote:

Originally Posted by mattijs (Post 212889)
2) the "nProc:1" message I assume comes from the header. decomposePar runs on one processor only. It can decompose for any number of processors as given in the decomposeParDict.

I guess it creates "processor0" and "processor1" for two mesh subdomains divided for one processor, but two cores? Again, I don't know enough details of computer architecture to understand this yet; the important thing is that it seems to be working fine for two days now. I'm running interFoam on a pretty heavy case, without complaints, so far. :D

Thank You,

Tomislav

azb162 March 31, 2011 18:21

Problem running parallel job
 
Hi Foamers,

I have been trying to run the biconic25-55Run35 tutorial on two processors.
I have used the simple decomposition scheme in the decomposePar utility to
decompose the mesh. When I run the decomposed case, it runs for a few time steps (5-10) and then it crashes. Can somebody help me debug this problem?

thanks ...

