CFD Online Discussion Forums

Running in parallel (http://www.cfd-online.com/Forums/openfoam/60686-running-parallel.html)

Rasmus Gjesing (Gjesing) February 24, 2005 11:41

Hi,

I am just testing the parallel feature on two nodes before setting up a hopefully bigger cluster.

But I ran into problems...

I use ssh with LAM and that works fine. tping returns correctly from my local and remote node.

But when I run mpirun,
mpirun -np 2 icoFoam $FOAM_RUN/tutorials/icoFoam cavityPar -parallel < /dev/null >& log &
as written in the guide, it returns the following...

P.S. I have decomposed the case, and the processor directories and files are present.

Any suggestions?

/Rasmus

/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.0.2 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

Exec : icoFoam /home/rg/OpenFOAM/rg-1.0.2/run/tutorials/icoFoam cavityPar -parallel
/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.0.2 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

Exec : icoFoam /home/rg/OpenFOAM/rg-1.0.2/run/tutorials/icoFoam cavityPar -parallel
Date : Feb 24 2005
Time : 16:28:24
Host : serie020-lease-041.intern.ipl
PID : 7119
Date : Feb 24 2005
Time : 16:28:24
Host : foamcalculator.intern.ipl
PID : 7331
[1] Root : /home/rg/OpenFOAM/rg-1.0.2/run/tutorials/icoFoam
[1] Case : cavityPar
[1] Nprocs : 2
[0] Root : /home/rg/OpenFOAM/rg-1.0.2/run/tutorials/icoFoam
[0] Case : cavityPar
[0] Nprocs : 2
[0] Slaves :
1
(
foamcalculator.intern.ipl.7331
)

Create database


--> FOAM FATAL ERROR : icoFoam: Cannot open case directory "/home/rg/OpenFOAM/rg-1.0.2/run/tutorials/icoFoam/cavityPar/processor1"


Function: argList::checkRootCase() const
in file: global/argList/argList.C at line: 511.Create mesh for time = 0


FOAM parallel run exiting

Fabian Peng Kärrholm (Kärrholm) February 25, 2005 04:49

This is from someone who just started his first parallel case a few minutes ago, but from the error message it sounds like you haven't decomposed your case properly.
Have you checked that you have selected the same number of subdomains, processors, etc.? And that you have a polyMesh and a 0 directory in the processor0 and processor1 directories? And that they are all readable?
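
The checks listed above can be scripted. What follows is only a minimal Python sketch, not an OpenFOAM tool: the function name check_decomposed_case is made up for illustration, and the assumption that polyMesh sits under constant/ inside each processor directory may not match every OpenFOAM version.

```python
import os
from pathlib import Path


def check_decomposed_case(case_dir, n_procs):
    """Return a list of human-readable problems found in a decomposed case."""
    case = Path(case_dir)
    problems = []
    for rank in range(n_procs):
        proc = case / f"processor{rank}"
        if not proc.is_dir():
            problems.append(f"{proc}: missing")
            continue
        # Directories each processor directory is expected to contain.
        # ASSUMPTION: polyMesh lives under constant/ in this layout.
        for sub in (proc / "0", proc / "constant" / "polyMesh"):
            if not sub.is_dir():
                problems.append(f"{sub}: missing")
            elif not os.access(sub, os.R_OK | os.X_OK):
                problems.append(f"{sub}: not readable")
    return problems
```

Running it on every node, with the same case path and with n_procs equal to numberOfSubdomains, should return an empty list everywhere. A node that reports processor directories as missing cannot see the files, which points to a mount problem rather than a decomposition problem.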

Rasmus Gjesing (Gjesing) February 25, 2005 05:09

I am pretty sure my decomposition is OK, since I can run the decomposed case on one local processor. I think my problem is the connection to the remote node, even though I can tping it and boot LAM successfully.

Are you also using ssh or rsh?

/Rasmus

P.S. I am using Red Hat 9, in case that is relevant.

Mattijs Janssens (Mattijs) February 25, 2005 05:14

Hi Rasmus,

Can your remote computer read your files? Is NFS working OK, or are the permission bits perhaps causing problems?

Mattijs

Rasmus Gjesing (Gjesing) February 25, 2005 06:34

Hi Mattijs,

I think you are on to something: my NFS was disabled. It is now enabled, but I still have problems...

I can run the decomposed case on my local node, but the remote causes problems. I have just tried running the case ONLY on the remote and now this is the error I get.


--> FOAM FATAL IO ERROR : Istream not OK for reading dictionary

file: /home/rg/OpenFOAM/rg-1.0.2/run/tutorials/icoFoam/cavityPar/system/decomposeParDict at line 1.

Function: dictionary::read(Istream&, const word&)
in file: db/dictionary/dictionaryIO.C at line: 44.

FOAM parallel run exiting

So, it is the permissions that are teasing me.

How can I fix this?

Regards,
Rasmus


BTW. my decomposeParDict is like this...

numberOfSubdomains 2;

method simple;

simpleCoeffs
{
    n (2 1 1);
    delta 0.001;
}

hierarchicalCoeffs
{
    n (2 1 1);
    delta 0.001;
    order xyz;
}

metisCoeffs
{
    processorWeights
    (
        1
        1
    );
}

manualCoeffs
{
    dataFile "";
}

distributed no;

roots
(
);

Rasmus Gjesing (Gjesing) March 2, 2005 12:36

Hi,

I can now run on two PCs, but I am not satisfied with my solution so far.

I have identical users on the server and the client; let us say they are named myUser. So a directory /home/myUser exists on both the server and the client.

To get mpirun to run I have mounted server:/home/myUser as /home/myUser on the client. This is of course not ideal, I think.

How should I set up the mounts so that the application can run without installing OpenFOAM on all the nodes? Any suggestions?

/Rasmus

P.S. I got a speedup of 1.3 on 2 nodes, with an ordinary network and one of the nodes being a laptop?!

Mattijs Janssens (Mattijs) March 2, 2005 13:59

Hi Rasmus

About your 1.3 speedup: that does not seem very surprising. We once tested the comms speed on my laptop and it was nowhere near 100 Mb/s. With a decent network speed and large enough cases the speedup will be much higher.

Mattijs

Eugene de Villiers (Eugene) March 2, 2005 19:47

Running on a cluster or any other parallel environment, you would generally use the following:

1. On ALL nodes, mount the partition with the user home directory via NFS. This does mean you must have the same user account on all nodes, but this can easily be accomplished via NIS if you have many nodes, or via a GUI or by editing the /etc/passwd file if you have two or three and don't feel comfortable with NIS. I don't know of any parallel code that uses a different method, and this is the standard cluster setup.

2. On ALL nodes, mount the partition with the OpenFOAM installation via NFS. The mount must have the same name on all nodes, otherwise your startup script won't be able to find the OpenFOAM installation. Best practice is to use the automounter for the NFS mounts, since it will automatically make softlinks for local partition mounts (see "man automount"), which might otherwise cause problems. (Of course, if OpenFOAM is installed IN your user directory or on the same partition, you will only need one NFS mount per node; the rest still applies.)

3. Make sure you have passwordless ssh access to all nodes, including the master node. Passwordless ssh can be set up via ssh-keygen (see "man ssh-keygen").

Provided all machines meet the specs, the lot should work.
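
One way to verify points 1 to 3 in a single pass is to ask every node, over non-interactive ssh, whether it can read the same path. The Python sketch below is only an illustration under stated assumptions: the helper names and the host names in the usage note are hypothetical, and it relies on the standard ssh option BatchMode=yes plus the POSIX test -r command being available on the nodes.

```python
import shlex
import subprocess


def build_check_command(host, path):
    """Build an ssh command that tests whether `path` is readable on `host`."""
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # so a missing passwordless login shows up as a failed check.
    return ["ssh", "-o", "BatchMode=yes", host, "test -r " + shlex.quote(path)]


def check_nodes(hosts, path, run=subprocess.run):
    """Return {host: True/False}: can each host log in and read `path`?"""
    results = {}
    for host in hosts:
        completed = run(build_check_command(host, path))
        results[host] = (completed.returncode == 0)
    return results
```

For example, check_nodes(["node1", "node2"], "/home/myUser/OpenFOAM") would be expected to report True for every node once the NFS mounts and ssh keys are in place. A False does not say which of the three points failed, only that at least one did.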

kärrholm March 9, 2005 11:31

I used to be able to run cases in parallel using SUSE, but after switching to Debian I got the same error as Rasmus Gjesing, namely:

--> FOAM FATAL IO ERROR : Istream not OK for reading dictionary

file: ../LesSmallCavityFine.p/system/decomposeParDict at line 1.

Function: dictionary::read(Istream&, const word&)
in file: db/dictionary/dictionaryIO.C at line: 44.


Did you ever find out what caused this? I have my files mounted using NIS, so they are the same on all three computers.

/Fabian

gjesing March 9, 2005 11:49

Hi Fabian,

Yes, I solved my problem. First of all, my NFS service wasn't running (minor detail ;-) ). Then I also created the same user on all the nodes and on the server, and mounted my home directory from the server as the home directory on each node. That way the nodes have access to all the files they need.

/Rasmus

rolando September 14, 2005 08:32

Hello everyone,
I've got some problems running OpenFOAM in parallel. Maybe someone can give me a hint.
I've written some routines which work quite well on a single processor but cause problems when I try running them in parallel.
I'm working with a GeometricField and I want to determine its size and do some operations on each of its elements. For this I use GeometricField.size() and forAll(GeometricField, ele){ ... } (which also uses the size method, if I'm right).
The problem is that the size determined in parallel is much too small (about the size one would expect on just one processor).
Am I doing something wrong?

Rolando

hjasak September 14, 2005 08:49

You should rewrite your code for parallel:

In the domain decomposition mode, each processor only sees its own bit of the field or the mesh. Also, you cannot rely on data numbering (points, faces, cells, etc) because there is no mapping available between the local numbers on the processor and global numbering (unless you really really know what you're doing).

Hrv

rolando September 14, 2005 09:07

Thanks for the hint Hrvoje,
I think I've got it now. I had to recall what little MPI knowledge I have. I used the reduce( ... ) operation to determine the total field size. Now it works.

Rolando
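
The idea behind that fix can be shown without OpenFOAM or MPI. The sketch below is plain Python and does not use the OpenFOAM API: each list entry stands for the size() one processor sees locally, and the sum-reduction combines them into the global size. The cell counts are invented for illustration. In OpenFOAM itself a sum-reduction over a label is commonly written along the lines of reduce(n, sumOp<label>()), but treat that spelling as an assumption to check against your version.

```python
from functools import reduce
import operator


def global_size(local_sizes):
    """Combine per-processor sizes into one total, as a sum-reduction would."""
    return reduce(operator.add, local_sizes, 0)


# Hypothetical decomposition: 400 cells split evenly over two processors.
local_sizes = [200, 200]

# Each processor sees only its own entry, which is why size() looked too small.
total = global_size(local_sizes)
print(total)
```

Note that this only recovers the total count. As Hrvoje points out, it gives no mapping between local and global element numbers.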

quinnr August 8, 2006 09:06

Mattijs wrote: "about your 1.3 speedup: that does not seem very surprising. We once tested the comms speed on my laptop and it was nowhere near 100Mb/s. For decent network speed and large enough cases the speedup will be much higher."

Hi, we're busy testing OpenFOAM 1.3 for use in high temperature metallurgy applications.

We have an existing cluster of four P4 2.2GHz nodes on a 100Mbps Ethernet network. My initial testing suggests that we're only going to start seeing a speed gain by using multiple nodes on very large problems - trying the default tutorial cases icoFoam/cavity and interFoam/damBreak in parallel on two nodes results in dismally slow performance, many times slower than solving locally using a single node.

Is this pretty typical for a slow interconnect like Fast Ethernet?

mattijs August 9, 2006 03:24

Yes, we found the same.
- Laptops have really bad networking. They do support the standard (e.g. 100 Mb/s) but the obtainable throughput is nowhere near that number.
- Small cases require low-latency interconnects.

You can play around with the scheduledTransfer, floatTransfer settings.

It is better to speed up your interconnect. Gigabit Ethernet is cheap. Then you can look at some of the MPI implementations with higher performance than LAM. The lowest-latency public domain one is MPI/GAMMA (very intrusive though). A commercial low-latency implementation I know of is Scali-MPI. OpenMPI can do channel bonding (sending through multiple ports at the same time), which helps bandwidth.

Beyond this there are the specialised interconnects. Very good but expensive.
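
Why small cases suffer so much on a slow interconnect can be seen from a very rough cost model: time per step is compute time divided over the nodes, plus a fixed latency and a transfer time for every message. The Python sketch below is only that model. The function name and every number in it are assumptions chosen for illustration, not measurements from this thread or properties of OpenFOAM.

```python
def estimated_speedup(serial_time, n_nodes, n_messages,
                      latency, message_bytes, bandwidth):
    """Speedup predicted by a simple compute-plus-communication cost model.

    Times are in seconds and bandwidth in bytes per second. All inputs are
    illustrative assumptions.
    """
    compute = serial_time / n_nodes
    communicate = n_messages * (latency + message_bytes / bandwidth)
    return serial_time / (compute + communicate)


# ASSUMED numbers: 100 Mb/s is 12.5e6 bytes/s; 0.1 ms latency per message.
small_case = estimated_speedup(0.01, 2, 100, 1e-4, 1_000, 12.5e6)
large_case = estimated_speedup(10.0, 2, 100, 1e-4, 100_000, 12.5e6)
print(small_case, large_case)
```

With these made-up inputs the small case comes out slower on two nodes than on one, because the per-message cost exceeds the compute time saved, while the large case still gains. Lower latency or higher bandwidth shrinks the communication term, which is the point of the advice above.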

quinnr August 10, 2006 01:19

Thank you for the feedback Mattijs.

I suspected that the latency would be an issue for small cases, I just wasn't sure whether the impact was that severe.

I think as a temporary measure we'll stick to running smaller cases in serial or on the dual-processor node, and rope in the cluster for very large cases that will need to run for days or more.

I'll start budgeting for some better interconnects too :-)

Kind regards,
Quinn

fra76 July 5, 2007 03:30

The message says that you have an error on line "1", so before the part that you posted here.
Your decomposeParDict file should start with:


/*---------------------------------------------------------------------------*\
| ========= | |
| \ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \ / O peration | Version: 1.3 |
| \ / A nd | Web: http://www.openfoam.org |
| \/ M anipulation | |
\*---------------------------------------------------------------------------*/

FoamFile
{
    version 2.0;
    format ascii;

    root "";
    case "";
    instance "";
    local "";

    class dictionary;
    object decomposeParDict;
}

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //


Check line 1, and post the whole file if you cannot find the error!

Francesco
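
An "Istream not OK" error at line 1 can also mean the file could not be opened at all on that node, so it may help to separate "cannot read" from "header is wrong". The Python sketch below is a hypothetical helper, not part of OpenFOAM. It only checks that the file opens and that a FoamFile header with the expected object entry is present.

```python
import re
from pathlib import Path


def check_foam_dictionary(path, expected_object):
    """Return a list of problems with an OpenFOAM-style dictionary file."""
    try:
        text = Path(path).read_text(errors="replace")
    except OSError as err:
        # Covers missing files and permission problems alike.
        return [f"cannot read {path}: {err.strerror}"]
    problems = []
    if not re.search(r"^\s*FoamFile\b", text, re.MULTILINE):
        problems.append("no FoamFile header found")
    pattern = r"\bobject\s+" + re.escape(expected_object) + r"\s*;"
    if not re.search(pattern, text):
        problems.append(f"no 'object {expected_object};' entry found")
    return problems
```

Calling check_foam_dictionary(".../system/decomposeParDict", "decomposeParDict") on each node, with the real case path filled in, distinguishes a file that a node cannot see from one whose first lines are malformed.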

hani July 5, 2007 04:02

Do you include the following line in your PBS script:

cd $PBS_O_WORKDIR

If not, your job will not run from the directory you submitted it from.

You can also try specifying the whole path to your case instead of just '.' or '..'.

Håkan.
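
Both suggestions amount to making the case path absolute before the solver starts. As a sketch only, the hypothetical Python helper below resolves a relative case path against PBS_O_WORKDIR, the directory the job was submitted from, and falls back to the current directory when that variable is not set.

```python
import os


def resolve_case_path(case, environ=os.environ):
    """Turn a possibly relative case path into an absolute one.

    Relative paths are resolved against PBS_O_WORKDIR when it is set,
    otherwise against the current working directory.
    """
    base = environ.get("PBS_O_WORKDIR", os.getcwd())
    # os.path.join keeps `case` unchanged if it is already absolute.
    return os.path.normpath(os.path.join(base, case))
```

The resulting absolute path can then be passed to the solver as the root and case arguments, so the result no longer depends on which directory the batch system starts the job in.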

cedric_duprat July 16, 2007 07:25

Hi all,
I hope I am not interrupting the previous discussion...
I have a problem with my parallel runs.
I save the results every 10 time steps (for example), and it works correctly up to that point (apart from continuity errors of 10e-13). Then OF writes the result files (I think), and then I get this message:
FOAM FATAL ERROR : NO_READ specified for read-constructor of object R of class IOobject
If I change the write interval to 15, the error occurs after 15 iterations...
Did I forget something? Does anyone have an idea?
Thanks for your help.

Cedric

ville September 29, 2007 11:49

Hi,
I'm struggling with the following problem: on a parallel computing system, a parallel simulation has to be started from a "login node". However, I would like to put my case directory on a partition (a "project directory") which is directly visible to the login node.

1) If I run icoFoam on a single processor, this works.
2) However, if I try to run the same case with mpirun .... it does not work, and the following error occurs:

Cannot read "/home/u2/..../cavity/system/decomposeParDict"

The path in the message is correct, but how should I set up the environment in order to make OF see the other partition?

Regards,
Ville

