CFD Online Discussion Forums

CFD Online Discussion Forums (https://www.cfd-online.com/Forums/)
-   OpenFOAM Running, Solving & CFD (https://www.cfd-online.com/Forums/openfoam-solving/)
-   -   Cluster OpenFOAM [Solved] (https://www.cfd-online.com/Forums/openfoam-solving/81235-cluster-openfoam-solved.html)

falcao October 20, 2010 13:54

Cluster OpenFOAM [Solved]
 
Important: network speed has a strong effect on how well the cluster performs. Networks where the signal has to travel a long path (such as the one at Unicamp) are not the best choice. Ideally, use a separate, dedicated network.


This assumes the computers are already networked under Linux.


CONFIGURING THE CLUSTER

1- In version 1.7.1, put the line . /opt/openfoam171/etc/bashrc AS THE FIRST LINE of the files /etc/profile, ~/.profile and ~/.bashrc (open them with sudo gedit /etc/profile, sudo gedit ~/.profile and gedit ~/.bashrc). Do this on every machine in the cluster.

2- Run decomposePar on every machine. The part of the program that uses the remote nodes runs on those nodes, so everything has to be there, identical to the local copy.

5- The time step must be the same on all machines. Everything, in both the solver and the tutorial, must be identical on all machines, because the problem is the same.

6- In the machines file (which can be created in the tutorial folder), list the machine names, each followed by the number of processors to be used on that machine.

Example

ubuntu1
ubuntu2 cpu=2

7- Type

mpirun --hostfile machines -np <number of processors> <solver name> -parallel
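
The value passed to -np has to agree with the processor counts listed in the machines file (and with the number of processor folders written by decomposePar). A minimal sketch of that check follows. The helper name count_slots is hypothetical, and accepting slots=N next to the thread's cpu=N is an addition here, since slots=N is the spelling documented for Open MPI hostfiles. Note that Open MPI spells the option --hostfile and that the solver flag is -parallel with no space.

```shell
# Sum the processor counts in a machines file.
# A line without a cpu=N (or slots=N) entry counts as one processor.
count_slots() {
    awk '
        /^[ \t]*(#|$)/ { next }              # skip comments and blank lines
        {
            n = 1
            for (i = 2; i <= NF; i++)
                if ($i ~ /^(cpu|slots)=[0-9]+$/) {
                    split($i, kv, "=")
                    n = kv[2]
                }
            total += n
        }
        END { print total + 0 }
    ' "$1"
}

# Usage, from inside the case folder (not run here):
#   np=$(count_slots machines)
#   mpirun --hostfile machines -np "$np" icoFoam -parallel
```

For the example file above (ubuntu1, then ubuntu2 cpu=2) the function prints 3.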



MONITORING THE NETWORK'S PROCESSORS (so you do not need a monitor on each machine)

Installing sar-------------------------------------------------------------

sudo apt-get update


sudo apt-get install atsar


Running sar: 4 samples at 1-second intervals-------------------------------

sar 1 4


Reading the report----------------------------------------------------------


17:54:58 %usr %sys %wio %idle
17:55:08 30 57 1 12
17:55:18 29 57 1 12
17:55:28 26 43 1 29

Average 29 53 1 18

The output shows that the system spent 29% of the time in user mode (most likely your applications), 53% in system
mode (OS-related, e.g., CPU-consuming libraries), 1% waiting for I/O requests, and
was idle 18% of the time. If %usr + %sys is close to 100%, there may be a CPU bottleneck.

You have to log in to each machine via ssh and run sar there to check whether that machine's processors are really working.
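
The ssh-and-sar check can be looped over the machines file rather than typed once per machine. This is only a sketch: the helper names list_hosts and monitor_hosts are hypothetical, and it assumes ssh logins to every machine already work and that sar is installed on each of them.

```shell
# Print the unique machine names from a machines file (first field of each line).
list_hosts() {
    awk '!/^[ \t]*(#|$)/ { print $1 }' "$1" | sort -u
}

# Run "sar 1 4" on every machine in turn.
# ssh -n keeps ssh from reading the loop's standard input.
monitor_hosts() {
    for h in $(list_hosts "$1"); do
        echo "== $h"
        ssh -n "$h" sar 1 4
    done
}

# Usage, from the case folder (not run here):
#   monitor_hosts machines
```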


POST-PROCESSING (reconstructPar)


I ran 1 core on machine ubuntu1 and two cores on machine ubuntu2. However, I ran decomposePar for three processors on both, so both machines ended up with the folders processor0, processor1 and processor2.

Since only 1 core ran on ubuntu1, on that machine only the processor0 folder was filled; processor1 and processor2 were empty, and that made reconstructPar fail. On ubuntu2, which ran two cores, processor0 was empty and processor1 and processor2 were filled, so calling reconstructPar directly failed there too. The solution was to gather the filled folders in one place over the network with a command like the one below, which copies the processor0 folder to ubuntu2. Once all the filled folders are together in a single tutorial, reconstructPar works. After that, run foamToVTK and use ParaView as usual.


scp -r /home/fulanodetal/OpenFOAM/fulanodetal-1.7.1/run/tutorials/incompressible/icoFoam/cavity/processor0 fulanodetal@ubuntu2:/home/fulanodetal/OpenFOAM/fulanodetal-1.7.1/run/tutorials/incompressible/icoFoam/cavity/
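
Before copying anything, it helps to see which processor folders are actually filled on each machine. The sketch below assumes that a "filled" folder is one holding time directories later than the initial one, and reports the latest time directory it finds in each processor folder. The helper name latest_times is hypothetical.

```shell
# For each processorN folder in a case, print its name and its latest
# time directory ("none" if it has no numerically named directories).
latest_times() {
    case_dir=${1:-.}
    for d in "$case_dir"/processor*/; do
        [ -d "$d" ] || continue
        t=$(ls "$d" \
            | grep -E '^[0-9]+(\.[0-9]+)?(e[-+]?[0-9]+)?$' \
            | LC_ALL=C sort -g | tail -n 1)
        echo "$(basename "$d") ${t:-none}"
    done
}

# Usage, from the case folder on each machine (not run here):
#   latest_times .
```

A folder that still reports only its starting time on a given machine was not written there and has to be copied from the machine that ran that processor.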

Have fun!

Lexx October 20, 2010 20:38

Summary in English, from Portuguese text above
 
1 - In version 1.7.1, put the line . /opt/openfoam171/etc/bashrc AS THE FIRST LINE of the files /etc/profile, ~/.profile and ~/.bashrc (sudo gedit /etc/profile, sudo gedit ~/.profile and gedit ~/.bashrc). Do this on every machine in the network you will use.

2 - In the machines file (which can be created in the tutorial folder), put each machine name followed by how many processors are used on that machine.

Example

ubuntu1
ubuntu2 cpu=2

3 - Run decomposePar on all machines.

4 - Type mpirun --hostfile machines -np <number of processors> <solver name> -parallel

MONITORING NETWORK PROCESSORS

sudo apt-get update

sudo apt-get install atsar

Run sar on each machine over ssh

%usr shows the time spent in user applications (like OpenFOAM)

RECONSTRUCTPAR

Gather the filled processor folders scattered across the cluster's machines into one place

Lexx October 20, 2010 20:41

All files related to the problem must be identical, in both the solver and the tutorial, on all cluster machines.

And yes, it works!

Cymbio October 20, 2010 21:55

Nice!

Thank you guy!

Cymbio October 21, 2010 09:26

Everyone in my group has successfully built an OpenFOAM cluster with home computers by following these instructions.

falcao February 14, 2011 07:56

An important observation: the PCG algorithm, usually used
to solve for the pressure, parallelizes poorly. We recommend the
GAMG algorithm when the case is run in parallel on more than one machine.

This means using

<variable>
{
solver GAMG;
tolerance 1e-06;
relTol 0.9;
smoother GaussSeidel;
cacheAgglomeration true;
nCellsInCoarsestLevel 20;
agglomerator faceAreaPair;
mergeLevels 1;
}

Instead of

<variable>
{
solver PCG;
preconditioner DIC;
tolerance 1e-06;
relTol 0;
}

maysmech February 15, 2011 14:09

Hi,
I want to use two computers to run in parallel.
I ran the commands given by Lexx, but I get this error.
What is the problem?
Thanks in advance

Quote:

maysam@maysam-desktop:~/OpenFOAM/maysam-1.7.0/run/icotest$ mpirun --hostfiles /host -np 8 icoFoam -parallel
--------------------------------------------------------------------------
mpirun was unable to launch the specified application as it could not find an executable:

Executable: --hostfiles
Node: maysam-desktop

while attempting to start process rank 0.

falcao February 15, 2011 18:01

This error often occurs when the line . /opt/openfoam171/etc/bashrc is not the first line of the files /etc/profile, ~/.profile and ~/.bashrc (sudo gedit /etc/profile, sudo gedit ~/.profile and gedit ~/.bashrc).

This has to be done on every machine you will run on.

falcao February 15, 2011 18:32

But I think that, in your case, the address of the solver is wrong.

Try

mpirun --hostfile machines -np <total number of processors> <solver name> -parallel

where machines is a .txt file in the tutorial folder that contains only

<name machine1> cpu=<number of processors in this machine>
.
.
.
<name machineN> cpu=<number of processors in this machine>

maysmech February 16, 2011 08:09

Quote:

This error often occurs when the line . /opt/openfoam171/etc/bashrc is not the first line of the files /etc/profile, ~/.profile and ~/.bashrc (sudo gedit /etc/profile, sudo gedit ~/.profile and gedit ~/.bashrc).

This has to be done on every machine you will run on.

Dear Falcao,
Thanks for your help.
I connected the two PCs with a crossover cable. I can reach the other one with the "ssh 192.168.1.1@maysam-desktop" command in a terminal. Is there anything else that should be done for the network?

falcao February 16, 2011 09:48

If you are using Ubuntu, open the file /etc/hostname.

This file contains the name of your machine. The name has to match the one used in the machines file, which is a .txt file in the OpenFOAM tutorial folder for the solver. The solver uses this file to find the other machines in the cluster. There are more details about the machines file earlier in this topic.

Setting up a basic network in Ubuntu is very easy: connect the machines using a switch (very cheap nowadays), then copy the text below into the hosts file in /etc.

The text is, for example, the following, and then your network is done.

<IP local machine> <name local machine > #core2duo

127.0.0.1 localhost.localdomain localhost
127.0.1.1 ubuntu.ubuntu-domain ubuntu

<IP local machine> <name local machine> #core2duo
<IP machine 2> <name machine 2> #notebook
<IP machine 3> <name machine 3> #athlon

# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

To get the IP of your machine, type ifconfig in a terminal. This has to be done on all machines of the cluster. Machines 2 and 3 are the other machines, that is, those that are not the local machine you are using at that moment.

Be careful, because a computer's IP may change when you restart it. In that case you will have to change the current IP back to the one you put in the text above.

The IP can be changed by typing: ifconfig eth0 inet <IP that you want>
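
Since every name in the machines file has to be resolvable on every machine, a quick cross-check against the hosts file can catch a missing entry before mpirun does. This is a sketch only: the helper name missing_hosts is hypothetical, it looks only at a hosts-format file (not at DNS), and it assumes that file is not empty.

```shell
# Print every machine name from the machines file that has no entry
# in the hosts file (default: /etc/hosts).
missing_hosts() {
    machines_file=$1
    hosts_file=${2:-/etc/hosts}
    awk '
        NR == FNR {                          # first file: the hosts file
            sub(/#.*/, "")                   # drop comments such as #core2duo
            for (i = 2; i <= NF; i++) known[$i] = 1
            next
        }
        !/^[ \t]*(#|$)/ && !($1 in known) { print $1 }
    ' "$hosts_file" "$machines_file"
}

# Usage, on each machine of the cluster (not run here):
#   missing_hosts machines
```

No output means every machine name has a hosts entry on the machine where the check was run.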

maysmech February 16, 2011 15:36

Thanks Falcao
I have Ubuntu 10.10 on both machines.
I have run all of your commands, but I still get the same error as before.
What do you think about it?

maysmech February 17, 2011 03:53

I have used two machines (laptop and desktop):
Quote:

maysam-desktop cpu = 6
maysam-laptop cpu = 2
Also, I added ". /opt/... " to the three files above on each machine.
One issue: after restarting Ubuntu I have a login problem, and I have to revert the files above to their previous state with the "sudo nano ..." command. Only then can I get into the graphical Ubuntu session.

The main problem is that this error appears when I try to run:
Quote:

maysam@maysam-desktop:~/OpenFOAM/maysam-1.7.0/run/icotest$ mpirun --hostfile host.txt -np 8 icoFoam -parallel
maysam@maysam-laptop's password:
--------------------------------------------------------------------------
mpirun was unable to launch the specified application as it could not find an executable:

Executable: icoFoam
Node: maysam-laptop

while attempting to start process rank 6.
--------------------------------------------------------------------------
maysam@maysam-desktop:~/OpenFOAM/maysam-1.7.0/run/icotest$
I added to /etc/hosts on both machines:
192.168.1.1 maysam-desktop
192.168.1.2 maysam-laptop
I also set the IPs of both machines to match the above.

Any help will be appreciated.

falcao February 18, 2011 09:41

Try to run

mpirun --hostfile host.txt -np 8 icoFoam -parallel

in the folder of the problem, for example

maysam@maysam-desktop:~/OpenFOAM/maysam-1.7.0/run/tutorials/multiphase/bubblecolumn

or Openfoam/maysam-1.7.0/run/tutorials/icofoam/cavity

I think the complete path is missing.

maysmech February 18, 2011 09:49

The icotest folder is the cavity case, and I was in that directory when I ran it.

If I put all the CPUs from one machine in host.txt, the run starts, but when the file includes both machines the error occurs.

So it is a network problem.

maysmech February 18, 2011 09:52

I can access the other PC with "ssh 192.168.1.1" in a terminal.

falcao February 18, 2011 11:45

Is there another identical machines file, with the machine names and number of processors, on the other machine?

maysmech February 18, 2011 12:30

There is another icotest folder at ~/OpenFOAM/maysam-1.7.0/run/icotest on the other PC.

host.txt is in this folder.

falcao February 18, 2011 13:15

Is the path of the solver and of the tutorial the same on all PCs?

Did you notice that the line is .<space>/opt... ?

It should work.

maysmech February 18, 2011 13:41

Quote:

Originally Posted by falcao (Post 295972)
Is the path of the solver and of the tutorial the same on all PCs?

What do you mean by the solver and tutorial path?
I installed OpenFOAM on my laptop from the Ubuntu package from openfoam.com and on the desktop from the source package, so the software is in /opt/openfoam170 on the laptop and in /home/maysam/openfoam1-7-0 on the desktop.
Is that what you mean?

Quote:

Did you notice that the line is .<space>/opt... ?
Yes.

falcao February 18, 2011 14:02

Maybe this is your problem!

Usually the solver path is /home/user/openfoam/user-1.7.0/applications/solvers/multiphase/bubblefoam

And the tutorial path is usually

/home/user/openfoam/user-1.7.0/run/tutorials/multiphase/bubblefoam/case

It must be something like this on all machines of the cluster.

falcao February 18, 2011 14:14

Try installing OpenFOAM from the Ubuntu package on all machines. I think it will work.
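
The "could not find an executable: icoFoam" message on the remote node can be reproduced without mpirun by asking each machine, over a non-interactive ssh command, where it finds the solver. This is a sketch under assumptions: the helper name check_solver and the RSH variable are hypothetical, and ssh logins to each machine are assumed to work. A likely reason the bashrc line has to come first in ~/.bashrc is that the stock Ubuntu ~/.bashrc stops early for non-interactive shells, so anything placed lower in the file is never read by commands started over ssh.

```shell
# Command used to reach a remote machine; ssh -n by default.
RSH="${RSH:-ssh -n}"

# Report, for each machine, where the solver is found by a
# non-interactive remote shell, or that it is not found at all.
check_solver() {
    solver=$1
    shift
    for h in "$@"; do
        if found=$($RSH "$h" "command -v $solver" 2>/dev/null); then
            echo "$h: $found"
        else
            echo "$h: $solver NOT FOUND"
        fi
    done
}

# Usage (not run here):
#   check_solver icoFoam maysam-desktop maysam-laptop
```

If a machine reports NOT FOUND, or the two machines report different install locations, the OpenFOAM environment on that machine is not being set up for remote commands.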

lordvon November 7, 2011 13:04

Will it work in 1.5-dev?
 
Will this procedure work in OF1.5-dev if the names are substituted appropriately?

I would try it myself, but I am waiting for my Ethernet switch to arrive, and I am looking for workarounds in case there are issues.

tbsetala February 15, 2013 09:26

You Sir saved my day!!!! Thank you very much!!!!!!!!

Yahoo June 17, 2013 21:57

Hi
I have been stuck on a solidification simulation run on multiple processors. The problem is that at the pressure reference cell (pRefCell) I get very odd behavior. Here is the liquid fraction contour when the simulation is performed on multiple processors.

Attachment 22780

Similar problems have been reported in:
(1) Poisson eq w setReference works serial diverges in parallel
http://www.cfd-online.com/Forums/ope...-parallel.html

(2) interfoam blows up on parallel run
http://www.cfd-online.com/Forums/ope...allel-run.html

(3) temperature anomaly at pressure reference cell
http://www.cfd-online.com/Forums/ope...ence-cell.html
What has been suggested in these posts is: (1) using the GAMG solver instead of PCG as the pressure solver in parallel runs, and (2) adjusting the fluxes after including the buoyancy term. I have applied both suggestions, although the second one does not make full sense to me, but I still have the problem.
Please let me know if you have any comments.

