Cluster OpenFOAM [Solved]
Important: network speed strongly affects how effective the cluster is. Networks where the signal has a long path to travel (such as Unicamp's) are not the most suitable. It is best to have a separate, dedicated network.
With the computers networked under Linux:

CONFIGURING THE CLUSTER

1- In version 1.7.1, put the line
    . /opt/openfoam171/etc/bashrc
AS THE FIRST LINE of the files opened with sudo gedit /etc/profile , sudo gedit ~/.profile and gedit ~/.bashrc. Do this on every machine in the cluster.

2- Run decomposePar on every machine. The part of the program that uses the remote nodes runs on those nodes, so everything must be present there, identical to the local copy.

5- The time step must be the same on every machine. Everything, in both the solver and the tutorial, must be identical on all machines, because the problem is the same.

6- In the machines file (which can be created in the tutorial folder), list the machine names, each followed by the number of processors to be used on that machine. Example:
    ubuntu1
    ubuntu2 cpu=2

7- Type
    mpirun --hostfile machines -np <number of processors> <solver name> -parallel

MONITORING THE NETWORK'S PROCESSORS (so that you do not need a monitor on each machine)

Installing sar:
    sudo apt-get update
    sudo apt-get install atsar

Running sar (1-second interval, 4 samples):
    sar 1 4

Reading the report:
    17:54:58   %usr   %sys   %wio   %idle
    17:55:08     30     57      1      12
    17:55:18     29     57      1      12
    17:55:28     26     43      1      29
    Average      29     53      1      18

The output shows that the system spent 29% of the time in user mode (most likely your applications), 53% in system mode (OS-related, e.g., CPU-consuming libraries), 1% waiting for I/O requests, and was idle 18% of the time. If %usr + %sys = 100%, there may be a CPU bottleneck.

You have to log in to each machine via ssh and run sar there to check whether that machine's processors are really working.

POST-PROCESSING (reconstructPar)

I ran 1 core on machine ubuntu1 and two cores on machine ubuntu2. However, I ran decomposePar for three processors on both machines, so both ended up with the folders processor0, processor1 and processor2. Because only 1 core ran on ubuntu1, only processor0 was filled on that machine; processor1 and processor2 were empty, and that made reconstructPar fail. On ubuntu2, which ran two cores, processor0 was empty and processor1 and processor2 were filled, so calling reconstructPar directly failed there too.

The solution was to bring the filled folders together in one place over the network, with a command like the one below, which copies the processor0 folder to machine ubuntu2. Once all the filled folders are in a single tutorial, reconstructPar works. After that, run foamToVTK and use ParaView as usual.

    scp -r /home/fulanodetal/OpenFOAM/fulanodetal-1.7.1/run/tutorials/incompressible/icoFoam/cavity/processor0 fulanodetal@ubuntu2:/home/fulanodetal/OpenFOAM/fulanodetal-1.7.1/run/tutorials/incompressible/icoFoam/cavity/

Have fun! |
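The gathering step described above could be scripted roughly as follows. This is only a sketch: the function names has_results and send_filled are made up for illustration, and it assumes that a "filled" processor folder is one that contains at least one time directory other than the initial 0.

```shell
#!/bin/sh
# Succeeds if a processor directory holds at least one time directory other
# than the initial "0" (assumption: that is what "filled" means here).
has_results() {
    for t in "$1"/*; do
        [ -d "$t" ] || continue
        name=${t##*/}
        case $name in
            0|constant) continue ;;
            [0-9]*) return 0 ;;
        esac
    done
    return 1
}

# Hypothetical helper: run inside the case folder on the machine that should
# send its results. Copies every filled processorN folder to the same case
# folder on the destination machine.
#   $1 = destination, e.g. fulanodetal@ubuntu2
#   $2 = case folder on the destination machine
send_filled() {
    dest=$1
    remote_case=$2
    for p in processor*; do
        if has_results "$p"; then
            scp -r "$p" "$dest:$remote_case/"
        fi
    done
}
```

On ubuntu1 you would cd into the cavity case and call send_filled fulanodetal@ubuntu2 followed by the case path on ubuntu2, then run reconstructPar on ubuntu2.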
Summary in English of the instructions above
1- In version 1.7.1, put the line . /opt/openfoam171/etc/bashrc AS THE FIRST LINE of the files opened with sudo gedit /etc/profile , sudo gedit ~/.profile and gedit ~/.bashrc. Do this on all machines in the network you will use.
2- In the machines file (which can be created in the tutorial folder), put each machine name followed by the number of processors used on that machine. Example:
    ubuntu1
    ubuntu2 cpu=2
3- Run decomposePar on all machines.
4- Type
    mpirun --hostfile machines -np <number of processes> <solver name> -parallel

MONITORING NETWORK PROCESSORS
    sudo apt-get update
    sudo apt-get install atsar
Run sar on each machine via ssh. %usr shows the time spent in user applications (like OpenFOAM).

RECONSTRUCTPAR
Bring together the filled processor folders scattered across the cluster's machines. |
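Since the number passed to -np has to agree with the machines file, it can be computed from the file instead of typed by hand. A minimal sketch, assuming the file format shown above (one host per line, optional cpu=N, one processor if cpu= is missing); count_procs and run_parallel are illustrative names, not OpenFOAM or MPI commands.

```shell
#!/bin/sh
# Sum the processor counts in a "machines" file whose lines look like
#   ubuntu1
#   ubuntu2 cpu=2
# A host with no cpu=N entry counts as one processor (assumption).
count_procs() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }           # skip blank and comment lines
        {
            n = 1                                # default: one processor
            for (i = 2; i <= NF; i++)
                if ($i ~ /^cpu=[0-9]+$/) { split($i, kv, "="); n = kv[2] }
            total += n
        }
        END { print total + 0 }
    ' "$1"
}

# Hypothetical wrapper: launch a solver on every processor listed in the file.
#   $1 = machines file, $2 = solver name (e.g. icoFoam)
run_parallel() {
    machines=$1
    solver=$2
    mpirun --hostfile "$machines" -np "$(count_procs "$machines")" "$solver" -parallel
}
```

From the case folder, run_parallel machines icoFoam would then start three processes for the example file above. Remember that decomposePar must have been run for that same number of processors.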
All files related to the problem, in both the solver and the tutorial, must be identical on all cluster machines.
And yes, it works! |
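One rough way to check that the case files really are identical is to compute a single fingerprint of the case on each machine and compare the results. This is a sketch under assumptions: case_fingerprint is a made-up name, only the system and constant folders are covered, and file names are assumed to contain no spaces.

```shell
#!/bin/sh
# Print one checksum line that summarises every file under system/ and
# constant/ of the given case folder. Two machines whose case files are
# identical print the same line.
case_fingerprint() {
    (
        cd "$1" || exit 1
        # Fixed sort order so every machine lists the files the same way.
        find system constant -type f | LC_ALL=C sort | xargs cksum | cksum
    )
}
```

Run case_fingerprint on the case folder on every machine (for example over ssh) and compare the printed lines. Any difference means some file in system or constant is not the same everywhere.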
Nice!
Thank you guy! |
Everyone in my group has successfully built an OpenFOAM cluster with home computers by following these instructions.
|
An important observation: the PCG algorithm, usually used to calculate the pressure, does not parallelize well. We recommend using the GAMG algorithm when the case is run in parallel on more than one machine. This means using

    <variable>
    {
        solver                  GAMG;
        tolerance               1e-06;
        relTol                  0.9;
        smoother                GaussSeidel;
        cacheAgglomeration      true;
        nCellsInCoarsestLevel   20;
        agglomerator            faceAreaPair;
        mergeLevels             1;
    }

instead of

    <variable>
    {
        solver          PCG;
        preconditioner  DIC;
        tolerance       1e-06;
        relTol          0;
    } |
Hi
I want to use two computers to run in parallel. I have run the commands given by Lex, but I get this error. What is the problem? Thanks in advance. Quote:
|
This error often occurs when you do not put the line . /opt/openfoam171/etc/bashrc as the first line in the files: sudo gedit /etc/profile , sudo gedit ~/.profile and gedit ~/.bashrc
This has to be done on every machine you will run on. |
But I think that, in your case, the path to the solver is wrong.
Try
    mpirun --hostfile machines -np <total number of processors> <solver name> -parallel
where machines is a .txt file in the tutorial folder that contains only
    <name machine1> cpu=<number of processors in this machine>
    .
    .
    .
    <name machineN> cpu=<number of processors in this machine> |
Quote:
Thanks for your help. I connected the two PCs with a crossover cable. I can access the other one with the "ssh 192.168.1.1@maysam-desktop" command in a terminal. Is there anything else that has to be done for the network? |
If you are using Ubuntu, go to File System/etc/hostname
This file contains the name of your machine. The name has to match the one used in the machines file, which is a .txt file in the OpenFOAM tutorial folder of the solver. The solver uses this file to find the other machines in the cluster. There are more details about the machines file earlier in this topic.

Setting up a basic network in Ubuntu is very easy: connect the machines through a switch (they are very cheap nowadays) and copy text like the example below into the hosts file in File System/etc. Then your network is done.

    <IP local machine> <name local machine > #core2duo
    127.0.0.1 localhost.localdomain localhost
    127.0.1.1 ubuntu.ubuntu-domain ubuntu
    <IP local machine> <name local machine> #core2duo
    <IP machine 2> <name machine 2> #notebook
    <IP machine 3> <name machine 3> #athlon

    # The following lines are desirable for IPv6 capable hosts
    ::1 localhost ip6-localhost ip6-loopback
    fe00::0 ip6-localnet
    ff00::0 ip6-mcastprefix
    ff02::1 ip6-allnodes
    ff02::2 ip6-allrouters
    ff02::3 ip6-allhosts

To get the IP of your machine, type ifconfig in a terminal. This has to be done on all machines of the cluster. Machines 2 and 3 are the machines other than the local one, that is, other than the machine you are using at that moment.

Pay attention, because a computer may change its IP when you restart it. In that case you will have to set the current IP back to the IP that you put in the text above. The IP can be changed by typing:
    ifconfig eth0 inet <IP that you want to be> |
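Before launching mpirun it may help to confirm that every name in the machines file actually resolves on the machine you are sitting at. A small sketch, assuming a Linux system where getent is available; list_hosts and check_hosts are illustrative names.

```shell
#!/bin/sh
# Print the host name (first field) of every non-blank, non-comment line
# of a machines file.
list_hosts() {
    awk 'NF > 0 && $1 !~ /^#/ { print $1 }' "$1"
}

# Report whether each host name resolves (via /etc/hosts or DNS).
# Returns non-zero if at least one name does not resolve.
check_hosts() {
    status=0
    for h in $(list_hosts "$1"); do
        if getent hosts "$h" > /dev/null; then
            echo "$h: resolves"
        else
            echo "$h: NOT found in /etc/hosts or DNS"
            status=1
        fi
    done
    return $status
}
```

Running check_hosts machines on each machine of the cluster shows which names are still missing from that machine's hosts file.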
Thanks Falcao
I have Ubuntu 10.10 on both machines. I have followed all of your instructions, but I still get the same error as before. What do you think about it? |
I have used two machines (laptop and desktop):
Quote:
One question: after restarting Ubuntu I have a login problem, and I have to restore the files above to their previous state with the "sudo nano ..." command before I can log in to the Ubuntu graphical session. The more important problem is that this error appears when I attempt to run: Quote:
    192.168.1.1 maysam-desktop
    192.168.1.2 maysam-laptop
I also set the IPs as above on both machines. Any help will be appreciated. |
Try to run
    mpirun --hostfile host.txt -np 8 icoFoam -parallel
in the folder of the problem, for example
    maysam@maysam-desktop:~/OpenFOAM/maysam-1.7.0/run/tutorials/multiphase/bubblecolumn
or
    Openfoam/maysam-1.7.0/run/tutorials/icofoam/cavity
I think the complete path is missing. |
The icotest folder is the cavity case, and I was in that directory when I ran the command.
If I list only the CPUs of one machine in host.txt, the run starts, but when the file includes both machines the error occurs. So it is a network problem. |
I can access the other PC with "ssh 192.168.1.1" in a terminal.
|
Is there an identical machines file, with the machine names and number of processors, on the other machine?
|
The other PC has another icotest folder in ~/OpenFOAM/maysam-1.7.0/run/icotest.
host.txt is in that folder. |
Are the solver and tutorial paths the same on all PCs?
Did you notice that the line is .<space>/opt... ? It should work. |
Quote:
I installed OpenFOAM on my laptop from the Ubuntu pack from openfoam.com and on the desktop from the source pack, so the software is in /opt/openfoam170 on the laptop and in /home/maysam/openfoam1-7-0 on the desktop. Is that what you mean? Quote:
|
Maybe this is your problem!
Usually the solver path is /home/user/openfoam/user-1.7.0/applications/solvers/multiphase/bubblefoam and the tutorial path is usually /home/user/openfoam/user-1.7.0/run/tutorials/multiphase/bubblefoam/case. It must be something like this on all machines of the cluster. |
Try to install OpenFOAM from the Ubuntu pack on all machines. I think it will work.
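To see quickly whether the solver lives in the same place on every machine, something like the sketch below can be used. It assumes you can ssh to each host without being asked for a password and that the OpenFOAM bashrc line is the first line of ~/.bashrc as described earlier in the thread, so that a non-interactive ssh session can find the solver. The names solver_paths and all_same_path are made up for illustration.

```shell
#!/bin/sh
# For every host in the machines file, print "host path-to-solver".
# The path is empty if the solver is not found on that host.
#   $1 = machines file, $2 = solver name (e.g. icoFoam)
solver_paths() {
    machines=$1
    solver=$2
    for h in $(awk 'NF > 0 && $1 !~ /^#/ { print $1 }' "$machines"); do
        echo "$h $(ssh "$h" "which $solver" 2>/dev/null)"
    done
}

# Reads "host path" lines on stdin. Succeeds only if every host reported
# a path and all paths are the same.
all_same_path() {
    awk '
        NR == 1     { first = $2 }
        $2 == ""    { differ = 1 }    # solver not found on that host
        $2 != first { differ = 1 }    # installed somewhere else
        END         { exit (differ ? 1 : 0) }
    '
}
```

For example, solver_paths host.txt icoFoam | all_same_path fails when one machine uses /opt/openfoam170 and the other a build under the home folder, which is the situation described above.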
|
Will it work in 1.5-dev?
Will this procedure work if substituting the name properly in OF1.5-dev?
I would try it myself, but I am waiting on my ethernet switch to come in. And I am looking for workarounds if there are issues. |
You Sir saved my day!!!! Thank you very much!!!!!!!!
|
Hi
I have been stuck on a problem with the simulation of a solidification problem on multiple processors. At the pressure reference cell (pRefCell) I get very strange behavior. Here is the liquid fraction contour when the simulation is run on multiple processors: Attachment 22780

Similar problems have been reported in:
(1) Poisson eq w setReference works serial diverges in parallel
    http://www.cfd-online.com/Forums/ope...-parallel.html
(2) interfoam blows up on parallel run
    http://www.cfd-online.com/Forums/ope...allel-run.html
(3) temperature anomaly at pressure reference cell
    http://www.cfd-online.com/Forums/ope...ence-cell.html

What has been suggested in these posts is (1) using the GAMG solver instead of PCG as the pressure solver in parallel runs, and (2) adjusting the fluxes after including the buoyancy term. I have applied both suggestions, although the second one does not fully make sense to me, but I still have the problem. Please let me know if you have any comments. |