Two computer cluster - problems with settings

#1   September 3, 2016, 20:16
Darko Radenkovic (dradenkovic), Member
Join Date: Oct 2015
Posts: 38
Hello.

I believe I have read every post on this forum about creating a small cluster, but I have again failed to set up my two computers as a cluster.

In my case the server hostname is darko and the client hostname is dradenkovic. Both computers have the same username, dradenkovic. I used NFS. SSH worked without problems.
I defined the addresses of both computers in /etc/hosts on each of them.
Settings on the server:
--------------------------------------------------------------------------------------------------
/etc/fstab:
/home/dradenkovic/OpenFOAM /export/OpenFOAM none bind 0 0
#(path of the OpenFOAM installation)
-----------------------------------------------------------------------------------------------------
/etc/exports:
/export/OpenFOAM <client address>(rw,nohide,insecure,no_subtree_check,async)
----------------------------------------------------------------------------------------------------

Settings on the client computer:
---------------------------------------------------------------
/etc/fstab:
darko:/export/OpenFOAM /home/dradenkovic/ nfs 0 0

#(Here I tried various settings, none of them worked)
---------------------------------------------------------------
Sometimes there seems to be a conflict, and I can't even log in to my client computer.

From everything I have read, I believe I need to create an identical file structure on both computers. I still have to check this.

If OpenFOAM is located at /home/dradenkovic/OpenFOAM on the server, does it have to be at the same location on the client (i.e. should the path on the client also be /home/dradenkovic/OpenFOAM, in my case)? If so, what should I change in the settings above?

Is it possible to mount the OpenFOAM directory in some other folder, for example /mnt? I ask because all day, whenever I tried to run an example case, I got this error:
Code:
--------------------------------------------------------------------------
mpirun was unable to find the specified executable file, and therefore
did not launch the job.  This error was first reported for process
rank 12; it may have occurred for other processes as well.

NOTE: A common cause for this error is misspelling a mpirun command
      line parameter option (remember that mpirun interprets the first
      unrecognized command line token as the executable).

Node:       dradenkovic
Executable: /home/dradenkovic/OpenFOAM/OpenFOAM-dev/platforms/linux64GccDPInt32Opt/bin/pisoFoam
--------------------------------------------------------------------------
12 total processes failed to start
I have 12 cores per computer. The node dradenkovic in the message above is the client node.
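
A hedged reading of the settings and the error above: the mpirun message looks for /home/dradenkovic/OpenFOAM/OpenFOAM-dev/... on the client, but the client fstab line mounts the export over the whole home directory /home/dradenkovic/. With that mount the installation would appear at /home/dradenkovic/OpenFOAM-dev/... instead, and the client's own home directory (including .bashrc and .ssh) would be hidden, which may explain the login trouble. The line also has five fields where fstab expects six (the mount options are missing). A minimal sketch of a client entry that keeps the server's path, assuming the directory /home/dradenkovic/OpenFOAM already exists on the client and that the option "defaults" is acceptable:

```
# /etc/fstab on the client (sketch; "defaults" is an assumed option choice)
# <server>:<export>       <mount point>                <type>  <options>  <dump>  <pass>
darko:/export/OpenFOAM    /home/dradenkovic/OpenFOAM   nfs     defaults   0       0
```

Mounting somewhere else, such as /mnt, is possible as far as NFS is concerned, but the message above shows mpirun looking for the executable under the server's absolute path on the client, so the two paths would then no longer match.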

Here is my decomposeParDict:

Code:
/*--------------------------------*- C++ -*----------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  dev                                   |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    location    "system";
    object      decomposeParDict;
}
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

numberOfSubdomains 24;

method          simple;

simpleCoeffs
{
    n               (24 1 1);
    delta           0.001;
}


   
hierarchicalCoeffs
{
    n               (1 1 1);
    delta           0.001;
    order           xyz;
}

manualCoeffs
{
    dataFile        "";
}

distributed     no;
  
roots           ( );
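
For the simple method, the product of the entries of n has to equal numberOfSubdomains, and that number in turn has to match the -np value given to mpirun (24 here, for 12 cores on each of two computers). A small sketch for checking this from the shell; subdomains and simple_product are hypothetical helper names, not OpenFOAM tools, and the parsing assumes the plain layout of the dictionary above:

```shell
#!/bin/sh
# Print the value of numberOfSubdomains from a decomposeParDict file.
subdomains() {
    sed -n 's/^[[:space:]]*numberOfSubdomains[[:space:]][[:space:]]*\([0-9][0-9]*\)[[:space:]]*;.*/\1/p' "$1" |
        head -n 1
}

# Print the product of the three integers in the first "n (a b c);" entry.
# In the dictionary above, the first such entry belongs to simpleCoeffs.
simple_product() {
    sed -n 's/^[[:space:]]*n[[:space:]]*(\([0-9][0-9]*\) \([0-9][0-9]*\) \([0-9][0-9]*\));.*/\1 \2 \3/p' "$1" |
        head -n 1 | { read -r a b c; echo $((a * b * c)); }
}
```

Running both helpers on system/decomposeParDict and comparing the two numbers with the -np argument is a quick way to rule out a mismatch before looking at the cluster setup.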
Any advice is appreciated.

Best regards,
Darko

#2   November 9, 2017, 03:35
chengan.wang (wangchengan2003), Member
Join Date: Jan 2016
Location: china
Posts: 47
Hello Darko,
I ran into a similar problem.
I tried to run the icoFoam case on two nodes, debian1 (16 CPUs) and debian2 (16 CPUs). In the end I got these errors:

tanjianyu@debian1:~/OpenFOAM/tanjianyu-v1706/run/tutorials/incompressible/icoFoam/cavity/cavity$ tail -f log
rank 16; it may have occurred for other processes as well.

NOTE: A common cause for this error is misspelling a mpirun command line parameter option (remember that mpirun interprets the first
unrecognized command line token as the executable).

Node: debian2
Executable: /home/tanjianyu/OpenFOAM/OpenFOAM-v1706/platforms/linux64GccDPInt32Opt/bin/icoFoam
--------------------------------------------------------------------------
16 total processes failed to start

Have you solved it? Could you give me some advice?
Best regards,
Chengan

#3   November 9, 2017, 05:19
Darko Radenkovic (dradenkovic)
Hello Chengan,

I solved the problem, but not in an elegant way. Forgive me if I have missed something; it has been a while since I solved this.

1. I did not use NFS; I gave up on that. I set up two identical systems (same files at the same paths), with the same user names.
2. For some reason, in .bashrc you have to put
source $HOME/OpenFOAM/OpenFOAM-dev/etc/bashrc
at the beginning of the file instead of at the end, on both computers.
3. You can define the IP addresses in /etc/hosts, with the first one being
127.0.0.1
(I am not sure about the part in bold letters; check somewhere else, and if you find out, correct me in another post.)
4. Define the number of cores in the file machines.
5. Run mpirun ...
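
For step 4, a sketch of what the machines file could look like in Open MPI's hostfile format, using the two hostnames and the 12 cores per computer from the first post. The helper total_slots is a hypothetical name; it only adds up explicit slots= entries, so that its result can be compared with the -np value passed to mpirun:

```shell
#!/bin/sh
# Example "machines" hostfile (Open MPI format), one line per node:
#   darko        slots=12
#   dradenkovic  slots=12

# Sum the explicit slots=N entries of a hostfile, skipping comments and
# blank lines. Hosts listed without slots= are not counted in this sketch.
total_slots() {
    awk '
        /^[ \t]*#/ || NF == 0 { next }
        {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^slots=[0-9]+$/)
                    total += substr($i, 7)
        }
        END { print total + 0 }
    ' "$1"
}
```

With the two example lines above, total_slots machines reports the 24 slots that -np 24 would need.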

Again, not elegant, but it works.

Good luck,
Darko

#4   November 10, 2017, 04:59
chengan.wang (wangchengan2003)
Dear Darko,
Thank you very much for your help.
Like you, I actually have two identical systems (same files at the same paths), with the same user names. I tried your method, but it doesn't work on my two computers.

Could this be a problem with a particular OpenFOAM version? There was no problem with MPI or NFS.

Thank you again.

Best regards,
Chengan

#5   November 10, 2017, 07:21
Darko Radenkovic (dradenkovic)
Chengan,

I use this approach and it works. I don't believe the error depends on the OpenFOAM version, but if we don't succeed, you can try a different OpenFOAM version.

Try to run some other tutorial case in parallel, for example channel395.

Can you copy files from one computer to the other without a password? Are the paths, OpenFOAM files, case files and user names absolutely identical, and is OpenFOAM sourced in the first line of .bashrc on both computers?

Can you show your /etc/hosts and machines files, the command you use to start the job, and the complete error log?
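
The first two questions can be checked from the shell in one go. This is only a sketch: check_node is a hypothetical helper name, and it assumes that mpirun reaches the other node through SSH, so that a non-interactive SSH session sees the same environment as the remote MPI processes would.

```shell
#!/bin/sh
# Check that a remote node accepts a passwordless login and can find a
# solver in a non-interactive SSH session.
# Usage: check_node <host> <solver>
check_node() {
    local host solver remote_path
    host="$1"
    solver="$2"
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if ! remote_path=$(ssh -o BatchMode=yes "$host" "command -v $solver"); then
        echo "FAIL $host: no passwordless login, or $solver not on PATH"
        return 1
    fi
    echo "OK $host: $remote_path"
}
```

For example, check_node ubuntu2 icoFoam should print the same absolute path that the solver has on the first computer; a FAIL line points to either the SSH keys or the place where OpenFOAM is sourced in .bashrc.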



Regards,
Darko

#6   November 11, 2017, 02:13
chengan.wang (wangchengan2003)
Dear Darko

I can copy files from one computer to the other without a password. Absolutely everything is identical on the two nodes.

The /etc/hosts file is:
Code:
127.0.0.1    localhost
10.246.251.4    ubuntu1 
10.246.251.5    ubuntu2

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
The command is:
Code:
#!/bin/sh
cd ${0%/*} || exit 1    # Run from this directory

# Source tutorial run functions
. $WM_PROJECT_DIR/bin/tools/RunFunctions

runApplication blockMesh

runApplication decomposePar

mpirun --hostfile machines -np 32 $(getApplication) -parallel > log & 
  
runApplication reconstructPar
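
One detail of the script above: the trailing & puts mpirun in the background, so runApplication reconstructPar starts immediately rather than after the solver has finished. A minimal sketch of the same sequence with the solver in the foreground; solve_then_reconstruct is a hypothetical helper name, the log file names are assumptions, and reconstructPar still needs all processor directories to be visible on the machine where it runs:

```shell
#!/bin/sh
# Run the solver in the foreground, then reconstruct only if it succeeded.
# Usage: solve_then_reconstruct <hostfile> <nprocs> <solver>
solve_then_reconstruct() {
    local hostfile nprocs solver
    hostfile="$1"
    nprocs="$2"
    solver="$3"
    # No trailing '&': the function blocks here until mpirun returns.
    mpirun --hostfile "$hostfile" -np "$nprocs" "$solver" -parallel \
        > "log.$solver" 2>&1 || return 1
    reconstructPar > log.reconstructPar 2>&1
}
```

Called as solve_then_reconstruct machines 32 icoFoam, this mirrors the mpirun line of the script but also captures stderr in the log and skips reconstruction when the solver fails.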
The error log is:
Code:
/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  v1706                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.com                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : v1706
Arch   : "LSB;label=32;scalar=64"
Exec   : icoFoam -parallel
Date   : Nov 11 2017
Time   : 08:34:42
Host   : "ubuntu1"
PID    : 4808
Case   : /home/tanjianyu/OpenFOAM/tanjianyu-v1706/run/tutorials/incompressible/icoFoam/cavity/cavity
nProcs : 32
Slaves : 
31
(
"ubuntu1.4809"
"ubuntu1.4810"
"ubuntu1.4811"
"ubuntu1.4812"
"ubuntu1.4813"
"ubuntu1.4814"
"ubuntu1.4815"
"ubuntu1.4816"
"ubuntu1.4817"
"ubuntu1.4818"
"ubuntu1.4819"
"ubuntu1.4821"
"ubuntu1.4830"
"ubuntu1.4831"
"ubuntu1.4836"
"ubuntu2.22649"
"ubuntu2.22650"
"ubuntu2.22651"
"ubuntu2.22652"
"ubuntu2.22653"
"ubuntu2.22654"
"ubuntu2.22655"
"ubuntu2.22656"
"ubuntu2.22657"
"ubuntu2.22658"
"ubuntu2.22659"
"ubuntu2.22660"
"ubuntu2.22661"
"ubuntu2.22662"
"ubuntu2.22663"
"ubuntu2.22664"
)

Pstream initialized with:
    floatTransfer      : 0
    nProcsSimpleSum    : 0
    commsType          : nonBlocking
    polling iterations : 0
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 10)
allowSystemOperations : Allowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Create mesh for time = 0

--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 18 in communicator MPI_COMM_WORLD 
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
I have also attached the case at the end.

Thank you very much.

Chengan
Attached file: cavity.zip (9.8 KB)

#7   November 11, 2017, 08:41
Darko Radenkovic (dradenkovic)
Chengan,

Can you try it step by step? On one computer, in your cavity case, run blockMesh and then decomposePar. Then copy the whole case to the other computer, to the same path as on the first computer. Find out on which computer you need to source RunFunctions (I haven't done that, so I don't know). The case has to be identical (and decomposed) on both computers.

Then run the mpirun command:
mpirun --hostfile machines -np 32 icoFoam -parallel
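
The copy step described above can be sketched with rsync. The helper name sync_case is hypothetical, and the sketch assumes passwordless SSH to the other node and that the parent directory of the case already exists there:

```shell
#!/bin/sh
# Copy a decomposed case to the same absolute path on each listed host.
# Usage: sync_case <case_dir> <host> [<host> ...]
sync_case() {
    local case_dir host
    case_dir=$(cd "$1" && pwd) || return 1   # resolve to an absolute path
    shift
    for host in "$@"; do
        # -a preserves permissions and timestamps; the trailing slashes copy
        # the directory contents into the same path on the remote side.
        rsync -a "$case_dir/" "$host:$case_dir/" || return 1
    done
}
```

Run from the case directory after decomposePar, for example as sync_case . ubuntu2, so that the processor directories exist on both computers before mpirun starts.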

Regards,
Darko

#8   November 12, 2017, 01:41
chengan.wang (wangchengan2003)
Dear Darko

Thank you very much for your help. I finally succeeded!
I tried your method step by step. In the end I got 16 results on ubuntu1 and the other 16 results on ubuntu2, so I would have to copy them together and then run reconstructPar. So I tried NFS again and put
Code:
source $HOME/OpenFOAM/OpenFOAM-dev/etc/bashrc
in the first line of the .bashrc file. It works very well now.
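
A possible explanation for why the position of the source line matters: mpirun usually starts the remote processes through a non-interactive SSH session, and the default .bashrc on many distributions (Debian and Ubuntu among them) returns early for non-interactive shells, so anything placed after that guard is never executed on the remote node. A sketch of the top of such a .bashrc, assuming the installation path used in this thread:

```
# ~/.bashrc (top of file, sketch)

# Source OpenFOAM before the guard, so non-interactive SSH sessions get it too.
source $HOME/OpenFOAM/OpenFOAM-dev/etc/bashrc

# Typical distribution default: stop here if the shell is not interactive.
case $- in
    *i*) ;;
      *) return ;;
esac

# Interactive-only settings (prompt, aliases) follow.
```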

I still have a question. Could we write the command
Code:
mpirun --hostfile machines -np 32 $(getApplication) -parallel > log &
in the shell in a more elegant way, like
Code:
runParallel $(getApplication)
?

Thank you again!

Best regards

Chengan



#9   November 12, 2017, 05:23
Darko Radenkovic (dradenkovic)
Dear Chengan,

I am glad that you succeeded.

I don't know the answer to your question, but you can experiment now.

Regards,
Darko

#10   November 13, 2017, 04:16
Ricky (kera), Member
Join Date: Jul 2014
Location: Germany
Posts: 78
Hello Chengan,

If I understood your question correctly, there is a way of doing that.

In your .bashrc, add:

Code:
source $HOME/OpenFOAM/OpenFOAM-dev/etc/bashrc
source $WM_PROJECT_DIR/bin/tools/RunFunctions
and you can start using runParallel, runApplication, etc. from the terminal.

Or,

if you want your own self-defined function, you can start with this one in .bashrc and keep modifying it according to your needs:

Code:
runPar() { mpirun --hostfile "$1" -np "$2" "$3" -parallel > "log.$3";}
In your terminal, you can invoke this function as:

Code:
runPar machines 32 simpleFoam
Hope this helps!


Regards,
Ricky

Note: $WM_PROJECT_DIR is only defined after sourcing OpenFOAM.


Last edited by kera; November 13, 2017 at 07:12.
All times are GMT -4.