June 5, 2009, 04:13 |
Parallel & hostfile
|
#1 |
Super Moderator
Maxime Perelli
Join Date: Mar 2009
Location: Switzerland
Posts: 3,297
Rep Power: 41 |
hello,
I am trying to set up a parallel calculation, but I have run into one issue.

I have 2 PCs connected via Ethernet (master at 192.168.0.1 and node at 192.168.0.14).

ssh works without having to enter a password:
from master --> "ssh node ls" is ok
from node --> "ssh node ls" is ok

From master I export /home/user/OpenFOAM; from node I mount it without problem.
I ran the helloworld example successfully.
OF is installed on master (under /home/user), and it runs successfully in serial mode.
I can decompose a model into 2 subdomains without problem.

I created the "machines" file as described in the doc, and this is where the trouble starts. "machines" looks like
192.168.0.1
192.168.0.14

If I run my model in parallel
>mpirun --hostfile machines simpleFoam -parallel > log &
I get the error message "connect() failed with errno=113"

Now, if I switch the order of the IP addresses in machines to
192.168.0.14
192.168.0.1
it runs... (the log shows that the host is node and the slave is master)

Any idea?
Thanks a lot in advance
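For anyone hitting the same message: the errno=113 in the error above is an operating-system error number, and on Linux it normally corresponds to EHOSTUNREACH ("No route to host"), which in practice often points at a firewall or routing problem in one direction. The sketch below is only a diagnostic aid under that assumption, not part of OpenFOAM or Open MPI. The helper names `describe_errno` and `can_connect` are made up for illustration, and which port to probe is up to you (ssh working on port 22 does not prove that the TCP ports chosen by MPI are open).

```python
import errno
import os
import socket


def describe_errno(code):
    """Return 'NAME: message' for an OS error number on this platform."""
    # errno.errorcode maps numbers to symbolic names; the mapping is
    # platform specific, so unknown numbers fall back to "UNKNOWN".
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

Calling `describe_errno(113)` on each machine shows what the number means there, and running `can_connect("192.168.0.14", port)` on master and `can_connect("192.168.0.1", port)` on node, against a port where something is listening, tests each direction separately. That matters here because the run only fails for one ordering of the hosts.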
__________________
In memory of my friend Hervé: CFD engineer & freerider |
|
June 18, 2009, 16:54 |
|
#2 |
New Member
Klaus Rädecke
Join Date: Jun 2009
Location: Rüsselsheim, Germany
Posts: 9
Rep Power: 16 |
Hello, I have a similar problem using OpenMPI for the lesCavitatingFoam tutorial.

I have two different machines for OpenFOAM 1.5:
1. foam-8: Suse 10.3 64bit 4GB gcc 4.2.1, OpenFoam installed from binary dp64 distribution
2. foam-9: Ubuntu Studio 8.04 4GB 32bit gcc 4.3.1, OpenFoam installed from binary dp distribution

Both installations pass the foamInstallationTest (foam-8 has a gcc issue, never mind?). Maybe you could check this too, -mAx-?

From both machines, ssh commands can be issued to both machines without entering a password.
/home/rae/OpenFOAM/rae-1.5/ is an NFS share provided by foam-9 to foam-8.

Running:
mpirun --hostfile system/machines -np 4 lesCavitatingFoam -case /home/rae/OpenFOAM/rae-1.5/tutorials/lesCavitatingFoam/throttle3D -parallel
gives the following results, depending on the machines file:

1. system/machines contains the submitting machine name only: 4 processes run on 1 host (2 cores) successfully.
2. system/machines contains:
foam-8 cpu=2
foam-9 cpu=2
where foam-8 is the submitting host. Then orted starts up immediately on both hosts. After a very long time, two lesCavitatingFoam processes execute on each machine, but with no CPU load, and finally the error report appears:

/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  1.5                                   |
|   \\  /    A nd           | Web:      http://www.OpenFOAM.org               |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Exec : lesCavitatingFoam -case /home/rae/OpenFOAM/rae-1.5/tutorials/lesCavitatingFoam/throttle3D -parallel
Date : Jun 18 2009
Time : 18:27:33
Host : foam-8
PID : 11518
[1]
[1]
[1] Expected a ')' or a '}' while reading List, found on line 0 the word 'o'
[1]
[1] file: IOstream at line [3] 0.
[1]
[1] From function Istream::readEndList(const char*)
[1] in file db/IOstreams/IOstreams/Istream.C [3] [3] at line 159.
[1] FOAM parallel run exiting
[1] Expected a ')' or a '}' while reading List, found on line 0 the word 'o'
[3]
[foam-9:10614] MPI_ABORT invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 1
[3] file: IOstream at line 0.
[3]
[3] From function Istream::readEndList(const char*)
[3] in file db/IOstreams/IOstreams/Istream.C at line 159.
[3] FOAM parallel run exiting
[3]
[foam-9:10615] MPI_ABORT invoked on rank 3 in communicator MPI_COMM_WORLD with errorcode 1
mpirun noticed that job rank 0 with PID 11518 on node foam-8 exited on signal 15 (Terminated).
1 additional process aborted (not shown)
---------

If I omit the "-parallel", 4 processes run as expected, but I guess they all run the same thing. So mpirun does its job correctly?
Does this description fit your experience? Any ideas?
Thanks |
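As a sanity check on the machines file in this post, here is a small sketch that reads hostfile-style lines and compares the total number of slots with the -np value. It is not Open MPI's own parser. It assumes one host per line followed by optional key=value settings, treats "cpu=" and "slots=" as equivalent counts, and assumes a host without a count contributes one slot. The function names `parse_hostfile` and `enough_slots` are hypothetical.

```python
def parse_hostfile(text):
    """Parse 'host [key=value ...]' lines into a {host: slots} dict."""
    hosts = {}
    for raw in text.splitlines():
        # Drop trailing comments and skip blank lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, *options = line.split()
        slots = 1  # assumption: no explicit count means one slot
        for opt in options:
            key, _, value = opt.partition("=")
            if key in ("cpu", "slots"):  # assumption: both mean a slot count
                slots = int(value)
        # A host listed more than once accumulates its slots.
        hosts[name] = hosts.get(name, 0) + slots
    return hosts


def enough_slots(hosts, np):
    """Return True if the hostfile provides at least np slots in total."""
    return sum(hosts.values()) >= np
```

With the two-line file shown above, `parse_hostfile` gives 2 slots each for foam-8 and foam-9, so `enough_slots(hosts, 4)` holds and the -np 4 request matches the file. This only confirms that the process count is consistent with the hostfile; it says nothing about the stream-reading error itself.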
|
June 19, 2009, 01:43 |
|
#3 |
Super Moderator
Maxime Perelli
Join Date: Mar 2009
Location: Switzerland
Posts: 3,297
Rep Power: 41 |
my problem is "solved".
I don't know why I had this problem, but now it runs perfectly. Tests are running on 8 machines without problem. For info, I don't install OF on each machine; I share the OF installation over NFS. So, as you mentioned that both machines passed the foamInstallationTest, you may do it only for the master (as you use NFS too).
|