CFD Online Discussion Forums (http://www.cfd-online.com/Forums/)
-   OpenFOAM Running, Solving & CFD (http://www.cfd-online.com/Forums/openfoam-solving/)
-   -   Open Mpi on AMD 64 cluster (http://www.cfd-online.com/Forums/openfoam-solving/59175-open-mpi-amd-64-cluster.html)

nishant_hull February 4, 2008 08:13

Hi all,

I am trying to run an interFoam simulation with mpirun on 4 processors, and I am getting the error below.
I have password-free access to all of the nodes, and from my PC I can see the same set-up files on every compute node (I did not copy them to each node individually, though).
The error I am receiving is as follows:
e343880@comp02:~/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam> mpirun --hostfile machines -np 4 interFoam . dam-dumy -parallel
MPI Pstream initialized with:
floatTransfer : 1
nProcsSimpleSum : 0
scheduledTransfer : 0

/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  1.4.1                                 |
|   \\  /    A nd           | Web:      http://www.openfoam.org               |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
[2] Date : Feb 04 2008
[2] Time : 12:00:38
[2] Host : comp00
[2] PID : 22833
[3] Date : Feb 04 2008
[3] Time : 12:00:38
[3] Host : comp00
[3] PID : 22834

Exec : interFoam . dam-dumy -parallel
[0] Date : Feb 04 2008
[0] Time : 12:00:38
[0] Host : kittyhawk
[0] PID : 30155
[1] Date : Feb 04 2008
[1] Time : 12:00:38
[1] Host : kittyhawk
[1] PID : 30156
[1] Root : /users/e343880/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam
[1] Case : dam-dumy
[1] Nprocs : 4
[1]
[1]
[1] --> FOAM FATAL IO ERROR : cannot open file
[1]
[1] file: /users/e343880/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam/dam-dumy/processor1/system/controlDict at line 0.
[1]
[1] From function regIOobject::readStream(const word&)
[1] in file db/regIOobject/regIOobjectRead.C at line 66.
[1]
FOAM parallel run exiting
[1]
[kittyhawk:30156] MPI_ABORT invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 1
mpirun noticed that job rank 0 with PID 30155 on node kittyhawk.dcs.hull.ac.uk exited on signal 15 (Terminated).
2 additional processes aborted (not shown)

My machine file is attached; its contents are posted further down in this thread.

Please suggest!

regards,
Nishant
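
For reference, a typical decompose-and-launch sequence on OpenFOAM 1.4.x looks roughly like the sketch below. The decomposePar call follows the same root/case argument convention as the interFoam command quoted above, the paths are taken from the log, and a system/decomposeParDict with numberOfSubdomains 4 is assumed to exist in the case; treat this as a sketch, not a confirmed recipe for this installation.

cd /users/e343880/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam
# split the case into processor0..processor3 according to system/decomposeParDict
decomposePar . dam-dumy
# confirm the decomposed directories exist before launching
ls -d dam-dumy/processor*
# launch four ranks with the same hostfile as above
mpirun --hostfile machines -np 4 interFoam . dam-dumy -parallel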

nishant_hull February 4, 2008 08:15

The machine file is:

kittyhawk.dcs.hull.ac.uk slots=2 max-slots=2
comp00.dcs.hull.ac.uk slots=4 max-slots=4
comp01.dcs.hull.ac.uk slots=4 max-slots=4
comp02.dcs.hull.ac.uk slots=4 max-slots=4
comp03.dcs.hull.ac.uk slots=4 max-slots=4
comp04.dcs.hull.ac.uk slots=4 max-slots=4
comp05.dcs.hull.ac.uk slots=4 max-slots=4
comp06.dcs.hull.ac.uk slots=4 max-slots=4
comp07.dcs.hull.ac.uk slots=4 max-slots=4
comp08.dcs.hull.ac.uk slots=4 max-slots=4
comp09.dcs.hull.ac.uk slots=4 max-slots=4
comp10.dcs.hull.ac.uk slots=4 max-slots=4
comp11.dcs.hull.ac.uk slots=4 max-slots=4
comp12.dcs.hull.ac.uk slots=4 max-slots=4
comp13.dcs.hull.ac.uk slots=4 max-slots=4
comp14.dcs.hull.ac.uk slots=4 max-slots=4
comp15.dcs.hull.ac.uk slots=4 max-slots=4
comp16.dcs.hull.ac.uk slots=4 max-slots=4
comp17.dcs.hull.ac.uk slots=4 max-slots=4
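
With this hostfile and -np 4, Open MPI's default by-slot placement should fill kittyhawk's two slots first and then spill onto comp00, which matches the host names in the log above (ranks 0 and 1 on kittyhawk, ranks 2 and 3 on comp00). A quick, OpenFOAM-independent way to confirm the placement is to launch a trivial command the same way:

# each rank prints the host it landed on; expect kittyhawk twice and comp00 twice
mpirun --hostfile machines -np 4 hostname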

mighelone February 4, 2008 09:22

Nishant!
If you don't have a shared filesystem (NFS), you have to copy each processorN directory to the local filesystem of the node that will run that rank (processor2 to the rank-2 node, and so on).

Michele
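
If the nodes do not share a filesystem, the copy Michele describes might look something like the sketch below. The rsync invocation is an illustrative assumption; the case path and the rank-to-host mapping come from the log above, and the parent directory tree is assumed to already exist on comp00.

CASE=/users/e343880/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam/dam-dumy
# ranks 2 and 3 ran on comp00, so that node needs processor2 and processor3 locally
rsync -a "$CASE/processor2" "$CASE/processor3" comp00.dcs.hull.ac.uk:"$CASE/"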

nishant_hull February 4, 2008 09:47

Thanks for the reply Michele,
Well, whenever I ssh to the other nodes of the cluster, I can see the same files that I see on the master node. If that is what NFS means, then I guess I do have an NFS setup here.
But I would like to know where I went wrong.

Nishant
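
One way to check whether the nodes really mount the same case directory (rather than just similar-looking home directories) is to list the decomposed case from each host over ssh. A sketch, using the path and host names quoted in the log:

CASE=/users/e343880/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam/dam-dumy
for h in kittyhawk comp00 comp02; do
    echo "== $h =="
    # quote the remote command so the glob is expanded on the remote side
    ssh "$h" "ls -d $CASE/processor*"
done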

nishant_hull February 4, 2008 13:16

After some modifications I am getting this error:

mpirun --hostfile machines -np 4 interFoam . dam-dumy -parallel
MPI Pstream initialized with:
[2] Date : Feb 04 2008
[2] Time : 17:13:39
[2] Host : comp00
[2] PID : 25855
floatTransfer : 1
nProcsSimpleSum : 0
scheduledTransfer : 0

/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  1.4.1                                 |
|   \\  /    A nd           | Web:      http://www.openfoam.org               |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/

Exec : interFoam . dam-dumy -parallel
[0] Date : Feb 04 2008
[0] Time : 17:13:39
[0] Host : kittyhawk
[0] PID : 28172
[0] Root : /users/e343880/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam
[0] Case : dam-dumy
[0] Nprocs : 4
[0] Slaves :
[0] 3
[0] (
[0] kittyhawk.28173
[0] comp00.25855
[0] comp00.25856
[0] )
[0]
[3] Date : Feb 04 2008
[3] Time : 17:13:39
[0]
[0]
[0] --> FOAM FATAL ERROR : interFoam: Cannot open case directory "/users/e343880/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam/dam-dumy/processor0"
[0]
[0]
FOAM parallel run exiting
[0]
[kittyhawk:28172] MPI_ABORT invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 1
[3] Host : comp00
[1] Date : Feb 04 2008
[1] Time : 17:13:39
[1] Host : kittyhawk
[1] PID : 28173
[1] Root : /users/e343880/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam
[1] Case : dam-dumy
[1] Nprocs : 4
[3] PID : 25856
[1]
[1]
[1] --> FOAM FATAL IO ERROR : cannot open file
[1]
[1] file: /users/e343880/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam/dam-dumy/processor1/system/controlDict at line 0.
[1]
[1] From function regIOobject::readStream(const word&)
[1] in file db/regIOobject/regIOobjectRead.C at line 66.
[1]
FOAM parallel run exiting
[1]
[kittyhawk:28173] MPI_ABORT invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 1
[2] Root : /users/e343880/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam
[2] Case : dam-dumy
[2] Nprocs : 4
[3] Root : /users/e343880/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam
[3] Case : dam-dumy
[3] Nprocs : 4
[2]
[2]
[2] --> FOAM FATAL IO ERROR : cannot open file
[2]
[2] file: /users/e343880/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam/dam-dumy/processor2/system/controlDict at line 0.
[2]
[2] From function regIOobject::readStream(const word&)
[2] in file db/regIOobject/regIOobjectRead.C at line 66.
[2]
FOAM parallel run exiting
[2]
[comp00:25855] MPI_ABORT invoked on rank 2 in communicator MPI_COMM_WORLD with errorcode 1
[3]
[3]
[3] --> FOAM FATAL IO ERROR : cannot open file
[3]
[3] file: /users/e343880/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam/dam-dumy/processor3/system/controlDict at line 0.
[3]
[3] From function regIOobject::readStream(const word&)
[3] in file db/regIOobject/regIOobjectRead.C at line 66.
[3]
FOAM parallel run exiting
[3]
[comp00:25856] MPI_ABORT invoked on rank 3 in communicator MPI_COMM_WORLD with errorcode 1
mpirun noticed that job rank 1 with PID 28173 on node kittyhawk.dcs.hull.ac.uk exited on signal 1 (Hangup).


Please suggest!


regards..

Nishant

andrewburns February 4, 2008 19:20

I too have just started playing with parallel runs and I'm getting an error very similar to this (just for my own case):

[1] file: /users/e343880/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam/dam-dumy/processor1/system/controlDict at line 0.

I've looked, and none of the processor folders in the case folder contain a system/controlDict, even though decomposePar ran correctly and the case is, I believe, set up correctly.
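
If decomposePar really did run but left no system/controlDict under the processor directories, one workaround that is sometimes suggested for 1.4.x installations (an assumption here, not something confirmed in this thread) is to give each processor directory a symlink back to the case-level system directory:

cd /users/e343880/OpenFOAM/e343880-1.4.1/run/tutorials/interFoam/dam-dumy
for d in processor*; do
    # point processorN/system at the top-level system directory of the case
    ln -s ../system "$d/system"
done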

