CFD Online Discussion Forums


florencenawei October 6, 2011 22:02

MPI error
 
Foamers, I compiled OpenFOAM on RHEL 5.5, but when I run the case in parallel, it shows the error below. I ran the same case on my PC with 4 CPUs and everything was OK. Waiting for your help; I have spent almost one month on this.


This is my error message:

mgmt.18168ipath_wait_for_device: The /dev/ipath device failed to appear after 30.0 seconds: Connection timed out
mgmt.18168PSM Could not find an InfiniPath Unit on device /dev/ipath (30s elapsed) (err=21)
--------------------------------------------------------------------------
PSM was unable to open an endpoint. Please make sure that the network link is
active on the node and the hardware is functioning.

Error: PSM Could not find an InfiniPath Unit
--------------------------------------------------------------------------
mgmt.18165ipath_wait_for_device: The /dev/ipath device failed to appear after 30.0 seconds: Connection timed out
mgmt.18165PSM Could not find an InfiniPath Unit on device /dev/ipath (30s elapsed) (err=21)
mgmt.18166ipath_wait_for_device: The /dev/ipath device failed to appear after 30.0 seconds: Connection timed out
mgmt.18166PSM Could not find an InfiniPath Unit on device /dev/ipath (30s elapsed) (err=21)
mgmt.18167ipath_wait_for_device: The /dev/ipath device failed to appear after 30.0 seconds: Connection timed out
mgmt.18167PSM Could not find an InfiniPath Unit on device /dev/ipath (30s elapsed) (err=21)
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

PML add procs failed
--> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[mgmt:18168] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[mgmt:18165] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[mgmt:18166] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[mgmt:18167] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 18166 on
node mgmt exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[mgmt:18164] 7 more processes have sent help message help-mtl-psm.txt / unable to open endpoint
[mgmt:18164] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[mgmt:18164] 3 more processes have sent help message help-mpi-runtime / mpi_init:startup:internal-failure

Thanks, Florence
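
The log above reports that the /dev/ipath device never appeared on node mgmt. A quick way to confirm this on each node is to check whether that device file exists at all. The following is only a minimal sketch: the function name is hypothetical, and the default path is simply the one quoted in the error message.

```python
import os


def has_infinipath_device(device_path="/dev/ipath"):
    """Return True if the device file named in the PSM error exists.

    The default path is copied from the error log; the function name is
    a hypothetical helper, not part of OpenFOAM or Open MPI.
    """
    return os.path.exists(device_path)


if __name__ == "__main__":
    if has_infinipath_device():
        print("/dev/ipath is present on this node")
    else:
        print("/dev/ipath is missing; PSM cannot open an endpoint here")
```

If the device is missing on the node where the job is launched, the PSM transport cannot work there, and the likely options are to fix the hardware or driver setup or to tell MPI not to use that transport.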

wyldckat October 9, 2011 07:37

Greetings Florence,

This is very little information to work with. So, instead of asking you for more information about your system, I'm just going to try and point you in the right direction:
  1. Check the following posts for ideas to help you to figure out what the problem really is: http://www.cfd-online.com/Forums/ope...tml#post326351 posts #8 and #10.
  2. Post #21 on the thread "decomposed case to 2-cores (Not working)" shows how to avoid using certain network connections, which might be useful for your case as well.
  3. For even more information about running OpenFOAM in parallel, here is a blog post where I'm keeping "Notes about running OpenFOAM in parallel".
Best regards and good luck!
Bruno

florencenawei October 9, 2011 22:46

Quote:

Originally Posted by wyldckat (Post 327201)

wyldckat, thanks very much. Actually, I know you are an expert on running OpenFOAM in parallel; I have seen a lot of your posts on the forums. I am really a new Foamer. I spent almost a month compiling OpenFOAM on RHEL. Thanks.

florencenawei October 10, 2011 01:21

Quote:

Originally Posted by wyldckat (Post 327201)

Thanks, Bruno, the second item works! Thanks very much. Although I am not a native English speaker, I wish I could get your email address.
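
The thread does not reproduce post #21, so the exact options that worked are not shown here. As an assumption, one common way to avoid a particular interconnect with Open MPI is to exclude the corresponding component through MCA parameters on the mpirun command line. The sketch below builds such a command: the helper name is hypothetical, simpleFoam and the process count are placeholders, and excluding the PSM MTL with "^psm" is a guess based on the help-mtl-psm.txt messages in the error log. The orte_base_help_aggregate parameter is the one the log itself suggests for seeing all help messages.

```python
import shlex


def build_mpirun_command(solver, n_procs, exclude_psm=True, show_all_help=False):
    """Build an mpirun argument list for a parallel OpenFOAM run.

    Hypothetical helper for illustration only; the MCA settings are an
    assumption about what "avoiding a network connection" might mean here.
    """
    cmd = ["mpirun", "-np", str(n_procs)]
    if exclude_psm:
        # "^psm" asks Open MPI to consider every MTL component except PSM.
        cmd += ["--mca", "mtl", "^psm"]
    if show_all_help:
        # Parameter named in the error log: disables help message aggregation.
        cmd += ["--mca", "orte_base_help_aggregate", "0"]
    # OpenFOAM solvers take -parallel when run on a decomposed case.
    cmd += [solver, "-parallel"]
    return cmd


if __name__ == "__main__":
    # Print a shell-quoted command line that can be copied into a terminal.
    print(shlex.join(build_mpirun_command("simpleFoam", 4)))
```

Depending on the cluster, other components may need to be excluded or selected instead, so the parameters above should be checked against the installed Open MPI version before relying on them.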


All times are GMT -4.