Problem running OpenFOAM 2.2.x in parallel on CentOS 5
Can you please help me with this problem?
I am trying to run OpenFOAM 2.2.x in parallel on a CentOS 5 supercomputer, using a very simple case from the tutorials called damBreak. When I type:
mpirun -np 4 interFoam -parallel
the following error pops up:
It looks like opal_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during opal_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):
opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
[saguaro1.local:22583] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 79
[saguaro1.local:22583] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file orterun.c at line 694 |
Hi, try passing these extra options to mpirun:
Code:
mpirun -np 4 -x LD_LIBRARY_PATH -x PATH -x WM_PROJECT_DIR -x WM_PROJECT_INST_DIR -x WM_OPTIONS -x FOAM_LIBBIN -x FOAM_APPBIN -x FOAM_USER_APPBIN -x MPI_BUFFER_SIZE interFoam -parallel
Also check that it's using the OpenFOAM mpirun and not the system mpirun. You can check this by sourcing the OF environment and then running:
Code:
which mpirun |
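The `which mpirun` check suggested above can be wrapped in a small helper so it is easy to repeat on every node. This is only a sketch: the function name `mpirun_is_under` is made up for illustration, and it assumes the OpenFOAM environment has already been sourced so that `WM_THIRD_PARTY_DIR` points at the ThirdParty directory.

```shell
# Hypothetical helper (not part of OpenFOAM or Open MPI).
# Succeeds only if the first mpirun found on PATH lives under the given directory.
# Usage: mpirun_is_under DIR
mpirun_is_under() {
    miu_prefix=$1
    [ -n "$miu_prefix" ] || return 2                         # no directory given
    miu_found=$(command -v mpirun 2>/dev/null) || return 2   # no mpirun on PATH
    case "$miu_found" in
        "$miu_prefix"/*) return 0 ;;   # mpirun comes from inside DIR
        *)               return 1 ;;   # some other mpirun is picked up first
    esac
}

# Example call, assuming the OpenFOAM environment is sourced:
#   mpirun_is_under "$WM_THIRD_PARTY_DIR" && echo "using the OpenFOAM mpirun"
```

A return value of 1 means a different mpirun (for example the system one) comes first on PATH; 2 means no directory was given or no mpirun was found at all.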
Hi Linnemann,
Thank you for your prompt response. Indeed, it is using the Open MPI mpirun from the OpenFOAM ThirdParty directory:
Code:
which mpirun
~/OpenFOAM/ThirdParty-2.2.x/platforms/linux64Gcc/openmpi-1.6.3/bin/mpirun
But I am still getting the same error when I do:
Code:
mpirun -np 4 -x LD_LIBRARY_PATH -x PATH -x WM_PROJECT_DIR -x WM_PROJECT_INST_DIR -x WM_OPTIONS -x FOAM_LIBBIN -x FOAM_APPBIN -x FOAM_USER_APPBIN -x MPI_BUFFER_SIZE interFoam -parallel
--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during opal_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):
opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
Thank you. Laura |
I personally think that this is a bug in the centfoam distribution rather than something specific to the supercomputer environment.
|
Greetings to all!
@Laura: Quote:
AFAIK, there are two possible solutions that should give the best results:
Bruno |
Hi
I installed a virtual machine and found the same problem with mpirun. To fix it you need to have the system gcc and gcc-c++ installed; otherwise you will get a compile error. So here comes the fix:
Code:
rm -rf $FOAM_EXT_LIBBIN/../../linux64Gcc/openmpi-1.6.3
I will have a centFoam version up with the fixes, but it will take some time to upload etc. |
Linnemann,
Thank you very much for your help. I followed your instructions and it worked for me as well. Best. Laura |
It seems I'm experiencing the same problem.
I have updated from 2.1.x to 2.2.x (git repository / Third Party 2.2.0) on Debian Wheezy, on the master node of a small 3-node cluster. So far everything works in parallel on the master node with its 8 cores, but I get the error described above as soon as I define the other nodes in the hostfile.
I hoped to get it fixed quickly by following linnemann's instructions, but unfortunately that did not help in my situation. The same error occurred again.
The installation is only on the central master node, and all paths are defined correctly to access the OpenFOAM executables on the master node. This worked fine with 2.1.x and previous installations. I have checked the path configuration for openmpi, and "which mpirun" returns the same correct path of the 2.2.x installation on all nodes. I also checked and confirmed that I can start OF executables on the master node from each node.
Any suggestions and help are highly welcome! Best Regards, Peter |
Greetings to all!
@Peter: I guess you ignored post #5 :rolleyes: More specifically, the second bullet point. I say this because, from your description, it looks like you've been using the Open-MPI versions that are distributed with OpenFOAM. The problem is that OpenFOAM 2.2 provides Open-MPI 1.6.3, which might act/build in a different way from the older versions. Therefore, if you use Debian's own Open-MPI version, you shouldn't have any more problems.
Since you're using Debian Wheezy, I think you can take a look at these instructions, which are directed towards installing OpenFOAM 2.2.0 on Ubuntu 12.04: http://openfoamwiki.net/index.php/In...u#Ubuntu_12.04 - more specifically steps #1, #3 and #4...
Best regards, Bruno |
Quote:
Many thanks for your help! I just want to let you know that this fix worked for me as well (centFOAM on centOS 6.4). Cheers, |
@Bruno,
Thanks a lot for your suggestion. Obviously Debian Wheezy has problems with openmpi 1.6.3, or vice versa. Following your proposal to use Wheezy's own openmpi (1.4.5) fixed the problem. I changed to "export WM_MPLIB=SYSTEMOPENMPI" in "OpenFOAM-2.2.x/etc/bashrc", recompiled, and after that it worked. Best Regards, Peter :) |
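The change Peter describes boils down to editing one line of `etc/bashrc`. A minimal sketch of doing that from the shell follows; `set_wm_mplib` is a hypothetical helper, and it assumes the file contains an uncommented line starting with `export WM_MPLIB=`, as in his quote. After the edit, the environment still has to be re-sourced and OpenFOAM recompiled, as he reports.

```shell
# Hypothetical helper (not shipped with OpenFOAM).
# Rewrites the uncommented "export WM_MPLIB=..." line of a bashrc-style file.
# VALUE is assumed to be a plain word such as SYSTEMOPENMPI.
# Usage: set_wm_mplib FILE VALUE
set_wm_mplib() {
    swm_file=$1
    swm_value=$2
    grep -q '^export WM_MPLIB=' "$swm_file" || return 1   # nothing to rewrite
    swm_tmp=$(mktemp) || return 1
    # Edit into a temporary copy first so a failed sed cannot truncate the file.
    if sed "s/^export WM_MPLIB=.*/export WM_MPLIB=$swm_value/" "$swm_file" > "$swm_tmp"; then
        cat "$swm_tmp" > "$swm_file"   # overwrite in place, keeping permissions
        swm_status=$?
    else
        swm_status=1
    fi
    rm -f "$swm_tmp"
    return "$swm_status"
}

# Example call, assuming the OpenFOAM environment is sourced:
#   set_wm_mplib "$WM_PROJECT_DIR/etc/bashrc" SYSTEMOPENMPI
```

Writing the value without a space after the `=` matters: with a space, bash would export an empty `WM_MPLIB`.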
A new version of centFoam is now up that fixes the problem and also includes the newest git pull, from 01-08-2013.
|
I have seen a friend of mine run OpenFOAM on a few nodes of a cluster, with one node as master and the other nodes as slaves (Rocks and CentOS).
-- KANNAN |
Problem with Centfoam
@linnemann
Hello linnemann, I am facing the same problem with the latest download of CentFoam 6.x which, as per SourceForge, was updated on 20-08-2013, but the problem persists. I also noticed that the version of MPI is now 1.7.2 rather than 1.6.3, for which the method to solve the problem was written. Regards, Aditya |
Hi
Just replace 1.6.3 with 1.7.2 in the instructions above to get it to work. I still don't know why this error happens after it gets unpacked on a different machine. |
I've got the same problem with a manual installation of centFOAM OpenFOAM-2.3.x for CentOS 6
The solution presented by linnemann didn't work for me. I'll try the python installation with centFOAM.py --OF22 and see if it works |
Hi!
I found somewhere the command: unset LD_PRELOAD which solved the problem for me. |
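`LD_PRELOAD` is the environment variable the dynamic linker reads to load extra shared libraries into every process, and the thread does not say what was preloaded on that system. If unsetting it for the whole session is undesirable, one option is to drop it for a single command only. A minimal sketch, where `run_without_preload` is a hypothetical wrapper name:

```shell
# Hypothetical wrapper: run one command with LD_PRELOAD removed from its
# environment, leaving the calling shell's own setting untouched.
# Usage: run_without_preload COMMAND [ARGS...]
run_without_preload() {
    (
        unset LD_PRELOAD
        "$@"
    )
}

# Example call:
#   run_without_preload mpirun -np 4 interFoam -parallel
```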
Hello,
I'm trying to run OpenFOAM 2.3.0 in parallel on CentOS 6.2 and I have the same problem described in the original post. I have already tried all the solutions proposed above, such as changing the build options to use the system's openmpi and recompiling, but the following message keeps coming up:
It looks like opal_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during opal_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):
opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
Everything works fine in serial simulation. I'd like to ask whether someone who had the same problem found another workaround and managed to run OpenFOAM 2.3 in parallel. Thanks a lot in advance, Greg |
Greetings Greg and welcome to the forum!
If you could provide more information on how you've proceeded to perform the installation of OpenFOAM 2.3.0 on your machine, it would help to start diagnosing why you're getting that error message. In addition, have a look at the following thread for more diagnostic steps: http://www.cfd-online.com/Forums/ope...-2-centos.html Best regards, Bruno |
Hello Bruno,
Sorry for the late reply. Actually, the link that you provided solved the problem in my case. More specifically, I followed the instructions in the last post of that thread and then the instructions in the second post of the following thread: http://www.cfd-online.com/Forums/ope...e-openmpi.html
Now I can run OpenFOAM 2.3.0 in parallel; however, I cannot use decomposePar, so for now I use decomposePar from OpenFOAM 2.1.1. Thanks a lot for the help! Best regards, Greg |
Hi guys!
For me the tip from linnemann Quote:
Surprisingly, the solver runs well in parallel mode on the login node of the RHEL server, but when I start it via the queueing system I get the following error message in a log file: Quote:
Quote:
So, would you be so kind as to suggest what the reason is and how I can fix this issue? Cheers, Alex |
I ran into the same problem before. It was solved by recompiling ThirdParty (Allwclean, and then Allwmake).
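The rebuild mentioned above (Allwclean, then Allwmake in the ThirdParty directory) can be sketched as a small function. The name `rebuild_third_party` is hypothetical, and it assumes the OpenFOAM environment is sourced so that `WM_THIRD_PARTY_DIR` is set, unless a directory is passed explicitly.

```shell
# Hypothetical helper: clean and rebuild the ThirdParty tree.
# Usage: rebuild_third_party [DIR]   (DIR defaults to $WM_THIRD_PARTY_DIR)
rebuild_third_party() {
    rtp_dir=${1:-${WM_THIRD_PARTY_DIR:-}}
    if [ -z "$rtp_dir" ] || [ ! -x "$rtp_dir/Allwclean" ] || [ ! -x "$rtp_dir/Allwmake" ]; then
        echo "Allwclean/Allwmake not found in '$rtp_dir'" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is not changed.
    ( cd "$rtp_dir" && ./Allwclean && ./Allwmake )
}
```

The build stops at the first script that fails, so a failing Allwclean never leads to a partial Allwmake.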
|
Hi Alexander,
Did you find a solution to your issue? I have the same problem and I tried to resolve it in the manner proposed by linnemann, but I didn't succeed. I need help, please. Thanks |
Dear stater,
I've just solved the problem in my case. The fix was:
1. Rebuilding ThirdParty with the system gcc
2. Running my tasks with the mpirun options: Quote:
|
I also successfully installed 2.3.x git version on CentOS 5.9 just following instructions:
https://openfoamwiki.net/index.php/I...CentOS_SL_RHEL Thank you, Bruno! ))) |
Quote:
http://www.cfd-online.com/Forums/ope...tml#post599384. I pasted some information in it. Thank you for your help, best regards, Jiaojiao |
Same problem appears in OF1606+
Dear all,
I have the same problem with the new OF1606+. I tried which mpirun, and it is the one included with OF1606+. I can't run wclean and Allwmake on the MPI build because of permissions under Docker containers. Any ideas, please? |
Quote:
Hi Foamers, I just wanted to highlight that the above solution is still valid. I had the same problem with foam-extend 4.0, I think due to having more than one version of MPI on my machine. I changed WM_MPLIB to SYSTEMOPENMPI, then recompiled foam, and it works! |
Quote:
I use foam-extend-4.0. There is no 'PStream' directory under $FOAM_SRC in order to make the changes that you've mentioned.:cool: |
Hi linnemann,
I faced exactly the same problem on Ubuntu 20.04 with HELYX-OS 2.4 and OpenFOAM 4.1. I found your solution, which worked for others as well; however, in my setup the directories you mentioned do not exist, so I am not able to run the suggested commands. Do you have any idea how I should modify your code? Thanks in advance, Feri |