CFD Online Discussion Forums

lvalvare May 16, 2013 18:36

Problem running OpenFOAM 2.2.x in parallel in Centos 5
 
Can you please help me with this problem?

I am trying to run OpenFOAM 2.2.x in parallel. I am working on a CentOS 5 supercomputer.

I am trying to run a very simple tutorial case, damBreak, in parallel. When I type:

mpirun -np 4 interFoam -parallel

The following error pops up:

It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
[saguaro1.local:22583] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 79
[saguaro1.local:22583] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file orterun.c at line 694

linnemann May 17, 2013 15:43

Hi, try these extra inputs to mpirun (second command below).

Also check that it's using the OpenFOAM mpirun and not the system mpirun.

You can check this by sourcing the OF environment and then do

Code:

which mpirun
That should output a path which contains "ThirdParty-2.2.x".

Code:

mpirun -np 4 -x LD_LIBRARY_PATH -x PATH -x WM_PROJECT_DIR -x WM_PROJECT_INST_DIR -x WM_OPTIONS -x FOAM_LIBBIN -x FOAM_APPBIN -x FOAM_USER_APPBIN -x MPI_BUFFER_SIZE interFoam -parallel
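
The "which mpirun" check above can also be scripted. This is only a minimal sketch: it assumes that an OpenFOAM-provided mpirun always lives under a directory named "ThirdParty-<version>", as the advice above suggests, and `mpirun_origin` is a hypothetical helper name, not part of OpenFOAM.

```shell
#!/bin/sh
# Classify an mpirun path: does it come from an OpenFOAM ThirdParty tree?
# The "ThirdParty-" substring test is a heuristic taken from the post above.
mpirun_origin() {
    case "$1" in
        */ThirdParty-*/*) echo "openfoam-thirdparty" ;;
        "")               echo "not-found" ;;
        *)                echo "other" ;;
    esac
}

# Report on whichever mpirun comes first in PATH (empty if none is installed).
found="$(command -v mpirun 2>/dev/null || true)"
echo "mpirun: ${found:-<none>} -> $(mpirun_origin "$found")"
```

Run it after sourcing the OpenFOAM environment; a result of "other" would suggest that a system mpirun is being picked up first.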

lvalvare May 17, 2013 16:55

Hi Linnemann,

Thank you for your prompt response.

Indeed, it is using the OpenFOAM mpirun.

Code:

which mpirun
This is the output:

~/OpenFOAM/ThirdParty-2.2.x/platforms/linux64Gcc/openmpi-1.6.3/bin/mpirun

But, I am still getting the same error when I do:

Code:

mpirun -np 4 -x LD_LIBRARY_PATH -x PATH -x WM_PROJECT_DIR -x WM_PROJECT_INST_DIR -x WM_OPTIONS -x FOAM_LIBBIN -x FOAM_APPBIN -x FOAM_USER_APPBIN -x MPI_BUFFER_SIZE interFoam -parallel
Error displays:

--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------

Thank you.

Laura

lvalvare May 17, 2013 17:13

I personally think that this is a bug in the centfoam distribution rather than something specific to the supercomputer environment.

wyldckat May 17, 2013 17:59

Greetings to all!

@Laura:
Quote:

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
This message seems to indicate that it's not able to access the shared memory capabilities. From what I found online, this can happen when the wrong build options are chosen.

AFAIK, there are two possible solutions that should give the best results:
  1. You can try disabling all communication transports except one specific one, possibly Ethernet or InfiniBand... I think this can be achieved with something like:
    Code:

    mpirun --mca btl tcp -np 4 interFoam -parallel
    Or for IB:
    Code:

    mpirun --mca btl openib -np 4 interFoam -parallel
    For more information, read post #8 from here: http://www.cfd-online.com/Forums/ope...tml#post251468
  2. Or you should check which is the supercomputer's own MPI toolbox that should be used and configure OpenFOAM's Pstream library to be built with that MPI toolbox. For this, have a look into OpenFOAM's "etc/bashrc" file and search for the entry "WM_MPLIB", adjust accordingly and source the file again... or simply start a new terminal :)
    If by any chance it's not a specific MPI toolbox from the ones exemplified in the "bashrc" file... Then see http://www.cfd-online.com/Forums/ope...tml#post340456 - see post #2
Best regards,
Bruno
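
A hedged sketch of the first option. Open MPI's documentation recommends listing the "self" loopback component alongside any explicitly selected BTL, so the helper below appends it; that is an addition to the commands in the post above, not something the poster wrote. `btl_args` is a hypothetical helper name.

```shell
#!/bin/sh
# Build the "--mca btl" arguments for one transport, plus the "self"
# loopback component (an addition to the commands quoted in the post).
btl_args() {
    case "$1" in
        tcp|openib) printf '%s\n' "--mca btl $1,self" ;;
        *) echo "unknown transport: $1" >&2; return 1 ;;
    esac
}

# Print the full command instead of running it, so it can be inspected first.
echo "mpirun $(btl_args tcp) -np 4 interFoam -parallel"
```

Printing the command first makes it easy to confirm the transport list before launching a real job.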

linnemann May 18, 2013 15:47

Hi

I installed a virtual machine and found the same problem with mpirun.

To fix it you need to have the system gcc and gcc-c++ installed.
Otherwise you will get a compile error.

So here comes the fix

Code:

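# Remove the existing Open MPI build
rm -rf $FOAM_EXT_LIBBIN/../../linux64Gcc/openmpi-1.6.3
# Rebuild ThirdParty (including Open MPI) with the system compiler
cd $FOAM_INST_DIR/ThirdParty-2.2.x
./Allwmake

# Clean and rebuild the Pstream libraries against the rebuilt Open MPI
cd $FOAM_SRC/Pstream/dummy
wclean
cd ../mpi
wclean
cd ..
./Allwmake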

After that it was working for me.

I will have a centFoam version up with the fixes, but it will take some time to upload etc.

lvalvare May 20, 2013 13:54

Linnemann,

Thank you very much for your help. I followed your instructions and it worked for me as well.

Best.

Laura

PeterX30 May 20, 2013 17:50

It seems I'm experiencing the same problem.

I have updated from 2.1.x to 2.2.x (git repository / Third Party 2.2.0)
on Debian Wheezy in a small 3-node cluster, on the master node.
While so far everything works in parallel on the master node with its 8 cores, I get the error described above as soon as I define the other nodes in the hostfile. I hoped to get it fixed quickly by following linnemann's instructions, but unfortunately that was not the case in my situation. The same error occurred again.

The installation is only on the central master node, and all paths are defined correctly to access the OpenFOAM executables on the master node. This worked fine with 2.1.x and previous installations. I have checked the path configuration for Open MPI, and "which mpirun" returns the same correct path of the 2.2.x installation from all nodes. I also checked and confirmed that I can start the OF executables on the master node from each node.

Any suggestions and help are highly welcome!!!

Best Regards,
Peter

wyldckat May 21, 2013 17:57

Greetings to all!

@Peter: I guess you ignored post #5 :rolleyes: More specifically, the second bullet point.
I say this because from your description, it looks like you've been using the Open-MPI versions that are distributed with OpenFOAM. The problem is that OpenFOAM 2.2 provides Open-MPI 1.6.3, which might act/build in a different way from the older versions.

Therefore, if you use Debian's own Open-MPI version, you shouldn't have any more problems. Since you're using Debian Wheezy, I think you can take a look at these instructions that are directed towards installing OpenFOAM 2.2.0 on Ubuntu 12.04: http://openfoamwiki.net/index.php/In...u#Ubuntu_12.04 - more specifically steps #1, #3 and #4...

Best regards,
Bruno

Antons May 23, 2013 08:49

Quote:

Originally Posted by linnemann (Post 428473)

Hi everyone!
Many thanks for your help! I just want to let you know that this fix worked for me as well (centFOAM on CentOS 6.4).
Cheers,

PeterX30 May 25, 2013 11:24

@ Bruno,

thanks a lot for your suggestion. Obviously Debian Wheezy has problems with openmpi 1.6.3, or vice versa. Following your proposal to use Wheezy's own openmpi (1.4.5) fixed the problem. I changed to:

"export WM_MPLIB=SYSTEMOPENMPI"

in "OpenFOAM-2.2.x/etc/bashrc", recompiled and after that it worked.

Best Regards,
Peter:)
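
The bashrc change can be sketched as a small script. This is an illustration under assumptions: the file is expected to contain a line beginning with "export WM_MPLIB=" (the stock 2.2.x bashrc is assumed to have one), and `set_wm_mplib` is a hypothetical helper name. Note that there must be no space after the "=" sign, otherwise the shell assigns an empty value.

```shell
#!/bin/sh
# Set WM_MPLIB in a bashrc-style file, keeping a .bak copy of the original.
# Usage: set_wm_mplib FILE VALUE   (VALUE must not contain "/" characters)
set_wm_mplib() {
    file="$1"
    value="$2"
    cp "$file" "$file.bak" &&
    sed "s/^export WM_MPLIB=.*/export WM_MPLIB=$value/" "$file.bak" > "$file"
}

# Demonstration on a scratch file rather than a real OpenFOAM installation.
demo="$(mktemp)"
printf 'export WM_MPLIB=OPENMPI\n' > "$demo"
set_wm_mplib "$demo" SYSTEMOPENMPI
grep WM_MPLIB "$demo"
```

On a real installation the call would be something like `set_wm_mplib "$WM_PROJECT_DIR/etc/bashrc" SYSTEMOPENMPI`, followed by sourcing the file again and recompiling, as described in the post.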

linnemann August 1, 2013 11:35

A new version of centFoam is now up that fixes the problem and also includes the newest git pull from 01-08-2013.

r_gordon August 30, 2013 10:06

Quote:

Originally Posted by linnemann (Post 428473)

This worked a treat :). Thanks for the help.

skyinventorbt October 9, 2013 00:01

I have seen my friend using OpenFOAM with a few nodes in a cluster, with one node as master and the other nodes as slaves. (Rocks & CentOS)
--
KANNAN

kingmaker October 17, 2013 08:57

Problem with Centfoam
 
@linnemann

Hello linnemann

I am facing the same problem with the latest download of CentFoam 6.x, which as per SourceForge was updated on 20-08-2013. The problem persists.

I also noticed that now the version of MPI is 1.7.2 rather than 1.6.3 for which the method to solve the problem is defined.

Regards
Aditya

linnemann October 18, 2013 04:54

Hi

Just replace 1.6.3 with 1.7.2 in the fix I posted above.

I still don't know why this error happens after it gets unpacked on a different machine.
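
To avoid hard-coding the Open MPI version, the directory to remove can be derived from the environment. A minimal sketch, assuming the OpenFOAM bashrc sets WM_THIRD_PARTY_DIR, WM_ARCH, WM_COMPILER and FOAM_MPI (for example "openmpi-1.7.2") as it does in the 2.x series; `openmpi_build_dir` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the ThirdParty Open MPI build directory implied by the environment.
# Assumes WM_THIRD_PARTY_DIR, WM_ARCH, WM_COMPILER and FOAM_MPI are set.
openmpi_build_dir() {
    for name in WM_THIRD_PARTY_DIR WM_ARCH WM_COMPILER FOAM_MPI; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set; source the OpenFOAM bashrc first" >&2
            return 1
        fi
    done
    echo "$WM_THIRD_PARTY_DIR/platforms/$WM_ARCH$WM_COMPILER/$FOAM_MPI"
}

# Dry run: show what would be removed rather than removing it.
if dir="$(openmpi_build_dir)"; then
    echo "would remove: $dir"
fi
```

Check the printed path against the actual directory listing before passing it to "rm -rf".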

venturi March 11, 2014 08:48

I've got the same problem with a manual installation of centFOAM OpenFOAM-2.3.x for CentOS 6.

The solution presented by linnemann didn't work for me.

I'll try the python installation with centFOAM.py --OF22 and see if it works

ma-tri-x May 9, 2014 10:23

Hi!

I found this command somewhere:
unset LD_PRELOAD

which solved the problem for me.
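
The post does not say why this helps. LD_PRELOAD injects the listed libraries into every process started from the shell, which would include the ones launched by mpirun. A small sketch for inspecting the variable before clearing it; `report_and_clear_preload` is a hypothetical helper name.

```shell
#!/bin/sh
# Show what LD_PRELOAD currently contains, then clear it for this shell session.
report_and_clear_preload() {
    if [ -n "${LD_PRELOAD:-}" ]; then
        echo "LD_PRELOAD was: $LD_PRELOAD"
        unset LD_PRELOAD
    else
        echo "LD_PRELOAD is not set"
    fi
}

report_and_clear_preload
```

The change only lasts for the current shell; if the variable is set in a login script or by a module system, it will come back in new sessions.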

greg.cfd June 11, 2014 02:39

Hello,

I'm trying to run OpenFoam 2.3.0 in parallel on CentOS 6.2 and I have the same problem described in the original post.

I have already tried all the solutions proposed above, like changing the build options to use the system's openmpi and recompiling. However, the following message keeps coming up:

It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS

Everything works fine in serial simulation.

I'd like to ask if someone who had the same problem found another workaround and managed to run OpenFoam 2.3 in parallel.

Thanks a lot in advance,
Greg

wyldckat June 28, 2014 15:28

Greetings Greg and welcome to the forum!

If you could provide more information on how you've proceeded to perform the installation of OpenFOAM 2.3.0 on your machine, it would help to start diagnosing why you're getting that error message.

In addition, have a look at the following thread for more diagnostic steps: http://www.cfd-online.com/Forums/ope...-2-centos.html

Best regards,
Bruno

greg.cfd July 4, 2014 02:32

Hello Bruno,

Sorry for the late reply. Actually the link that you provided solved the problem for my case.

More specifically, I followed the instructions in the last post of the thread and then the instructions in the second post of the following thread:

http://www.cfd-online.com/Forums/ope...e-openmpi.html

Now I can run OpenFOAM 2.3.0 in parallel; however, I cannot use decomposePar. So for now I use decomposePar from OpenFOAM 2.1.1.

Thanks a lot for the help!

Best regards,
Greg

makaveli_lcf September 9, 2014 09:37

Hi guys!

For me the tip from linnemann

Quote:

Originally Posted by linnemann (Post 428473)

worked well to:
  1. build and run OF2.2.x on CentOS 5.9 (final)
  2. build but NOT TO RUN OF2.3.x on RHEL Server 6.4 cluster

Surprisingly, the solver runs well in parallel mode on the login node of the RHEL Server, but when I start it via the queueing system I get the following error message in a log file:

Quote:

interFoam: relocation error: /home/avakhrushev/OpenFOAM/ThirdParty-2.3.x/platforms/linux64Gcc/openmpi-1.6.5/lib64/openmpi/mca_btl_openib.so: symbol rdma_get_src_port, version RDMACM_1.0 not defined in file librdmacm.so.1 with link time reference
interFoam: relocation error: /home/avakhrushev/OpenFOAM/ThirdParty-2.3.x/platforms/linux64Gcc/openmpi-1.6.5/lib64/openmpi/mca_btl_openib.so: symbol rdma_get_src_port, version RDMACM_1.0 not defined in file librdmacm.so.1 with link time reference
interFoam: relocation error: /home/avakhrushev/OpenFOAM/ThirdParty-2.3.x/platforms/linux64Gcc/openmpi-1.6.5/lib64/openmpi/mca_btl_openib.so: symbol rdma_get_src_port, version RDMACM_1.0 not defined in file librdmacm.so.1 with link time reference
interFoam: relocation error: /home/avakhrushev/OpenFOAM/ThirdParty-2.3.x/platforms/linux64Gcc/openmpi-1.6.5/lib64/openmpi/mca_btl_openib.so: symbol rdma_get_src_port, version RDMACM_1.0 not defined in file librdmacm.so.1 with link time reference
interFoam: relocation error: /home/avakhrushev/OpenFOAM/ThirdParty-2.3.x/platforms/linux64Gcc/openmpi-1.6.5/lib64/openmpi/mca_btl_openib.so: symbol rdma_get_src_port, version RDMACM_1.0 not defined in file librdmacm.so.1 with link time reference
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 120795 on
node n009 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
I also tried the options suggested by wyldckat

Quote:

mpirun --mca btl tcp -np 12 interFoam -parallel

mpirun --mca btl openib -np 12 interFoam -parallel
but that does not help.

So, would you be so kind as to suggest what the reason is and how I can fix this issue?

Cheers,
Alex

yejungong September 29, 2014 02:45

I met the same problem before. It was solved by recompiling ThirdParty (Allwclean, and then Allwmake).

stater November 24, 2014 14:03

Hi Alexander,

Did you find a solution to your issue? I have the same problem and I tried to resolve it in the manner proposed by linnemann, but I didn't succeed.
I need help please,

Thanks

makaveli_lcf April 21, 2015 04:01

Dear stater,

I've just solved the problem in my case. The fix was:

1. Rebuilding ThirdParty with the system gcc

2. Running my tasks with the mpirun options:
Quote:

mpirun --mca btl ^openib --mca btl_tcp_if_include eth0 -np 12 interFoam -parallel
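
In that command, "^openib" excludes the openib (InfiniBand) BTL and "btl_tcp_if_include eth0" restricts the TCP BTL to the interface named eth0. The name eth0 comes from the poster's cluster and may differ elsewhere. A minimal sketch for listing the interface names on a Linux node, assuming the usual /sys/class/net layout; `list_interfaces` is a hypothetical helper name.

```shell
#!/bin/sh
# List network interface names from a sysfs-style directory.
# Defaults to /sys/class/net, which is where Linux exposes them.
list_interfaces() {
    dir="${1:-/sys/class/net}"
    for path in "$dir"/*; do
        if [ -e "$path" ]; then
            basename "$path"
        fi
    done
    return 0
}

list_interfaces
```

Run it on a compute node rather than the login node, since the two may not have the same interfaces.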

makaveli_lcf April 21, 2015 07:56

I also successfully installed the 2.3.x git version on CentOS 5.9 just by following these instructions:

https://openfoamwiki.net/index.php/I...CentOS_SL_RHEL

Thank you, Bruno! )))

jiaojiao May 6, 2016 04:40

Quote:

Originally Posted by linnemann (Post 428473)

I followed the steps to fix the same problem, but I can't finish the last Allwmake step. It reports that it cannot find the mpi.h file. I don't know what to do. Could you give me some ideas? Thanks!

wyldckat May 8, 2016 15:04

Quote:

Originally Posted by jiaojiao (Post 598950)

Quick answer: Please provide more specific details. For example:
  1. Which exact Linux Distribution and version are you using?
  2. Which OpenFOAM version are you trying to install?
  3. Which installation instructions did you follow for installing OpenFOAM?

jiaojiao May 10, 2016 01:47

Quote:

Originally Posted by wyldckat (Post 599201)

Thank you for your quick answer, Bruno. The version of OpenFOAM I am trying to install is 2.2.2 with Red Hat 4.8. I posted a new thread:
http://www.cfd-online.com/Forums/ope...tml#post599384. I pasted some information in it.
Thank you for your help,
best regards,
Jiaojiao

philocfd June 27, 2016 11:17

Quote:

Originally Posted by linnemann (Post 428473)

This worked well and fixed my problem, Thank you!

kebsiali July 2, 2016 13:00

Same problem appears in OF1606+
 
Dear all,

I have the same problem with the new OF1606+.
I tried "which mpirun" and it is the one included with OF1606+.
I can't run wclean and Allwmake for the mpi library because of permissions under Docker containers.

Any ideas please

Saleh Abuhanieh October 12, 2018 00:24

Quote:

Originally Posted by PeterX30 (Post 429988)

Hi Foamers,


I just wanted to highlight that the above solution is still valid.
I had the same problem with foam-extend 4.0, I think due to having more than one version of MPI on my machine.
I've changed WM_MPLIB to SYSTEMOPENMPI
then recompiled foam
and it works!

amuzeshi September 21, 2019 04:37

Quote:

Originally Posted by linnemann (Post 428473)

Hi,
I use foam-extend-4.0. There is no 'PStream' directory under $FOAM_SRC in order to make the changes that you've mentioned.:cool:

safranyikf December 9, 2020 08:44

Hi linnemann,


I faced exactly the same problem on Ubuntu 20.04 with HELYX-OS 2.4 and OpenFOAM 4.1.
I found your solution, which worked for others as well; however, in my setup the directories you mentioned do not exist, so I am not able to run the suggested commands. Do you have any idea how I should modify your code?


Thanks in advance, Feri


All times are GMT -4.