CFD Online Discussion Forums (http://www.cfd-online.com/Forums/)
-   OpenFOAM Running, Solving & CFD (http://www.cfd-online.com/Forums/openfoam-solving/)
-   -   Problem running OpenFOAM 2.2.x in parallel in Centos 5 (http://www.cfd-online.com/Forums/openfoam-solving/117893-problem-running-openfoam-2-2-x-parallel-centos-5-a.html)

lvalvare May 16, 2013 18:36

Problem running OpenFOAM 2.2.x in parallel in Centos 5
 
Can you please help me with this problem?

I am trying to run OpenFOAM 2.2.x in parallel on a supercomputer running CentOS 5.

I am trying to run damBreak, a very simple case from the tutorials, in parallel. When I type:

mpirun -np 4 interFoam -parallel

The following error pops up:

It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
[saguaro1.local:22583] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 79
[saguaro1.local:22583] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file orterun.c at line 694

linnemann May 17, 2013 15:43

Hi, try the extra mpirun options given at the end of this post.

Also check that it's using the OpenFOAM mpirun and not the system mpirun.

You can check this by sourcing the OpenFOAM environment and then running

Code:

which mpirun
That should output a path containing "ThirdParty-2.2.x".

Code:

mpirun -np 4 -x LD_LIBRARY_PATH -x PATH -x WM_PROJECT_DIR -x WM_PROJECT_INST_DIR -x WM_OPTIONS -x FOAM_LIBBIN -x FOAM_APPBIN -x FOAM_USER_APPBIN -x MPI_BUFFER_SIZE interFoam -parallel
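
The path check described above can also be scripted. This is only a minimal sketch: check_mpirun_path is a hypothetical helper name (not part of OpenFOAM or Open MPI), and it assumes the OpenFOAM environment has already been sourced in the current shell.

```shell
#!/bin/sh
# check_mpirun_path is a hypothetical helper (not an OpenFOAM command).
# It classifies an mpirun path by whether it lies under a ThirdParty directory.
check_mpirun_path() {
    mpirun_path=$1
    case "$mpirun_path" in
        "")             echo "no mpirun found on PATH" ;;
        */ThirdParty-*) echo "OpenFOAM ThirdParty mpirun: $mpirun_path" ;;
        *)              echo "system mpirun: $mpirun_path" ;;
    esac
}

# Classify whichever mpirun is found first on PATH.
check_mpirun_path "$(command -v mpirun 2>/dev/null)"
```

If the result says "system mpirun", the OpenFOAM environment is probably not sourced, or another MPI installation comes earlier in PATH.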

lvalvare May 17, 2013 16:55

Hi Linnemann,

Thank you for your prompt response.

Indeed, it is using the OpenFOAM mpirun.

Code:

which mpirun
This is the output:

~/OpenFOAM/ThirdParty-2.2.x/platforms/linux64Gcc/openmpi-1.6.3/bin/mpirun

But, I am still getting the same error when I do:

Code:

mpirun -np 4 -x LD_LIBRARY_PATH -x PATH -x WM_PROJECT_DIR -x WM_PROJECT_INST_DIR -x WM_OPTIONS -x FOAM_LIBBIN -x FOAM_APPBIN -x FOAM_USER_APPBIN -x MPI_BUFFER_SIZE interFoam -parallel
The error displayed is:

--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------

Thank you.

Laura

lvalvare May 17, 2013 17:13

I personally think that this is a bug in the centFoam distribution rather than something specific to the supercomputer environment.

wyldckat May 17, 2013 17:59

Greetings to all!

@Laura:
Quote:

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
This message seems to indicate that Open MPI is not able to access the shared memory capabilities. From what I found online, this can happen when Open MPI was built with the wrong build options.

AFAIK, there are two possible solutions that should give the best results:
  1. You can try restricting Open MPI to a single transport, possibly Ethernet (TCP) or InfiniBand, and disabling all the others. I think this can be achieved with something like the following (the "self" component is included because Open MPI needs it for a process to communicate with itself):
    Code:

    mpirun --mca btl tcp,self -np 4 interFoam -parallel
    Or for IB:
    Code:

    mpirun --mca btl openib,self -np 4 interFoam -parallel
    For more information, read post #8 from here: http://www.cfd-online.com/Forums/ope...tml#post251468
  2. Or you should check which MPI toolbox the supercomputer itself provides and configure OpenFOAM's Pstream library to be built with that MPI toolbox. For this, have a look into OpenFOAM's "etc/bashrc" file and search for the entry "WM_MPLIB", adjust accordingly and source the file again... or simply start a new terminal :)
    If by any chance it's not a specific MPI toolbox from the ones exemplified in the "bashrc" file... Then see http://www.cfd-online.com/Forums/ope...tml#post340456 - see post #2
Best regards,
Bruno
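
For the second option, the current setting can be inspected before changing anything. A rough sketch, assuming the OpenFOAM environment is sourced so that WM_PROJECT_DIR is set; show_mplib_setting is a hypothetical helper name, and the ompi_info step only runs if that Open MPI tool is on PATH.

```shell
#!/bin/sh
# show_mplib_setting is a hypothetical helper: print every line that
# mentions WM_MPLIB in the given bashrc file, with line numbers.
show_mplib_setting() {
    grep -n 'WM_MPLIB' "$1" || echo "no WM_MPLIB entry in $1"
}

# Assumes WM_PROJECT_DIR points at the OpenFOAM installation.
if [ -n "${WM_PROJECT_DIR:-}" ] && [ -f "$WM_PROJECT_DIR/etc/bashrc" ]; then
    show_mplib_setting "$WM_PROJECT_DIR/etc/bashrc"
    echo "currently active: WM_MPLIB=${WM_MPLIB:-unset}"
fi

# List the shared-memory (shmem) components reported by Open MPI, if any.
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info | grep -i shmem || echo "no shmem components reported"
fi
```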

linnemann May 18, 2013 15:47

Hi

I installed a virtual machine and found the same problem with mpirun.

To fix it you need to have the system gcc and gcc-c++ installed.
Otherwise you will get a compile error.

So here comes the fix

Code:

rm -rf $FOAM_EXT_LIBBIN/../../linux64Gcc/openmpi-1.6.3
cd $FOAM_INST_DIR/ThirdParty-2.2.x
./Allwmake

cd $FOAM_SRC/Pstream/dummy
wclean
cd ../mpi
wclean
cd ..
./Allwmake

After that it was working for me.

I will have a centFoam version up with the fixes, but it will take some time to upload etc.
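
After the rebuild, it may be worth confirming which MPI library the Pstream library actually links against. This is a sketch under an assumption: that the rebuilt library sits at $FOAM_LIBBIN/$FOAM_MPI/libPstream.so, which may differ on other installations. report_mpi_linkage is a hypothetical helper name.

```shell
#!/bin/sh
# report_mpi_linkage is a hypothetical helper: show which libmpi a shared
# object is linked against, according to ldd.
report_mpi_linkage() {
    lib=$1
    if [ ! -f "$lib" ]; then
        echo "not found: $lib"
        return 1
    fi
    ldd "$lib" | grep -i 'libmpi' || echo "no libmpi dependency listed for $lib"
}

# Assumption: the Pstream library lives under $FOAM_LIBBIN/$FOAM_MPI.
if [ -n "${FOAM_LIBBIN:-}" ]; then
    report_mpi_linkage "$FOAM_LIBBIN/${FOAM_MPI:-}/libPstream.so" || true
fi
```

The libmpi path printed should point into the freshly rebuilt ThirdParty Open MPI directory rather than a system location.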

lvalvare May 20, 2013 13:54

Linnemann,

Thank you very much for your help. I followed your instructions and it worked for me as well.

Best.

Laura

PeterX30 May 20, 2013 17:50

It seems I'm experiencing the same problem.

I have updated from 2.1.x to 2.2.x (git repository / ThirdParty 2.2.0)
on the master node of a small 3-node cluster running Debian Wheezy.
So far everything works in parallel on the master node with its 8 cores, but I get the error described above as soon as I define the other nodes in the hostfile. I hoped to fix it quickly by following linnemann's instructions, but unfortunately that did not work in my situation. The same error occurred again.

The installation is only on the central master node, and all paths are defined correctly so that the other nodes can access the OpenFOAM executables on the master node. This worked fine with 2.1.x and previous installations. I have checked the path configuration for Open MPI, and "which mpirun" returns the same correct path of the 2.2.x installation on all nodes. I also checked and confirmed that I can start OpenFOAM executables on the master node from each node.

Any suggestions and help are highly welcome!

Best Regards,
Peter

wyldckat May 21, 2013 17:57

Greetings to all!

@Peter: I guess you ignored post #5 :rolleyes: More specifically, the second bullet point.
I say this because, from your description, it looks like you've been using the Open-MPI versions that are distributed with OpenFOAM. The problem is that OpenFOAM 2.2 provides Open-MPI 1.6.3, which might act/build in a different way from the older versions.

Therefore, if you use Debian's own Open-MPI version, you shouldn't have any more problems. Since you're using Debian Wheezy, I think you can take a look at these instructions that are directed towards installing OpenFOAM 2.2.0 on Ubuntu 12.04: http://openfoamwiki.net/index.php/In...u#Ubuntu_12.04 - more specifically steps #1, #3 and #4...

Best regards,
Bruno

Antons May 23, 2013 08:49

Quote:

Originally Posted by linnemann (Post 428473)

Hi everyone!
Many thanks for your help! I just want to let you know that this fix worked for me as well (centFOAM on CentOS 6.4).
Cheers,

PeterX30 May 25, 2013 11:24

@Bruno,

Thanks a lot for your suggestion. Apparently Debian Wheezy has problems with Open MPI 1.6.3, or vice versa. Following your proposal to use Wheezy's own Open MPI (1.4.5) fixed the problem. I changed to:

"export WM_MPLIB=SYSTEMOPENMPI"

in "OpenFOAM-2.2.x/etc/bashrc", recompiled and after that it worked.

Best Regards,
Peter:)
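
When editing that line, note that a shell assignment must not have whitespace around "=". The short illustration below is not taken from OpenFOAM's files; it only shows how the shell treats the two spellings.

```shell
#!/bin/sh
# With a space after '=', the shell assigns an empty string to WM_MPLIB and
# treats SYSTEMOPENMPI as a second variable name to export.
export WM_MPLIB= SYSTEMOPENMPI
echo "with space:    WM_MPLIB='$WM_MPLIB'"

# Without the space, WM_MPLIB receives the intended value.
export WM_MPLIB=SYSTEMOPENMPI
echo "without space: WM_MPLIB='$WM_MPLIB'"
```

After sourcing the edited bashrc, "echo $WM_MPLIB" in a new terminal is a quick way to confirm the value that the build will see.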

linnemann August 1, 2013 11:35

A new version of centFoam is now up that fixes the problem and also includes the latest git pull from 01-08-2013.

r_gordon August 30, 2013 10:06

Quote:

Originally Posted by linnemann (Post 428473)

This worked a treat :). Thanks for the help.

skyinventorbt October 9, 2013 00:01

I have seen my friend using OpenFOAM with a few nodes in a cluster, with one node as master and the other nodes as slaves. (Rocks & CentOS)
--
KANNAN

kingmaker October 17, 2013 08:57

Problem with Centfoam
 
@linnemann

Hello linnemann

I am facing the same problem with the latest download of centFoam 6.x, which according to SourceForge was updated on 20-08-2013. The problem persists.

I also noticed that the MPI version is now 1.7.2 rather than 1.6.3, for which the fix was written.

Regards
Aditya

linnemann October 18, 2013 04:54

Hi

Just replace 1.6.3 with 1.7.2 in the instructions for getting it to work.

I still don't know why this error happens after it gets unpacked on a different machine.
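
One way to avoid editing the version by hand is to generate the rebuild steps posted earlier in this thread for a given Open MPI version. The sketch below only prints the commands so they can be reviewed first; print_rebuild_steps is a hypothetical helper name, and the ThirdParty-2.2.x directory name is copied from the earlier post and may need adjusting for other installations.

```shell
#!/bin/sh
# print_rebuild_steps is a hypothetical helper: it prints (does not run) the
# rebuild commands from this thread for the Open MPI version given as $1.
print_rebuild_steps() {
    ompi_version=$1
    cat <<EOF
rm -rf \$FOAM_EXT_LIBBIN/../../linux64Gcc/openmpi-$ompi_version
cd \$FOAM_INST_DIR/ThirdParty-2.2.x
./Allwmake
cd \$FOAM_SRC/Pstream/dummy
wclean
cd ../mpi
wclean
cd ..
./Allwmake
EOF
}

print_rebuild_steps 1.7.2
```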

venturi March 11, 2014 09:48

I've got the same problem with a manual installation of centFOAM OpenFOAM-2.3.x for CentOS 6.

The solution presented by linnemann didn't work for me.

I'll try the Python installation with centFOAM.py --OF22 and see if it works.

ma-tri-x May 9, 2014 10:23

Hi!

I found this command somewhere:
unset LD_PRELOAD

which solved the problem for me.
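
To see whether this applies, the variable can be inspected before it is cleared. A minimal sketch: the unset only affects the current shell, so the solver has to be launched from that same shell afterwards.

```shell
#!/bin/sh
# Show whether anything is being preloaded into every new process.
echo "LD_PRELOAD before: '${LD_PRELOAD:-}'"

# Clear it for the current shell only.
unset LD_PRELOAD
echo "LD_PRELOAD after:  '${LD_PRELOAD:-}'"
```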

greg.cfd June 11, 2014 02:39

Hello,

I'm trying to run OpenFOAM 2.3.0 in parallel on CentOS 6.2 and I have the same problem described in the original post.

I have already tried all the solutions proposed above, such as changing the build options to use the system's Open MPI and recompiling. However, the following message keeps coming up:

It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS

Everything works fine in serial simulations.

I'd like to ask if someone who had the same problem found another workaround and managed to run OpenFOAM 2.3 in parallel.

Thanks a lot in advance,
Greg

wyldckat June 28, 2014 15:28

Greetings Greg and welcome to the forum!

If you could provide more information on how you've proceeded to perform the installation of OpenFOAM 2.3.0 on your machine, it would help to start diagnosing why you're getting that error message.

In addition, have a look at the following thread for more diagnostic steps: http://www.cfd-online.com/Forums/ope...-2-centos.html

Best regards,
Bruno
