Problem running OpenFOAM 2.2.x in parallel in Centos 5

Old   May 16, 2013, 18:36
Default Problem running OpenFOAM 2.2.x in parallel in Centos 5
  #1
New Member
 
Laura Alvarez
Join Date: Jun 2012
Posts: 4
Can you please help me with this problem?

I am trying to run OpenFOAM 2.2.x in parallel on a supercomputer running CentOS 5.

I am running a very simple case from the tutorials, damBreak, in parallel. When I type:

mpirun -np 4 interFoam -parallel

the following error pops up:

It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
[saguaro1.local:22583] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 79
[saguaro1.local:22583] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file orterun.c at line 694

Old   May 17, 2013, 15:43
Default
  #2
Senior Member
 
linnemann's Avatar
 
Niels Nielsen
Join Date: Mar 2009
Location: NJ - Denmark
Posts: 473
Hi, try the extra mpirun arguments shown at the end of this post.

Also check that you are using the OpenFOAM mpirun and not the system mpirun.

You can check this by sourcing the OF environment and then running

Code:
which mpirun
That should output a path containing "ThirdParty-2.2.x".

Code:
mpirun -np 4 -x LD_LIBRARY_PATH -x PATH -x WM_PROJECT_DIR -x WM_PROJECT_INST_DIR -x WM_OPTIONS -x FOAM_LIBBIN -x FOAM_APPBIN -x FOAM_USER_APPBIN -x MPI_BUFFER_SIZE interFoam -parallel
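A minimal sketch of the "which mpirun" check as a script, assuming a POSIX shell in which the OpenFOAM environment has already been sourced. The function name is_thirdparty_mpirun is made up for illustration; it only tests whether the path contains a ThirdParty-<version> directory, which is the criterion described above.

```shell
# Succeeds when the given mpirun path lies inside an OpenFOAM ThirdParty tree.
# (is_thirdparty_mpirun is a hypothetical helper, not part of OpenFOAM.)
is_thirdparty_mpirun() {
    case "$1" in
        */ThirdParty-*/*) return 0 ;;
        *)                return 1 ;;
    esac
}

# Resolve mpirun the same way the shell would when launching it.
mpirun_path=$(command -v mpirun || true)

if [ -z "$mpirun_path" ]; then
    echo "mpirun not found on PATH; source the OpenFOAM environment first"
elif is_thirdparty_mpirun "$mpirun_path"; then
    echo "OpenFOAM ThirdParty mpirun: $mpirun_path"
else
    echo "system mpirun: $mpirun_path"
fi
```

If the last branch is taken, the system Open MPI comes first on PATH and the -x arguments above will not change which mpirun is used.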
__________________
Linnemann

PS. I do not do personal support, so please post in the forums.

Old   May 17, 2013, 16:55
Default
  #3
New Member
 
Laura Alvarez
Join Date: Jun 2012
Posts: 4
Hi Linnemann,

Thank you for your prompt response.

Indeed, it is using the OpenFOAM mpirun.

Code:
which mpirun
This is the output:

~/OpenFOAM/ThirdParty-2.2.x/platforms/linux64Gcc/openmpi-1.6.3/bin/mpirun

But I still get the same error when I run:

Code:
mpirun -np 4 -x LD_LIBRARY_PATH -x PATH -x WM_PROJECT_DIR -x WM_PROJECT_INST_DIR -x WM_OPTIONS -x FOAM_LIBBIN -x FOAM_APPBIN -x FOAM_USER_APPBIN -x MPI_BUFFER_SIZE interFoam -parallel
The error displayed is:

--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------

Thank you.

Laura

Old   May 17, 2013, 17:13
Default
  #4
New Member
 
Laura Alvarez
Join Date: Jun 2012
Posts: 4
I personally think that this is a bug in the centFoam distribution rather than something specific to the supercomputer environment.

Old   May 17, 2013, 17:59
Default
  #5
Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 9,748
Blog Entries: 39
Greetings to all!

@Laura:
Quote:
opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS
This message seems to indicate that Open MPI is not able to use its shared memory capabilities. From what I found online, this can happen when the wrong build options are chosen.

AFAIK, there are two possible solutions that should give the best results:
  1. Disable all communication methods except a specific one, possibly Ethernet or InfiniBand. I think this can be achieved with something like the following (the "self" component is listed as well, since Open MPI needs it for a process to send messages to itself):
    Code:
    mpirun --mca btl self,tcp -np 4 interFoam -parallel
    Or for IB:
    Code:
    mpirun --mca btl self,openib -np 4 interFoam -parallel
    For more information, read post #8 from here: Almost have my cluster running openfoam, but not quite...
  2. Or check which MPI toolbox the supercomputer itself provides and configure OpenFOAM's Pstream library to be built with that MPI toolbox. For this, look in OpenFOAM's "etc/bashrc" file for the entry "WM_MPLIB", adjust it accordingly and source the file again, or simply start a new terminal.
    If by any chance it's not one of the MPI toolboxes exemplified in the "bashrc" file, then see Problems with MPI implementation - see post #2
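For the first option, a small dry-run helper can make it easier to compare transports without retyping the command. This is only a sketch: btl_command is a hypothetical name, it prints the command instead of running it, and it always adds Open MPI's "self" component next to the chosen transport, because Open MPI's documentation recommends listing "self" whenever the BTL list is restricted.

```shell
# Print (do not execute) an mpirun command restricted to one transport.
# Usage: btl_command <tcp|openib> <nprocs> <solver> [solver args...]
# (btl_command is a hypothetical helper, not part of Open MPI or OpenFOAM.)
btl_command() {
    transport=$1
    nprocs=$2
    shift 2
    printf 'mpirun --mca btl self,%s -np %s' "$transport" "$nprocs"
    # Append the solver and its arguments, one space before each.
    printf ' %s' "$@"
    printf '\n'
}

btl_command tcp 4 interFoam -parallel
btl_command openib 4 interFoam -parallel
```

Once the printed line looks right, it can be pasted into the terminal or run with eval.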
Best regards,
Bruno

Old   May 18, 2013, 15:47
Default
  #6
Senior Member
 
linnemann's Avatar
 
Niels Nielsen
Join Date: Mar 2009
Location: NJ - Denmark
Posts: 473
Hi

I installed a virtual machine and reproduced the problem with mpirun.

The fix rebuilds Open MPI, so you need to have the system gcc and gcc-c++ packages installed. Otherwise you will get a compile error.

Here is the fix:

Code:
rm -rf $FOAM_EXT_LIBBIN/../../linux64Gcc/openmpi-1.6.3
cd $FOAM_INST_DIR/ThirdParty-2.2.x
./Allwmake

cd $FOAM_SRC/Pstream/dummy
wclean
cd ../mpi
wclean
cd ..
./Allwmake
After that it was working for me.
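The steps above depend on FOAM_EXT_LIBBIN, FOAM_INST_DIR and FOAM_SRC being set. If the OpenFOAM environment has not been sourced, the rm -rf line resolves to an unintended path. A guarded variant might look like the sketch below. The function names are made up for illustration, and the directory names are copied from the post, so adjust the Open MPI version if your ThirdParty tree ships a different one.

```shell
# Fail early if any OpenFOAM variable used by the rebuild is unset or empty.
# (require_foam_env and rebuild_openmpi_and_pstream are hypothetical helpers.)
require_foam_env() {
    for name in FOAM_EXT_LIBBIN FOAM_INST_DIR FOAM_SRC; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "error: $name is not set; source OpenFOAM's etc/bashrc first" >&2
            return 1
        fi
    done
}

# Same steps as in the post, with quoting and a stop on the first failure.
rebuild_openmpi_and_pstream() {
    require_foam_env || return 1
    # Remove the prebuilt Open MPI so that ThirdParty's Allwmake rebuilds it.
    rm -rf "$FOAM_EXT_LIBBIN/../../linux64Gcc/openmpi-1.6.3"
    (cd "$FOAM_INST_DIR/ThirdParty-2.2.x" && ./Allwmake) || return 1
    # Clean and rebuild the Pstream libraries against the new Open MPI.
    (cd "$FOAM_SRC/Pstream/dummy" && wclean) || return 1
    (cd "$FOAM_SRC/Pstream/mpi" && wclean) || return 1
    (cd "$FOAM_SRC/Pstream" && ./Allwmake)
}
```

After sourcing the OpenFOAM environment, calling rebuild_openmpi_and_pstream runs the whole sequence. Nothing is deleted or built until that call is made.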

I will have a centFoam version up with the fixes, but it will take some time to upload etc.

Old   May 20, 2013, 13:54
Default
  #7
New Member
 
Laura Alvarez
Join Date: Jun 2012
Posts: 4
Linnemann,

Thank you very much for your help. I followed your instructions and it worked for me as well.

Best.

Laura

Old   May 20, 2013, 17:50
Default
  #8
New Member
 
anonymous
Join Date: Jan 2012
Posts: 7
It seems I'm experiencing the same problem.

I have updated from 2.1.x to 2.2.x (git repository / Third Party 2.2.0) on the master node of a small 3-node cluster running Debian Wheezy.
So far everything works in parallel on the master node with its 8 cores, but I get the error described above as soon as I define the other nodes in the hostfile. I hoped to fix it quickly by following linnemann's instructions, but unfortunately that did not work in my situation. The same error occurred again.

The installation is only on the central master node, and all paths are defined correctly to access the OpenFOAM executables on the master node. This worked fine with 2.1.x and previous installations. I have checked the path configuration for Open MPI, and "which mpirun" gives the same correct path of the 2.2.x installation from all nodes. I also checked and confirmed that I can start OF executables on the master node from each node.

Any suggestions and help are highly welcome!

Best Regards,
Peter
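The per-node "which mpirun" check described in this post can be scripted. The sketch below assumes an Open MPI style hostfile (one host per line, optionally followed by slots=N, with # for comments) and password-less ssh to every node. Both function names are made up for illustration. Note that a non-interactive ssh session may read different startup files than a login shell, so the path reported here can differ from what you see when logged in on the node.

```shell
# Print the host names from an Open MPI hostfile, skipping blanks and comments.
# (hosts_in_hostfile and check_mpirun_on_hosts are hypothetical helpers.)
hosts_in_hostfile() {
    awk '$1 ~ /^#/ { next } NF { print $1 }' "$1"
}

# Ask every node which mpirun a non-interactive shell resolves.
check_mpirun_on_hosts() {
    for host in $(hosts_in_hostfile "$1"); do
        printf '%s: ' "$host"
        ssh -o BatchMode=yes "$host" 'command -v mpirun' </dev/null \
            || echo "(mpirun not found or host unreachable)"
    done
}
```

For example, check_mpirun_on_hosts ./machines prints one line per node, where ./machines stands for whatever hostfile is passed to mpirun.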

Old   May 21, 2013, 17:57
Default
  #9
Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 9,748
Blog Entries: 39
Greetings to all!

@Peter: I guess you ignored post #5, more specifically the second bullet point.
I say this because, from your description, it looks like you've been using the Open-MPI versions that are distributed with OpenFOAM. The problem is that OpenFOAM 2.2 provides Open-MPI 1.6.3, which might act/build in a different way from the older versions.

Therefore, if you use Debian's own Open-MPI version, you shouldn't have any more problems. Since you're using Debian Wheezy, I think you can take a look at these instructions, which are directed towards installing OpenFOAM 2.2.0 on Ubuntu 12.04: http://openfoamwiki.net/index.php/In...u#Ubuntu_12.04 - more specifically steps #1, #3 and #4...

Best regards,
Bruno

Old   May 23, 2013, 08:49
Default
  #10
New Member
 
Join Date: May 2013
Posts: 1
Quote:
Originally Posted by linnemann (post #6, the Open MPI and Pstream rebuild steps)
Hi everyone!
Many thanks for your help! I just want to let you know that this fix worked for me as well (centFoam on CentOS 6.4).
Cheers,

Old   May 25, 2013, 11:24
Default
  #11
New Member
 
anonymous
Join Date: Jan 2012
Posts: 7
@ Bruno,

Thanks a lot for your suggestion. Apparently Debian Wheezy has problems with Open MPI 1.6.3, or vice versa. Following your proposal to use Wheezy's own Open MPI (1.4.5) fixed the problem. I changed the setting to:

"export WM_MPLIB=SYSTEMOPENMPI"

in "OpenFOAM-2.2.x/etc/bashrc", recompiled, and after that it worked.

Best Regards,
Peter
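The bashrc change described in this post can also be made non-interactively. The sketch below assumes the setting appears in etc/bashrc as a single line beginning with "export WM_MPLIB=" and keeps a .bak copy of the original file. The function name set_wm_mplib is made up for illustration.

```shell
# Rewrite the WM_MPLIB setting in the given bashrc file, keeping a .bak backup.
# Usage: set_wm_mplib <path to etc/bashrc> <value, e.g. SYSTEMOPENMPI>
# (set_wm_mplib is a hypothetical helper, not part of OpenFOAM.)
set_wm_mplib() {
    file=$1
    value=$2
    sed -i.bak "s/^export WM_MPLIB=.*/export WM_MPLIB=$value/" "$file"
}
```

For example, set_wm_mplib "$HOME/OpenFOAM/OpenFOAM-2.2.x/etc/bashrc" SYSTEMOPENMPI, followed by sourcing the file again and recompiling as described in the post. Note that there must be no space after the "=" sign, otherwise the shell exports an empty WM_MPLIB.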

Old   August 1, 2013, 11:35
Default
  #12
Senior Member
 
linnemann's Avatar
 
Niels Nielsen
Join Date: Mar 2009
Location: NJ - Denmark
Posts: 473
A new version of centFoam is now up that fixes the problem and also includes the latest git pull, from 01-08-2013.

Old   August 30, 2013, 10:06
Default
  #13
New Member
 
Rob Gordon
Join Date: Aug 2013
Posts: 8
Quote:
Originally Posted by linnemann (post #6, the Open MPI and Pstream rebuild steps)
This worked a treat. Thanks for the help.

Old   October 9, 2013, 00:01
Default
  #14
Member
 
B T KANNAN
Join Date: Jul 2011
Location: CHENNAI (MADRAS), INDIA
Posts: 54
I have seen my friend using OpenFOAM on a few nodes in a cluster, with one node as master and the other nodes as slaves (Rocks & CentOS).
--
KANNAN

Old   October 17, 2013, 08:57
Default Problem with Centfoam
  #15
New Member
 
Aditya
Join Date: May 2013
Location: Munich Germany
Posts: 28
@linnemann

Hello linnemann

I am facing the same problem with the latest download of centFoam 6.x, which according to SourceForge was updated on 20-08-2013. The problem persists.

I also noticed that the Open MPI version is now 1.7.2 rather than 1.6.3, for which the fix was written.

Regards
Aditya

Old   October 18, 2013, 04:54
Default
  #16
Senior Member
 
linnemann's Avatar
 
Niels Nielsen
Join Date: Mar 2009
Location: NJ - Denmark
Posts: 473
Hi

Just replace 1.6.3 with 1.7.2 in the fix from post #6.

I still don't know why this error happens after it gets unpacked on a different machine.
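Rather than editing the version number by hand, the directory name can be read from the path of the mpirun that is in use. This is a sketch only: openmpi_dir_from_path is a hypothetical helper, and it assumes the ThirdParty layout shown earlier in the thread, where mpirun lives under .../openmpi-<version>/bin/.

```shell
# Extract the "openmpi-X.Y.Z" path component from an mpirun path.
# Prints nothing if the path has no such component (e.g. a system mpirun).
# (openmpi_dir_from_path is a hypothetical helper, not part of OpenFOAM.)
openmpi_dir_from_path() {
    printf '%s\n' "$1" | sed -n 's|.*/\(openmpi-[0-9][0-9.]*\)/.*|\1|p'
}

# Show which Open MPI directory the current mpirun belongs to, if any.
openmpi_dir_from_path "$(command -v mpirun || true)"
```

The printed name, for example openmpi-1.7.2, is the directory to substitute in the rm -rf line of the fix.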

Old   March 11, 2014, 09:48
Default
  #17
New Member
 
D. N. Venturi
Join Date: Jan 2013
Location: Blumenau, SC - Brazil
Posts: 2
I've got the same problem with a manual installation of centFOAM OpenFOAM-2.3.x for CentOS 6.

The solution presented by linnemann didn't work for me.

I'll try the Python installation with centFOAM.py --OF22 and see if it works.

Old   May 9, 2014, 10:23
Default
  #18
New Member
 
Join Date: Sep 2013
Posts: 23
Hi!

I found somewhere the command:
unset LD_PRELOAD

which solved the problem for me.
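LD_PRELOAD tells the dynamic loader to load the listed libraries into every program started from that shell, which may be why clearing it helped here. To see what it contains and to clear it for a single command rather than for the whole session, something like the sketch below could be used. It assumes an env that supports -u, and run_without_preload is a made-up name.

```shell
# Run one command with LD_PRELOAD removed from its environment only;
# the calling shell keeps its own value.
# (run_without_preload is a hypothetical helper.)
run_without_preload() {
    env -u LD_PRELOAD "$@"
}

# Show the current value, then confirm the wrapper hides it from a child process.
echo "LD_PRELOAD is currently: ${LD_PRELOAD:-<empty or unset>}"
run_without_preload sh -c 'echo "inside wrapper: ${LD_PRELOAD-<unset>}"'
```

The parallel run would then be started as run_without_preload mpirun -np 4 interFoam -parallel.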

Old   June 11, 2014, 02:39
Default
  #19
New Member
 
Join Date: May 2014
Posts: 8
Hello,

I'm trying to run OpenFoam 2.3.0 in parallel on CentOS 6.2 and I have the same problem described in the original post.

I have already tried all the solutions proposed above, such as changing the build options to use the system's Open MPI and recompiling; however, the following message keeps coming up:

It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS

Everything works fine in serial simulation.

I'd like to ask if someone who had the same problem found another workaround and managed to run OpenFoam 2.3 in parallel.

Thanks a lot in advance,
Greg

Old   June 28, 2014, 15:28
Default
  #20
Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 9,748
Blog Entries: 39
Greetings Greg and welcome to the forum!

If you could provide more information on how you've proceeded to perform the installation of OpenFOAM 2.3.0 on your machine, it would help to start diagnosing why you're getting that error message.

In addition, have a look at the following thread for more diagnostic steps: unable to run in parallel with OpenFOAM 2.2 on CentOS

Best regards,
Bruno
