Problems running OpenFOAM 2.3 in parallel

Old   April 22, 2014, 11:42
Default Problems running OpenFOAM 2.3 in parallel
  #1
Senior Member
 
Vincent RIVOLA
Join Date: Mar 2009
Location: France
Posts: 283
Dear all,

Since I installed OpenFOAM 2.3 I've not been able to use it in parallel.
I don't know why. It has been working perfectly for years with the previous versions, and this one is giving me headaches on two different machines.

I am using Ubuntu 12.04, and I get the following error as soon as I try to run in parallel (this example is with Allrun in the motorBike tutorial, but it's the same for every solver):

Quote:
--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened. This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded). Note that
Open MPI stopped checking at the first component that it did not find.

Host: carbon
Framework: crs
Component: none
--------------------------------------------------------------------------
[carbon:17798] *** Process received signal ***
[carbon:17798] Signal: Segmentation fault (11)
[carbon:17798] Signal code: Address not mapped (1)
[carbon:17798] Failing at address: 0x28
[carbon:17798] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x2ada7058c4a0]
[carbon:17798] [ 1] /usr/lib/libopen-pal.so.0(mca_base_select+0x108) [0x2ada72d29518]
[carbon:17798] [ 2] /usr/lib/libopen-pal.so.0(opal_crs_base_select+0x7e) [0x2ada72d3b90e]
[carbon:17798] [ 3] /usr/lib/libopen-pal.so.0(opal_cr_init+0x31e) [0x2ada72d1a0ee]
[carbon:17798] [ 4] /usr/lib/libopen-pal.so.0(opal_init+0x159) [0x2ada72d19a59]
[carbon:17798] [ 5] /usr/lib/libopen-rte.so.0(orte_init+0x4d) [0x2ada72ac5a0d]
[carbon:17798] [ 6] /usr/lib/libmpi.so.0(+0x362e1) [0x2ada7221a2e1]
[carbon:17798] [ 7] /usr/lib/libmpi.so.0(MPI_Init+0x16b) [0x2ada7223b3fb]
[carbon:17798] [ 8] /home/vincent/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64GccDPOpt/lib/openmpi-1.6.5/libPstream.so(_ZN4Foam8UPstream4initERiRPPc+0xd) [0x2ada7091b9bd]
[carbon:17798] [ 9] /home/vincent/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64GccDPOpt/lib/libOpenFOAM.so(_ZN4Foam7argListC1ERiRPPcbbb+0xb32) [0x2ada6f458fc2]
[carbon:17798] [10] snappyHexMesh() [0x411d6a]
[carbon:17798] [11] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x2ada7057776d]
[carbon:17798] [12] snappyHexMesh() [0x416c3d]
[carbon:17798] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 17795 on node carbon exited on signal 11 (Segmentation fault).
Does someone have an idea of what's going on?
Regarding the setup, I used the source files and compiled everything. After a few attempts I managed to compile without errors, but I am still not able to run the cases in parallel.

Thanks for your help

Vincent

Old   April 22, 2014, 14:28
Default
  #2
Retired Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,975
Blog Entries: 45
Greetings Vincent,

Which installation instructions did you follow?

Because according to the output you've provided, the problem is that the shell environment is configured to use the custom Open-MPI 1.6.5 that comes with OpenFOAM's ThirdParty package, but it's instead using the "libmpi.so" library present in your system, which is not compatible.
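
One way to confirm which MPI library is actually picked up, sketched here under the assumption that the OpenFOAM environment is sourced so that "$FOAM_LIBBIN" and "$FOAM_MPI" are set: run "ldd" on the "libPstream.so" named in the backtrace and keep only the Open MPI entries. The helper name "mpi_lines" is made up for this example.

```shell
# Keep only the Open MPI libraries from "ldd" output and show where each resolves.
# "mpi_lines" is a hypothetical helper name, not part of OpenFOAM or Open MPI.
mpi_lines() {
    grep -E 'lib(mpi|open-pal|open-rte)\.so' |
        awk '{ print $1, "->", ($3 == "not" ? "not found" : $3) }'
}

# Usage, assuming the OpenFOAM environment is sourced:
#   ldd "$FOAM_LIBBIN/$FOAM_MPI/libPstream.so" | mpi_lines
```

If the entries point to /usr/lib rather than to the ThirdParty-2.3.0 tree, the run-time loader is resolving the system library, which would match the backtrace above.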

Best regards,
Bruno

Old   April 23, 2014, 01:19
Default
  #3
Member
 
Christian Butcher
Join Date: Jul 2013
Location: Japan
Posts: 85
Possibly on the same topic, does OF-2.3.0 have a higher requirement of some kind for the version of OpenMPI?

Currently I have an installation of OF-2.3.0 on the cluster I work with, and for values of $NSLOTS less than or equal to 14, everything works perfectly.
When I try to run with more than 14 processors, I get errors like:

Code:
qrsh_starter: executing child process (null) failed: No such file or directory
--------------------------------------------------------------------------
A daemon (pid 13339) died unexpectedly with status 1 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
The version of OpenMPI on the cluster is 1.6.3. OF is configured with
FOAM_MPI = openmpi-system
and
WM_MPLIB = SYSTEMOPENMPI

With both 14 and 80 processors, the mpirun command is run via a qsub'd script (Sun Grid Engine).

I'm further confused about the number 14. The cluster contains a collection of nodes, each with two 8-core processors, i.e., 16 processing cores per node. Consequently, a limit of 16 would make me think I have problems communicating between nodes (although I have password-less ssh connections), but 14 seems a little peculiar.

Edit: I'm pretty sure this is actually due to memory limits. The amount of memory I requested was slightly higher than the mem/proc available, so only 14 of the 16 cores could be used, since 14 * mem/proc was all of the memory on the node. So I guess this isn't curious at all: when I ask for a 15th processor, the job requires a second node.
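
The arithmetic behind the memory explanation can be sketched as follows. The memory figures are hypothetical (the thread gives no numbers), and "slots_per_node" is a name invented for this example; the scheduler is assumed to place a slot only where both a free core and the requested memory are available.

```shell
# Usable slots on one node = min(cores, floor(node memory / memory requested per slot)).
slots_per_node() {
    # $1: cores per node, $2: node memory in MB, $3: memory requested per slot in MB
    by_mem=$(( $2 / $3 ))          # integer division rounds down
    if [ "$by_mem" -lt "$1" ]; then
        echo "$by_mem"
    else
        echo "$1"
    fi
}

# Hypothetical node: 16 cores and 65536 MB, with 4608 MB requested per slot.
slots_per_node 16 65536 4608    # prints 14
```

With these made-up numbers a 15th slot no longer fits on the first node, so the job would have to spread to a second one.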

It's been a little while since I tried, but I'm pretty sure under OF-2.2.2 I had 32 cores working without issue.

Best,
Christian

Last edited by chrisb2244; April 23, 2014 at 21:19. Reason: Information about why 14 procs is ok and 15 is not.

Old   April 23, 2014, 02:42
Default
  #4
Senior Member
 
Vincent RIVOLA
Join Date: Mar 2009
Location: France
Posts: 283
Quote:
Originally Posted by wyldckat View Post
Greetings Vincent,

Which installation instructions did you follow?

Because according to the output you've provided, the problem is that the shell environment is configured to use the custom Open-MPI 1.6.5 that comes with OpenFOAM's ThirdParty package, but it's instead using the "libmpi.so" library present in your system, which is not compatible.

Best regards,
Bruno
Dear Bruno,

Thanks for your reply.
Actually, I would like to run my system mpirun, which is the one I normally used with the previous versions of OpenFOAM.
But even when explicitly calling the system mpirun (/usr/bin/mpirun -np 6 snappyHexMesh -parallel), I get a similar error:
Code:
--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened.  This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded).  Note that
Open MPI stopped checking at the first component that it did not find.

Host:      carbon
Framework: crs
Component: none
--------------------------------------------------------------------------
[carbon:22893] *** Process received signal ***
[carbon:22893] Signal: Segmentation fault (11)
[carbon:22893] Signal code: Address not mapped (1)
[carbon:22893] Failing at address: 0x28
[carbon:22893] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0) [0x2aff99ca4cb0]
[carbon:22893] [ 1] /usr/lib/libopen-pal.so.0(mca_base_select+0x108) [0x2aff99a41518]
[carbon:22893] [ 2] /usr/lib/libopen-pal.so.0(opal_crs_base_select+0x7e) [0x2aff99a5390e]
[carbon:22893] [ 3] /usr/lib/libopen-pal.so.0(opal_cr_init+0x31e) [0x2aff99a320ee]
[carbon:22893] [ 4] /usr/lib/libopen-pal.so.0(opal_init+0x159) [0x2aff99a31a59]
[carbon:22893] [ 5] /usr/lib/libopen-rte.so.0(orte_init+0x4d) [0x2aff997dea0d]
[carbon:22893] [ 6] /usr/bin/mpirun() [0x402fe5]
[carbon:22893] [ 7] /usr/bin/mpirun() [0x402b34]
[carbon:22893] [ 8] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x2aff99ed376d]
[carbon:22893] [ 9] /usr/bin/mpirun() [0x402a59]
[carbon:22893] *** End of error message ***
Segmentation fault (core dumped)
This is the reason why I tried to add the ThirdParty libs to my path.
Regarding the instructions, I tried to follow the ones I found on openfoam.org:
http://www.openfoam.org/download/source.php

What do you suggest to fix this setup?

Some more information, this is my LD_LIBRARY_PATH:
Code:
echo $LD_LIBRARY_PATH
/home/vincent/OpenFOAM/ThirdParty-2.3.0/platforms/linux64Gcc/CGAL-4.3/lib:/home/vincent/OpenFOAM/ThirdParty-2.3.0/platforms/linux64Gcc/ParaView-4.1.0/lib/paraview-4.1:/home/vincent/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64GccDPOpt/lib/openmpi-1.6.5:/home/vincent/OpenFOAM/ThirdParty-2.3.0/platforms/linux64GccDPOpt/lib/openmpi-1.6.5:/home/vincent/OpenFOAM/ThirdParty-2.3.0/platforms/linux64Gcc/openmpi-1.6.5/lib:/home/vincent/OpenFOAM/ThirdParty-2.3.0/platforms/linux64Gcc/openmpi-1.6.5/lib64:/home/vincent/OpenFOAM/vincent-2.3.0/platforms/linux64GccDPOpt/lib:/home/vincent/OpenFOAM/site/2.3.0/platforms/linux64GccDPOpt/lib:/home/vincent/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64GccDPOpt/lib:/home/vincent/OpenFOAM/ThirdParty-2.3.0/platforms/linux64GccDPOpt/lib:/home/vincent/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64GccDPOpt/lib/dummy
(carbon) ~/OpenFOAM/vincent-2.3.0/run/tutorials/incompressible/simpleFoam/motorBike > ls -latr /home/vincent/OpenFOAM/ThirdParty-2.3.0/platforms/linux64GccDPOpt/lib/openmpi-1.6.5
I don't see why it is looking into /usr/lib
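
A possible way to investigate this, offered as a sketch rather than a diagnosis: the run-time loader only uses an "LD_LIBRARY_PATH" entry if that directory exists and contains the requested library; otherwise it falls back to the system locations such as /usr/lib. The function below (the name "check_mpi_dirs" is invented for this example) walks the search path in order and reports which entries actually hold a libmpi.

```shell
# Report, in search order, whether each library-path entry exists and holds a libmpi.
# "check_mpi_dirs" is a hypothetical helper name.
check_mpi_dirs() {
    # $1: colon-separated search path (defaults to $LD_LIBRARY_PATH)
    printf '%s\n' "${1:-$LD_LIBRARY_PATH}" | tr ':' '\n' | while IFS= read -r dir; do
        [ -n "$dir" ] || continue
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
        elif ls "$dir"/libmpi.so* >/dev/null 2>&1; then
            echo "has libmpi: $dir"
        else
            echo "no libmpi: $dir"
        fi
    done
}

# Usage:
#   check_mpi_dirs "$LD_LIBRARY_PATH"
```

If none of the openmpi-1.6.5 entries report "has libmpi", that would be consistent with the loader ending up in /usr/lib.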

Last edited by wyldckat; April 25, 2014 at 14:56. Reason: merged posts <1h apart and changed QUOTE to CODE

Old   April 25, 2014, 15:04
Default
  #5
Retired Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,975
Blog Entries: 45
Greetings to all!

@Christian: If I read your post correctly, you figured out that the problem was that more memory was needed than was available on the first node. Therefore, mystery solved.


@Vincent: If you followed the instructions from http://www.openfoam.org/download/source.php and did not change the variable "WM_MPLIB" to "SYSTEMOPENMPI" in the file "$HOME/OpenFOAM/OpenFOAM-2.3.0/etc/bashrc", then you have a conflict of settings: you've built OpenFOAM with the custom Open-MPI and are now trying to use the system's Open-MPI, which is likely incompatible. To find out which mpirun is being used, run:
Code:
which mpirun
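
As a rough sketch of how to read that output together with "WM_MPLIB", the function below classifies the combination. It assumes that the system "mpirun" lives under /usr and that the custom build lives in a path containing "ThirdParty-", as in the paths shown in this thread; "OPENMPI" is the "WM_MPLIB" value that selects the ThirdParty Open-MPI. The function name is invented for this example.

```shell
# Classify a WM_MPLIB setting against the mpirun found first in PATH.
# "check_mpirun" is a hypothetical helper; the path patterns are assumptions.
check_mpirun() {
    # $1: value of WM_MPLIB, $2: output of "which mpirun"
    case "$1:$2" in
        SYSTEMOPENMPI:/usr/*)      echo "consistent: system Open-MPI" ;;
        OPENMPI:*/ThirdParty-*)    echo "consistent: ThirdParty Open-MPI" ;;
        SYSTEMOPENMPI:*|OPENMPI:*) echo "mismatch: WM_MPLIB=$1 but mpirun is $2" ;;
        *)                         echo "not covered by this sketch: WM_MPLIB=$1" ;;
    esac
}

# Usage, assuming the OpenFOAM environment is sourced:
#   check_mpirun "$WM_MPLIB" "$(which mpirun)"
```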
Now, if you did properly modify OpenFOAM's "bashrc" file, then it might be something else. How have you set up the OpenFOAM shell environment to be activated? Namely, did you add this line to the end of your personal "~/.bashrc" file?
Code:
source $HOME/OpenFOAM/OpenFOAM-2.3.0/etc/bashrc
Best regards,
Bruno

Old   June 16, 2019, 10:32
Default
  #6
New Member
 
Elias Trautner
Join Date: Jun 2019
Posts: 4
Hello wyldckat, I have a similar issue, thread:

https://www.cfd-online.com/Forums/op...imulation.html
It would be very nice if you could check it out and see whether you can help me to get rid of the bug. Thanks in advance!

Old   December 2, 2019, 16:49
Default Problems running OpenFOAM 2.3 in parallel
  #7
Member
 
Join Date: Mar 2019
Posts: 81
I am trying to run OpenFOAM while sharing the resources between two computers. I included the hostfile but am getting the following error:

Code:
[vm2:26669] *** Process received signal ***
[vm2:26669] Signal: Segmentation fault (11)
[vm2:26669] Signal code: Address not mapped (1)
[vm2:26669] Failing at address: 0x5634a8006d6e
[vm2:26669] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f51f2147890]
[vm2:26669] [ 1] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x3d)[0x7f51f1ddb98d]
[vm2:26669] [ 2] /usr/lib/x86_64-linux-gnu/libopen-pal.so.20(opal_argv_free+0x29)[0x7f51f23a2519]
[vm2:26669] [ 3] /usr/lib/x86_64-linux-gnu/libopen-rte.so.20(+0x283cb)[0x7f51f262e3cb]
[vm2:26669] [ 4] /usr/lib/x86_64-linux-gnu/libopen-rte.so.20(orte_util_add_hostfile_nodes+0xc1)[0x7f51f262f3f1]
[vm2:26669] [ 5] /usr/lib/x86_64-linux-gnu/libopen-rte.so.20(orte_ras_base_allocate+0xd3d)[0x7f51f26607fd]
[vm2:26669] [ 6] /usr/lib/x86_64-linux-gnu/libopen-pal.so.20(opal_libevent2022_event_base_loop+0xdc9)[0x7f51f23ba209]
[vm2:26669] [ 7] mpirun(+0x74a3)[0x5634a6d7e4a3]
[vm2:26669] [ 8] mpirun(+0x5aea)[0x5634a6d7caea]
[vm2:26669] [ 9] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f51f1d65b97]
[vm2:26669] [10] mpirun(+0x59ea)[0x5634a6d7c9ea]
[vm2:26669] *** End of error message ***
How can this be resolved?

PS: I am using OpenFOAM v1812:
Code:
$ echo $WM_MPLIB
SYSTEMOPENMPI

$ echo $FOAM_MPI
openmpi-system

Last edited by mm66; December 3, 2019 at 16:03.

Old   December 3, 2019, 11:03
  #8
Member
 
Join Date: Mar 2019
Posts: 81
I figured out what was wrong. In the host file I was using this format:

Code:
user@ip cpu=N
whereas it should have been:

Code:
ip cpu=N
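
A small check along these lines, as a sketch: the function below (its name is invented for this example) lists hostfile entries whose host field still contains "user@", the form reported above as triggering the crash with this Open MPI installation. It only flags such lines; it does not validate the rest of the hostfile syntax.

```shell
# Print hostfile entries whose first field has the form user@host.
# "find_user_at_host" is a hypothetical helper name.
find_user_at_host() {
    # $1: path to the hostfile; comment lines starting with '#' are skipped
    awk '$1 !~ /^#/ && $1 ~ /@/ { print "line " NR ": " $0 }' "$1"
}

# Usage (the file name "hostfile" is an assumption):
#   find_user_at_host hostfile
```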

Last edited by mm66; December 3, 2019 at 16:03.

All times are GMT -4.