[OpenFOAM.org] Non-root installation of OpenFOAM 2.4.x, parallel issue

Old   February 28, 2019, 16:42
Default Non-root installation of OpenFOAM 2.4.x, parallel issue
  #1
Senior Member
 
Syavash Asgari
Join Date: Apr 2010
Posts: 473
Rep Power: 18
syavash is on a distinguished road
Hi Foamers,

I have been trying to install OF 2.4.x on a CentOS 7 cluster. It was fine in serial but when I try to submit a job in parallel I get the following error:

Code:
Could not retrieve MPI tag from  /home/.../.../.../OpenFOAM/OpenFOAM-2.4.x/platforms/linux64GccDPOpt/bin/pimpleFoam
Note that I intend to use the system gcc as the compiler. I use the following script to submit the job:

Code:
#!/bin/bash

#SBATCH -N 4
#SBATCH -t 00:10:00
#SBATCH -J ...
#SBATCH -A ...


export FOAM_INST_DIR=/home/.../.../.../OpenFOAM

. $FOAM_INST_DIR/OpenFOAM-2.4.x/etc/bashrc

#--nranks is used when less than the number of available cores is used

mpprun  --nranks=128 `which pimpleFoam` -parallel >& results
To get access to OpenMPI, I load a gcc module, which loads OpenMPI as a result:

Code:
The buildenv-gcc module makes available:
 - Compilers: gcc, gfortran, etc.
 - MPI library with mpi-wrapped compilers: OpenMPI with mpicc, mpifort, etc.
 - Numerical libraries: OpenBLAS, LAPACK, ScaLAPACK, FFTW
My guess is that OpenFOAM cannot find the system MPI on the cluster, so it complains accordingly. I hope someone has had a similar experience with this issue and can help me out.

Regards,
Syavash

Old   February 28, 2019, 17:20
Default
  #2
Retired Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,975
Blog Entries: 45
Rep Power: 128
Quick answer: Since the job script sources OpenFOAM's "etc/bashrc" file, it also needs to load the Open-MPI module, and it must do so before sourcing/activating OpenFOAM's shell environment.
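A minimal sketch of that ordering, written as a helper for the job script. The function name load_env is hypothetical and not part of OpenFOAM or of the cluster tools; the module name and bashrc path are passed in as arguments. Whether loading the module is sufficient depends on how the cluster launches MPI jobs.

```shell
#!/bin/bash
# Sketch only: load the MPI-providing module first, then source OpenFOAM.
# load_env is a hypothetical helper, not part of OpenFOAM or the cluster tools.

load_env() {
    local module_name="$1"
    local foam_bashrc="$2"

    # OpenFOAM's etc/bashrc evaluates the positional parameters it sees
    # (e.g. WM_MPLIB=SYSTEMOPENMPI), so clear this function's arguments
    # before sourcing it.
    set --

    # 'module' is normally a shell function set up by the cluster.
    if type module >/dev/null 2>&1; then
        module load "$module_name" || return 1
    else
        echo "warning: 'module' is not available in this shell" >&2
    fi

    # Source OpenFOAM's environment only after the module step.
    if [ -r "$foam_bashrc" ]; then
        . "$foam_bashrc"
    else
        echo "error: cannot read $foam_bashrc" >&2
        return 1
    fi
}

# Example call in the job script, placed before the mpprun line:
# load_env buildenv-gcc/2018a-eb "$FOAM_INST_DIR/OpenFOAM-2.4.x/etc/bashrc"
```

The arguments are cleared because a file sourced from inside a function sees that function's positional parameters.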

Old   March 1, 2019, 05:36
Default
  #3
Senior Member
 
Syavash Asgari
Join Date: Apr 2010
Posts: 473
Rep Power: 18
syavash is on a distinguished road
Dear Bruno,

Thanks for your reply. I tried to include the following line in the job script:

Code:
export FOAM_MPI=/software/sse/easybuild/prefix/modules/all/Compiler/GCC/6.4.0-2.28/OpenMPI
The above line was added prior to
Code:
. $FOAM_INST_DIR/OpenFOAM-2.4.x/etc/bashrc
However, I still get the same error after submitting the job. Am I loading Open-MPI correctly? Could you help me figure it out?

Regards,
Syavash

Old   March 1, 2019, 11:57
Default
  #4
Member
 
Fatih Ertinaz
Join Date: Feb 2011
Location: Istanbul
Posts: 64
Rep Power: 15
fertinaz is on a distinguished road
Hello Ehsan


How exactly did you install OF, using openmpi module in the cluster or the openmpi shipped with OF?


If you want to stick to the cluster's OMPI (which is usually the right way), then make sure that you made the required modifications in OF's bashrc file. To check whether everything was built correctly, you can also look at the contents of $FOAM_LIBBIN. I'd also suggest the ldd tool to check whether the dependencies can be resolved.
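As a rough sketch of the ldd check: the helper below filters ldd output for libraries the dynamic loader reports as "not found". The name unresolved_libs is hypothetical and not an OpenFOAM tool.

```shell
#!/bin/bash
# Sketch only: list shared libraries that ldd cannot resolve.
# unresolved_libs is a hypothetical helper, not an OpenFOAM tool.

# Reads ldd output on stdin and prints the name of every library
# that is reported as "not found".
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example, run after loading the modules and sourcing OpenFOAM's etc/bashrc:
# ldd "$(which pimpleFoam)" | unresolved_libs
```

No output means every dependency of the solver could be resolved in the current environment. It does not prove that the MPI library found is the one the build used.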

Old   March 1, 2019, 14:44
Default
  #5
Senior Member
 
Syavash Asgari
Join Date: Apr 2010
Posts: 473
Rep Power: 18
syavash is on a distinguished road
Dear Fatih,

Thanks for your attention. Honestly, I am not so sure; however, I intend to use the system default gcc/open-mpi modules. I followed the steps described at the following link:

HTML Code:
https://openfoamwiki.net/index.php/Installation/Linux/OpenFOAM-2.4.x/CentOS_SL_RHEL#CentOS_7.1
I skipped the first three steps. I also replaced the command in step 8 with the one below:

Code:
module load buildenv-gcc/2018a-eb
The above command outputs:

Code:
You have loaded an gcc buildenv module
***************************************************
The buildenv-gcc module makes available:
 - Compilers: gcc, gfortran, etc.
 - MPI library with mpi-wrapped compilers: OpenMPI with mpicc, mpifort, etc.
 - Numerical libraries: OpenBLAS, LAPACK, ScaLAPACK, FFTW
Please let me know if you need further information to diagnose the problem. I am confused now!

Regards,
Syavash

Edit: when I enter the command below
Code:
cd $FOAM_LIBBIN
I am redirected to the following address:

Code:
/home/.../.../.../OpenFOAM/OpenFOAM-2.4.x/platforms/linux64GccDPOpt/lib

Old   March 1, 2019, 15:31
Default
  #6
Member
 
Fatih Ertinaz
Join Date: Feb 2011
Location: Istanbul
Posts: 64
Rep Power: 15
fertinaz is on a distinguished road
All right, I think the problem is that the module you loaded doesn't actually load the openmpi environment. It looks like a compiler module that gives access to the actual compilers (not just runtime libs) and makes it possible to load the other modules built with that specific gcc version.

To check whether you have ompi in your environment, you can run something like "which mpirun". Since I don't know how your cluster is configured, you might want to check the mpi compilers as well with "which mpicc". If those binaries are not found, you need to load the specific openmpi module and restart the compilation.

If you do have them in your environment, and you're sure that the build was completed using the correct openmpi modules, then your job script could be wrong as well.

So in general:
== Before starting the installation, load the correct gcc and openmpi modules. The rest is optional (boost, metis etc.). After loading them, make sure they are loaded properly. You can use commands like which, module show etc.
== Check that the OF path is correctly defined in the etc/bashrc file (e.g. $HOME/OpenFOAM)
== Make sure "export WM_MPLIB=SYSTEMOPENMPI" is defined in OF/etc/bashrc when you use the ompi module
== After the installation, check that the mpi directories (e.g. openmpi-system) exist under $FOAM_LIBBIN. Just being able to cd there doesn't mean much.
== To run solvers, meshers etc., load the same modules that were required for the build
== Don't forget to source OF/etc/bashrc in your scripts
== Don't rely on an alias for it, since multi-node jobs need to source it by default
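The checks in this list can be scripted roughly as follows. This is a sketch under assumptions: check_cmd, check_dir and preflight are hypothetical helper names, and the directory name openmpi-system is the example given above.

```shell
#!/bin/bash
# Sketch only: pre-flight check for the points listed above.
# check_cmd, check_dir and preflight are hypothetical helpers.

# Report where a command resolves to, or flag it as missing.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1 -> $(command -v "$1")"
    else
        echo "MISSING: $1"
        return 1
    fi
}

# Report whether a directory exists.
check_dir() {
    if [ -d "$1" ]; then
        echo "ok: $1"
    else
        echo "MISSING: $1"
        return 1
    fi
}

# Run all checks; returns non-zero if any of them fails.
preflight() {
    local status=0
    check_cmd mpirun || status=1
    check_cmd mpicc  || status=1
    if [ "$WM_MPLIB" = "SYSTEMOPENMPI" ]; then
        echo "ok: WM_MPLIB=$WM_MPLIB"
    else
        echo "CHECK: WM_MPLIB=${WM_MPLIB:-<unset>}"
        status=1
    fi
    check_dir "${FOAM_LIBBIN:-<unset>}/openmpi-system" || status=1
    return $status
}

# Run after loading the modules and sourcing OpenFOAM's etc/bashrc:
# preflight
```

Running the same check both on the login node and inside the job script shows whether the two environments differ.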

Good luck

// Fatih

Old   March 2, 2019, 15:43
Default
  #7
Senior Member
 
Syavash Asgari
Join Date: Apr 2010
Posts: 473
Rep Power: 18
syavash is on a distinguished road
Dear Fatih,

Thanks for your elaboration. I ran "which mpirun" and "which mpicc" and got the following output:

Code:
$ which mpirun
/software/sse/easybuild/prefix/software/OpenMPI/1.10.3-GCC-5.4.0-2.26/bin/mpirun
Code:
$ which mpicc
/software/sse/manual/mpprun/4.0/nsc-wrappers/mpicc
WM_MPLIB=SYSTEMOPENMPI is set correctly in etc/bashrc.

Also, inside $FOAM_LIBBIN I have an "openmpi-system" directory, which contains the following files:

Code:
$ ls
libPstream.so  libptscotchDecomp.so
I am not sure what else might be important. Please let me know if you (Bruno as well) have any other ideas on how to resolve this issue.

Regards,
Syavash

Old   March 2, 2019, 16:28
Default
  #8
Retired Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,975
Blog Entries: 45
Rep Power: 128
Quick answer @syavash: Why can't you add this line:
Code:
module load buildenv-gcc/2018a-eb
before this one in the job script:
Code:
. $FOAM_INST_DIR/OpenFOAM-2.4.x/etc/bashrc
Does the case not launch with that?

The other hypothesis is to check if this module will in fact load other modules. If you start a new terminal and run:
Code:
module load buildenv-gcc/2018a-eb
module list
what does it give you?
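If it helps, this check can also be done mechanically. A small sketch: lists_module is a hypothetical helper that reads the text printed by "module list" and matches on the module name before the slash, so the version is ignored.

```shell
#!/bin/bash
# Sketch only: check whether a given module shows up in 'module list'.
# lists_module is a hypothetical helper, not part of any module system.

# Reads 'module list' output on stdin; succeeds if a module with the
# given name (the part before the slash) is listed.
lists_module() {
    grep -Eq "(^|[[:space:]])$1/"
}

# 'module list' commonly prints to stderr, hence the 2>&1:
# module load buildenv-gcc/2018a-eb
# module list 2>&1 | lists_module OpenMPI && echo "OpenMPI is loaded"
```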

Old   March 3, 2019, 03:12
Default
  #9
Senior Member
 
Syavash Asgari
Join Date: Apr 2010
Posts: 473
Rep Power: 18
syavash is on a distinguished road
Dear Bruno,

Thanks for your reply.

Quote:
Why can't you add this line:
Code:
module load buildenv-gcc/2018a-eb
before this one in the job script:
Code:
. $FOAM_INST_DIR/OpenFOAM-2.4.x/etc/bashrc
Does the case not launch with that?
I had defined the following alias for OF, which I ran before submitting the job:

Code:
alias OF24x='export FOAM_INST_DIR=/home/.../.../.../OpenFOAM; module load buildenv-gcc/2018b-eb; source /home/.../.../.../OpenFOAM/OpenFOAM-2.4.x/etc/bashrc WM_NCOMPPROCS=4 WM_MPLIB=SYSTEMOPENMPI'
That is why I left the buildenv-gcc/2018b-eb module out of the job script. Nevertheless, adding the line
Code:
module load buildenv-gcc/2018b-eb
before the line
Code:
. $FOAM_INST_DIR/OpenFOAM-2.4.x/etc/bashrc
did not change anything; the error is still there.

Quote:
The other hypothesis is to check if this module will in fact load other modules. If you start a new terminal and run:
Code:
module load buildenv-gcc/2018a-eb
module list
what does it give you?
Here is the output for
Code:
module load buildenv-gcc/2018a-eb
:

Code:
You have loaded an gcc buildenv module
***************************************************
The buildenv-gcc module makes available:
 - Compilers: gcc, gfortran, etc.
 - MPI library with mpi-wrapped compilers: OpenMPI with mpicc, mpifort, etc.
 - Numerical libraries: OpenBLAS, LAPACK, ScaLAPACK, FFTW

It also makes a set of dependency library modules available via
the regular module command. Just do:
  module avail
to see what is available.
and the output for
Code:
module list
:

Code:
Currently Loaded Modules:
  1) mpprun/4.0
  2) nsc/.1.0                             (H,S)
  3) EasyBuild/3.5.3-nsc17d8ce4
  4) nsc-eb-scripts/1.0
  5) buildtool-easybuild/3.5.3-nsc17d8ce4
  6) GCCcore/6.4.0
  7) binutils/.2.28                       (H)
  8) GCC/6.4.0-2.28
  9) numactl/.2.0.11                      (H)
 10) hwloc/.1.11.8                        (H)
 11) OpenMPI/.2.1.2                       (H)
 12) OpenBLAS/.0.2.20                     (H)
 13) FFTW/.3.3.7                          (H)
 14) ScaLAPACK/.2.0.2-OpenBLAS-0.2.20     (H)
 15) foss/2018a
 16) buildenv-gcc/2018a-eb

  Where:
   S:  Module is Sticky, requires --force to unload or purge
   H:             Hidden Module
Any idea?

Regards,
Syavash

Old   March 3, 2019, 09:14
Default
  #10
Retired Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,975
Blog Entries: 45
Rep Power: 128
Quick answer: If using "module load" doesn't work within a job, then it's because it's not able to properly load it... or because mpprun needs to be used properly.

If the cluster/supercomputer you are using has a support page, you should consult it and tell us about it if you cannot understand it!
I did a quick search online for mpprun and found this page: https://www.nsc.liu.se/support/tutorials/mpprun/

Apparently the error message:
Code:
Could not retrieve MPI tag from .../pimpleFoam
is because the application itself does not have the tag needed for mpprun to assess which modules to load.

So if you run in the login node this command:
Code:
dumptag $(which pimpleFoam)
you should get a similar message stating that the tag was not found.

Apparently you must use the "-Nmpi" additional compilation argument, as documented here: https://www.nsc.liu.se/software/buildenv/ and here: https://www.nsc.liu.se/software/mpi-libraries/

Therefore, if my estimates are correct:
  1. Edit the file "OpenFOAM-2.4.x/wmake/rules/linux64Gcc/c++Opt".
  2. Add the entry "-Nmpi" to "c++OPT", e.g.:
    Code:
    c++OPT      = -O3 -Nmpi
  3. And then you have to rebuild OpenFOAM 2.4.x entirely, because this extra build option has to be added to all object files during compiling...
  4. Then again, you could try rebuilding only pimpleFoam and hope it's enough:
    Code:
    wclean $FOAM_SOLVERS/incompressible/pimpleFoam
    wmake $FOAM_SOLVERS/incompressible/pimpleFoam
  5. Then check if it's tagged properly:
    Code:
    dumptag $(which pimpleFoam)
    which should tell you several details, including a line similar to this one:
    Code:
    Built with MPI:         openmpi 1_6_2_build1
Beyond this, using "module load" should not be necessary in the job script.
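Steps 1, 2 and 5 above can be sketched as two small helpers. The names add_nmpi_flag and has_mpi_tag are hypothetical. The tag check relies only on the "Built with MPI:" line quoted above, not on dumptag's exit code, which is not documented in this thread.

```shell
#!/bin/bash
# Sketch only: add -Nmpi to the wmake rules file and check the MPI tag.
# add_nmpi_flag and has_mpi_tag are hypothetical helpers; -Nmpi is the
# NSC compiler-wrapper flag discussed above.

# Append -Nmpi to the c++OPT line of a wmake rules file, at most once.
# A backup copy is kept next to the file as <file>.bak.
add_nmpi_flag() {
    local rules_file="$1"
    if grep -Eq '^c\+\+OPT[[:space:]]*=.*-Nmpi' "$rules_file"; then
        return 0    # flag already present, nothing to do
    fi
    cp "$rules_file" "$rules_file.bak" &&
    sed 's/^\(c++OPT[[:space:]]*=.*\)$/\1 -Nmpi/' "$rules_file.bak" > "$rules_file"
}

# Reads dumptag output on stdin; succeeds if it reports an MPI build tag.
has_mpi_tag() {
    grep -q '^[[:space:]]*Built with MPI:'
}

# Example, with OpenFOAM's environment already sourced:
# add_nmpi_flag "$WM_PROJECT_DIR/wmake/rules/linux64Gcc/c++Opt"
# ... rebuild OpenFOAM (or at least the solver), then:
# dumptag "$(which pimpleFoam)" 2>&1 | has_mpi_tag && echo "tagged"
```

The edit is idempotent, so running it again before a later rebuild does not add the flag twice.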

Old   March 3, 2019, 16:24
Default
  #11
Senior Member
 
Syavash Asgari
Join Date: Apr 2010
Posts: 473
Rep Power: 18
syavash is on a distinguished road
Quote:
If the cluster/supercomputer you are using has a support page, you should consult it and tell us about it if you cannot understand it!
Dear Bruno, I see your point; however, I must admit that I never came across the support page! I only searched for the error message on Google and found little help. I know it might look odd, but I was confused and missed the support web page. I am sorry for that.

I should also thank you again, as your thorough explanation made the job run in parallel. As you instructed, I did a fresh installation, this time including the -Nmpi flag.

Regards,
Syavash

Old   March 3, 2019, 18:21
Default
  #12
Retired Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,975
Blog Entries: 45
Rep Power: 128
Quick answer: I'm very glad that it worked!

Mmm... it is possible that Google was biased towards me, given my search profile, even though I wasn't logged in...

Then again, when I searched for "Could not retrieve MPI tag from" without the quotes, it didn't give me anything, but with the quotes it gave me just one answer: http://www.tfd.chalmers.se/~hani/wik.../_Installation - namely one of Håkan Nilsson's wiki pages. I didn't understand the context very well either, since his solution was simply to use Intel MPI... I had to research a bit more, happened to look for "mpprun", and started to read through things more carefully.

But I was also expecting that the cluster/supercomputer you are using would have an instructions page... and I'm glad that NSC has a fairly complete one that managed to help us out!

Old   March 4, 2019, 07:15
Default
  #13
Senior Member
 
Syavash Asgari
Join Date: Apr 2010
Posts: 473
Rep Power: 18
syavash is on a distinguished road
Dear Bruno,

You are right about Håkan's page: I had installed OF 2.3.1 based on his guide and had used Intel as the compiler. However, the Intel compiler gave me some issues when I wanted to use mapFields or when trying to extract lines of data with sampleDict. I had previously installed OF 2.3.1 on another cluster with Gcc (though an older version) and did not have those issues, so I decided to install OF 2.4.x with Gcc this time, hoping that mapFields would no longer complain.
I also examined the NSC page. As you indicated, they provide a fairly helpful support page, so next time I will check it before bringing up an issue here!

Regards,
Syavash

Last edited by syavash; March 4, 2019 at 08:42.