CFD Online (www.cfd-online.com)
Forums > OpenFOAM Running, Solving & CFD

MPI error

#1 | September 3, 2007, 17:15
Srinath Madhavan (a.k.a pUl|), Senior Member
Join Date: Mar 2009; Location: Edmonton, AB, Canada; Posts: 698
Has anyone seen this[1] kind of error message before? I've linked OpenFOAM 1.4.1 with MPICH-compatibility libraries provided by the HP-MPI suite. I first tried linking directly with hpmpi.so and Pstream compiled fine. However, after that whenever I tried compiling any solver, I got the very same error messages that Frank Bos reported a while ago[2]. I believe those errors are due to C++ bindings being enabled when building Pstream. However, I don't know how to disable them when using HP-MPI. As a result I had to switch to MPICH-compatibility libraries provided by HP-MPI which allow me to build both Pstream and my solver without problems. I need to use HP-MPI as the cluster is configured for Voltaire Infiniband switched-fabric interconnect with Hewlett Packard's XC software stack.

Now, in MPICH-compatibility mode, ldd `which icoFoam_1` gives:

[madhavan@matrix ~]$ ldd `which icoFoam_1`
libfiniteVolume.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libfiniteVolume.so (0x0000002a95557000)
libOpenFOAM.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so (0x0000002a96158000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003b8f600000)
libstdc++.so.6 => /home/users/madhavan/OpenFOAM/linux64/gcc-4.2.1/lib64/libstdc++.so.6 (0x0000002a96604000)
libm.so.6 => /lib64/tls/libm.so.6 (0x0000003b8f100000)
libgcc_s.so.1 => /home/users/madhavan/OpenFOAM/linux64/gcc-4.2.1/lib64/libgcc_s.so.1 (0x0000002a96829000)
libc.so.6 => /lib64/tls/libc.so.6 (0x0000003b8f300000)
libPstream.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/hpmpi/libPstream.so (0x0000002a96937000)
libtriSurface.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libtriSurface.so (0x0000002a96a3f000)
libmeshTools.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libmeshTools.so (0x0000002a96bbf000)
libz.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libz.so (0x0000002a96e2d000)
/lib64/ld-linux-x86-64.so.2 (0x0000003b8ef00000)
libmpich.so => /opt/hpmpi/MPICH1.2/lib/linux_amd64/libmpich.so (0x0000002a96f42000)
librt.so.1 => /lib64/tls/librt.so.1 (0x0000003b94000000)
liblagrangian.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/liblagrangian.so (0x0000002a9706e000)
libhpmpi.so => /opt/hpmpi/MPICH1.2/lib/linux_amd64/libhpmpi.so (0x0000002a97170000)
libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x0000003b8fa00000)
[madhavan@matrix ~]$

Interestingly, a case of only around 1-2 million cells runs perfectly, but a 6 or 9 million cell case does not. This suggests a 32-bit vs. 64-bit build issue, yet the ldd output shown above seems to indicate the contrary.
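As a quick sanity check on the 32/64-bit question, one can inspect the ELF class of the solver and the MPI library directly. This is only a sketch; the paths are taken from the ldd output above and should be adjusted to your install:

```shell
# Confirm the solver and the MPI library it links against are both
# 64-bit ELF objects (paths follow the ldd output in this post).
file `which icoFoam_1`
file /opt/hpmpi/MPICH1.2/lib/linux_amd64/libmpich.so
# Both should report "ELF 64-bit LSB shared object, x86-64" on this cluster.
```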

I would appreciate it if someone could shed some light on this issue. Thanks!


[1] Error message from MPI:

[12]
[12]
[12] --> FOAM FATAL IO ERROR : IOstream::check(const char* operation) : error in IOstream "IOstream" for operation operator>>
(Istream&, List<t>&) : reading first token
[12]
[12] file: IOstream at line 0.
[12]
[12] From function IOstream::fatalCheck(const char* operation) const
[12] in file db/IOstreams/IOstreams/IOcheck.C at line 73.
[12]
FOAM parallel run exiting
[12]
[11]
[11]
[11] --> FOAM FATAL IO ERROR : IOstream::check(const char* operation) : error in IOstream "IOstream" for operation operator>>
(Istream&, List<t>&) : reading first token
[11]
[11] file: IOstream at line 0.
[11]
[11] From function IOstream::fatalCheck(const char* operation) const
[11] in file db/IOstreams/IOstreams/IOcheck.C at line 73.
[11]
FOAM parallel run exiting
[11]
[10]
[10]
[10] --> FOAM FATAL IO ERROR : IOstream::check(const char* operation) : error in IOstream "IOstream" for operation operator>>
(Istream&, List<t>&) : reading first token
[10]
[10] file: IOstream at line 0.
[10]
[10] From function IOstream::fatalCheck(const char* operation) const
[10] in file db/IOstreams/IOstreams/IOcheck.C at line 73.
[10]
FOAM parallel run exiting
[10]
MPI Application rank 12 exited before MPI_Finalize() with status 1
MPI Application rank 11 exited before MPI_Finalize() with status 1


[2] http://www.cfd-online.com/OpenFOAM_D...es/1/2968.html

#2 | September 4, 2007, 13:51
Mattijs Janssens, Super Moderator
Join Date: Mar 2009; Posts: 1,416
For LAM and Open MPI I just had a look through mpi.h to see how to avoid including the C++ bindings. Maybe there is a similar switch in HP-MPI.

Alternatively, additionally link in the MPI library that provides the C++ functions.
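Hunting for such a switch can be sketched as a grep over the header. The HP-MPI header path follows this thread; for comparison, MPICH and Open MPI provide MPICH_SKIP_MPICXX and OMPI_SKIP_MPICXX respectively to suppress the C++ bindings:

```shell
# Search the MPI header for preprocessor switches controlling the C++
# bindings (macro names vary per implementation; these are examples).
grep -n "MPICXX\|CXXBINDING" /opt/hpmpi/include/mpi.h
```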

#3 | September 4, 2007, 14:35
Srinath Madhavan (a.k.a pUl|), Senior Member
Join Date: Mar 2009; Location: Edmonton, AB, Canada; Posts: 698
Hi Mattijs,

I looked into mpi.h and found no such convenient switch. However, I located the additional library that provides the C++ functions (-lmpiCC). When I add this to my mplibHPMPI rule, the build proceeds fine except for this error message near the end:

/usr/bin/ld: /opt/hpmpi/lib/linux_amd64/libmpiCC.a(intercepts.o): relocation R_X86_64_32S against `MPI::Comm::key_ref_map' can not be used when making a shared object; recompile with -fPIC
/opt/hpmpi/lib/linux_amd64/libmpiCC.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [/home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/hpmpi/libPstream.so] Error 1

How should I proceed? Thanks for your help!

#4 | September 4, 2007, 15:31
Srinath Madhavan (a.k.a pUl|), Senior Member
Join Date: Mar 2009; Location: Edmonton, AB, Canada; Posts: 698
Mattijs, thanks very much for the inspiration, which made me experiment a lot with HP-MPI. I think I have finally solved the problem. Here is the detailed solution, in the hope that it proves useful to others in a similar predicament.

First, poke into mpi.h as Mattijs suggested and find out whether there is an easy switch that can be passed via PFLAGS in ~/OpenFOAM/OpenFOAM-1.4.1/wmake/rules/linux64Gcc/mplibHPMPI, which, by the way, is the file you create when you build Pstream against an MPI implementation already installed on the cluster.

Note: I also added the following lines to ~/OpenFOAM/OpenFOAM-1.4.1/.bashrc:

elif [ .$WM_MPLIB = .HPMPI ]; then

    export HPMPI_ARCH_PATH=/opt/hpmpi

    AddLib $HPMPI_ARCH_PATH/lib/linux_amd64/
    AddPath $HPMPI_ARCH_PATH/bin

    export FOAM_MPI_LIBBIN=$FOAM_LIBBIN/hpmpi

and set export WM_MPLIB=HPMPI in ~/OpenFOAM/OpenFOAM-1.4.1/.OpenFOAM-1.4.1/bashrc.


Finally, update the Allwmake script in ~/OpenFOAM/OpenFOAM-1.4.1/src/Pstream so that its MPI test also matches "$WM_MPLIB" = "HPMPI".
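The Allwmake change amounts to extending the existing MPI test. This is a sketch only, since the exact original condition in 1.4.1 may differ slightly:

```shell
# In ~/OpenFOAM/OpenFOAM-1.4.1/src/Pstream/Allwmake, extend the test so
# the mpi variant of Pstream is also built when WM_MPLIB is HPMPI
# (the surrounding lines here are illustrative, not verbatim).
if [ "$WM_MPLIB" = "MPI" -o "$WM_MPLIB" = "MPICH" -o "$WM_MPLIB" = "HPMPI" ]
then
    wmake libso mpi
fi
```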

Currently for HP-MPI the mplibHPMPI file reads:

PFLAGS = -DHPMP_BUILD_CXXBINDING
PINC = -I/opt/hpmpi/include
PLIBS = -L/opt/hpmpi/lib/linux_amd64 -lhpmpio -lhpmpi -ldl -lmpiCC

As one can see, I added the -DHPMP_BUILD_CXXBINDING switch to PFLAGS, as I found that this enables C++ bindings support within HP-MPI. In addition, I added -lmpiCC to link against the library with the C++ MPI bindings.

When I tried to build Pstream, it failed with the relocation error mentioned above in this thread. This is caused by mixing static, non-PIC libraries into a shared build. The solution is to find a libmpiCC.so in the HP-MPI installation; I could not find one. So I googled and came up with an alternative proposed by HP[1], which let me rebuild libmpiCC.a using my current g++ (supplied with OpenFOAM). However, the library was still static, so I googled again for how to create shared libraries and found this link[2]. Now all I had to do was follow the recipe:

g++ -fPIC -c intercepts.cc -I/opt/hpmpi/include -DHPMP_BUILD_CXXBINDING
g++ -fPIC -c mpicxx.cc -I/opt/hpmpi/include -DHPMP_BUILD_CXXBINDING
g++ -shared -Wl,-soname,libmpiCC.so -o libmpiCC.so.1.0.1 intercepts.o mpicxx.o -lc

And finally symlink the libmpiCC.so.1.0.1 to ~/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/hpmpi/libmpiCC.so
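The install-and-symlink step can be sketched as follows; the directory and version number follow this post, so adjust them to your own tree:

```shell
# Put the freshly built shared C++ bindings library where wmake/ldd will
# find it, and point the unversioned name at it (paths follow this post).
LIBDIR=$HOME/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/hpmpi
cp libmpiCC.so.1.0.1 $LIBDIR/
ln -sf libmpiCC.so.1.0.1 $LIBDIR/libmpiCC.so
```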

Now my mplibHPMPI file reads:

PFLAGS = -DHPMP_BUILD_CXXBINDING
PINC = -I/opt/hpmpi/include
PLIBS = -L/home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/src/mpiCCsrc -L/opt/hpmpi/lib/linux_amd64 -lhpmpio -lhpmpi -ldl -lmpiCC

And after rebuilding libPstream.so followed by icoFoam_1 (my customized icoFoam solver), ldd `which icoFoam_1` gives:

[madhavan@matrix icoFoam]$ ldd `which icoFoam_1`
libfiniteVolume.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libfiniteVolume.so (0x0000002a95557000)
libOpenFOAM.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so (0x0000002a96158000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003b8f600000)
libstdc++.so.6 => /home/users/madhavan/OpenFOAM/linux64/gcc-4.2.1/lib64/libstdc++.so.6 (0x0000002a96604000)
libm.so.6 => /lib64/tls/libm.so.6 (0x0000003b8f100000)
libgcc_s.so.1 => /home/users/madhavan/OpenFOAM/linux64/gcc-4.2.1/lib64/libgcc_s.so.1 (0x0000002a96829000)
libc.so.6 => /lib64/tls/libc.so.6 (0x0000003b8f300000)
libPstream.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/hpmpi/libPstream.so (0x0000002a96937000)
libtriSurface.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libtriSurface.so (0x0000002a96a4f000)
libmeshTools.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libmeshTools.so (0x0000002a96bcf000)
libz.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libz.so (0x0000002a96e3d000)
/lib64/ld-linux-x86-64.so.2 (0x0000003b8ef00000)
libmpio.so.1 => /opt/hpmpi/lib/linux_amd64/libmpio.so.1 (0x0000002a96f52000)
libmpi.so.1 => /opt/hpmpi/lib/linux_amd64/libmpi.so.1 (0x0000002a9708d000)
libmpiCC.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/hpmpi/libmpiCC.so (0x0000002a972c8000)
liblagrangian.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/liblagrangian.so (0x0000002a973e4000)

References:
[1] http://docs.hp.com/en/B6060-96024/ch03s02.html
[2] http://tldp.org/HOWTO/Program-Librar...libraries.html

#5 | September 4, 2007, 15:38
Srinath Madhavan (a.k.a pUl|), Senior Member
Join Date: Mar 2009; Location: Edmonton, AB, Canada; Posts: 698
Addendum: One might wish to add the -m64 flag to the g++ command line just to be safe.

#6 | September 4, 2007, 18:50
Srinath Madhavan (a.k.a pUl|), Senior Member
Join Date: Mar 2009; Location: Edmonton, AB, Canada; Posts: 698
Alright, I give up! Even after successfully building the application with HP-MPI support, I get the same error message when running a 6-million-cell case. I'm reverting to OpenMPI 1.2.3 for good. If there is one thing I've learnt through this ordeal, it is that proprietary software is "EVIL" by design.

#7 | October 30, 2007, 09:55
Eugene de Villiers, Senior Member
Join Date: Mar 2009; Posts: 725
Follow these instructions to get HPMPI working:

http://openfoamwiki.net/index.php/HowTo_Pstream

Thanks to Henry and Mattijs for the work-around.

#8 | October 30, 2007, 23:19
Srinath Madhavan (a.k.a pUl|), Senior Member
Join Date: Mar 2009; Location: Edmonton, AB, Canada; Posts: 698
Thanks a lot, Eugene, for the info, and thanks of course to Henry and Mattijs as well. It certainly works, but I will need to check whether I can run cases with 4-6 million cells without issues.

#9 | October 31, 2007, 11:05
Eugene de Villiers, Senior Member
Join Date: Mar 2009; Posts: 725
Yes, please let me know. My connection to the machine I was about to run the tests on has gone down, so I have no way of confirming that the fix solves the 6M-cell problem as well.

#10 | November 9, 2007, 23:17
Srinath Madhavan (a.k.a pUl|), Senior Member
Join Date: Mar 2009; Location: Edmonton, AB, Canada; Posts: 698
Apologies for the late response Eugene. HPMPI works very nicely for large cases as well using the instructions you pointed to earlier. Thanks Henry and Mattijs!

#11 | August 5, 2009, 16:07 | Still there is a problem
Alireza Mahdavifar, New Member
Join Date: Jul 2009; Location: Kingston, ON, Canada; Posts: 4
As you may know, in OpenFOAM-1.5-dev and OpenFOAM-1.6 the file mplibHPMPI has been added to the wmake/rules/$WM_ARCH directory to support HP-MPI, and it incorporates the instructions that eugene linked. I have compiled Pstream using those settings (mplibHPMPI), but I still get the same error that msrinath80 reported, for a mesh of 3 million grid points (or more) on more than 4 CPUs (1 node).

Last edited by ali84; August 6, 2009 at 01:15.

