Almost have my cluster running OpenFOAM, but not quite...
I present two "successful cases" to describe where I am. Given these, does anybody spot the error in Case 3?
Case 1: "Hello world-ish" on a 6-node by 2-cpu cluster:
~/.bashrc does not yet source /root/OpenFOAM/OpenFOAM-1.6/etc/bashrc
host1:~ # /usr/lib64/mpi/gcc/openmpi/bin/mpirun --mca btl openib,self -machinefile list.txt -np 12 test/comm_size_with_id.out
Process 1 on host2 out of 12
Process 7 on host2 out of 12
Process 2 on host3 out of 12
Process 4 on host5 out of 12
Process 8 on host3 out of 12
Process 10 on host5 out of 12
Process 3 on host4 out of 12
Process 9 on host4 out of 12
Process 5 on host6 out of 12
Process 11 on host6 out of 12
Process 6 on host1 out of 12
Process 0 on host1 out of 12
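For reference, list.txt here is just an Open MPI machinefile. Given the process placement above, it presumably lists the six hosts one per line, along these lines (adding "slots=2" after each name would be another common way to express the 2 CPUs per node):
host1
host2
host3
host4
host5
host6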
Case 2: As close as I can get to OpenFOAM working in parallel:
~/.bashrc now sources the OpenFOAM-specific environment variables set by /root/OpenFOAM/OpenFOAM-1.6/etc/bashrc
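(presumably via a line along these lines in ~/.bashrc:)
. /root/OpenFOAM/OpenFOAM-1.6/etc/bashrc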
host1:~ # which mpirun
host1:~ # mpirun -np 12 simpleFoam -parallel -case inletProfile/
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  1.6                                   |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
Build : 1.6-f802ff2d6c5a
Exec : simpleFoam -parallel -case inletProfile/
Date : Mar 23 2010
Time : 12:43:31
Host : host1
PID : 25231
Case : ./inletProfile
nProcs : 12
It goes on to run correctly, but only on one machine, with all 12 processes on host1.
Case 3: Case 2, but including "-machinefile list.txt":
host1:~ # mpirun -np 12 -machinefile list.txt simpleFoam -parallel -case inletProfile/
zsh:1: command not found: orted
A daemon (pid 25297) died unexpectedly with status 127 while attempting
to launch so we are aborting.
There may be more information reported by the environment (see above).
This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
zsh:1: command not found: orted
mpirun was unable to cleanly terminate the daemons on the nodes shown
below. Additional manual cleanup may be required - please refer to
the "orte-clean" tool for assistance.
192.168.1.102 - daemon did not report back when launched
192.168.1.103 - daemon did not report back when launched
192.168.1.104 - daemon did not report back when launched
192.168.1.105 - daemon did not report back when launched
192.168.1.106 - daemon did not report back when launched
zsh:1: command not found: orted
host1:~ # zsh:1: command not found: orted
Getting closer. I can run things on the remote nodes interactively, but not through a non-interactive login, as shown below:
host1:~ # ssh host5 $HOME/sum_serial.out
/root/sum_serial.out: error while loading shared libraries: libmpi_cxx.so.0: cannot open shared object file: No such file or directory
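A quick way to see what a non-interactive shell on a remote node actually picks up is something like:
ssh host5 'echo PATH=$PATH; echo LD_LIBRARY_PATH=$LD_LIBRARY_PATH; which orted'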
The solution is to change the root shell back to bash and copy the .bashrc to every machine so that it is sourced when you log in. You can change the shell by running
chsh -s /bin/bash root
chsh -s /bin/bash admin
and then copy .bashrc to /root on every node (assuming you run mpirun as root).
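A minimal way to push it out to the nodes from the Case 1 output (adjust the host names to your own list) would be something like:
for h in host2 host3 host4 host5 host6; do scp ~/.bashrc root@$h:/root/.bashrc; done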
With that in place, this now works:
mpirun -np 12 -machinefile list.txt simpleFoam -parallel -case inletProfile/ > log.simpleFoam.Parallelopenib &
But this (with '--mca btl openib,self') doesn't:
mpirun --mca btl openib,self -np 12 -machinefile list.txt simpleFoam -parallel -case inletProfileMonday/ > log.simpleFoam.Parallelopenib &
Do you know of a way to tell if my simulation is using Infiniband for sure? I was thinking of pulling some ethernet cables and seeing what happens as a brute-force approach. It's configured in such a way that I can picture it being able to use either one.
The error message is this, repeated a bunch of times...
A requested component was not found, or was unable to be opened. This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded). Note that
Open MPI stopped checking at the first component that it did not find.
[host3:08394] mca: base: components_open: component pml / csum open function failed
Turns out that it was using the ethernet side without these flags.
Looks like we have the same problem...
If you invoke mpirun via its full path (the "/path/to/mpirun ..." form), Open MPI will add itself to PATH and LD_LIBRARY_PATH on all the remote nodes (regardless of what you do in your .bashrc). If you just invoke "mpirun" found via your PATH, it will not (meaning: you probably should have added Open MPI to your PATH / LD_LIBRARY_PATH in your .bashrc).
Note that the "/path/to/mpirun ..." form is a shortcut for the --prefix command line option to mpirun. See Open MPI's mpirun(1) man page for details.
If you had used the /path/to/mpirun... form, this case probably would have worked.
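For example, with the install prefix from the Case 1 command above, either of these forms ought to tell the remote orted daemons where Open MPI lives:
/usr/lib64/mpi/gcc/openmpi/bin/mpirun -np 12 -machinefile list.txt simpleFoam -parallel -case inletProfile/
mpirun --prefix /usr/lib64/mpi/gcc/openmpi -np 12 -machinefile list.txt simpleFoam -parallel -case inletProfile/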
Or you could propagate your .bashrc out to all nodes so that all nodes can find Open MPI's executables+libraries, and that should work, too (which, by a later post, I think you did -- but I wanted to explain just so that you knew *why* it worked).
Hence, Open MPI doesn't (yet) give a positive ACK that you're using IB -- it gives a negative ACK if it can't. Enough people have asked for a positive ACK that we're likely to add it in the v1.5 series sometime.
BTW, you probably actually want to use "--mca btl openib,sm,self". This allows Open MPI to use shared memory for on-node communication (which can be faster than forcing it to loop back through your IB adapters). I don't know enough about OpenFOAM to know whether this will provide an overall performance boost or not.
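In other words, something along these lines (the command from the earlier post, with sm added):
mpirun --mca btl openib,sm,self -np 12 -machinefile list.txt simpleFoam -parallel -case inletProfile/ > log.simpleFoam.Parallelopenib &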
Open MPI uses TCP for setup and teardown, even if you're using OpenFabrics transports (IB or iWARP) for MPI communications.
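For a more direct check that traffic is really going over IB, one option is to watch the HCA port counters on a node while the job runs; the device name (mlx4_0) and port number below are just examples, so check what appears under /sys/class/infiniband/ on your own nodes:
cat /sys/class/infiniband/mlx4_0/ports/1/counters/port_xmit_data
If that counter climbs rapidly during the run, the MPI traffic is going over Infiniband.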
Did you happen to install one version of Open MPI and then install a different version over it? You *may* have Open MPI plugins from different versions in the same installation tree that don't play nicely with each other.
If this is the case, try fully uninstalling Open MPI, manually inspecting the $prefix/lib/openmpi dir to ensure that there are no plugins left over from a prior Open MPI installation, and then installing Open MPI again. Let me know if that works.
You can also try:
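For example, something like this, where $prefix is your Open MPI install prefix and /some/backup/dir is just a placeholder:
mv $prefix/lib/openmpi/mca_pml_csum.* /some/backup/dir/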
(You probably won't use the CSUM PML, so it's safe to either remove it or move it to a different location where Open MPI won't find it.) I'm not optimistic that that will fix it, but it could (if csum is from a prior Open MPI install).
Thanks for the great clarifications on my untidy explanations.
The $prefix/lib/openmpi/* files (or, more specifically, $libdir/openmpi/*) that I was referring to were Open MPI's plugins. Each plugin has at least 1 file (sometimes 2, depending on your installation) in $libdir/openmpi. For example, $libdir/openmpi/mca_btl_openib.so. That's the BTL openib plugin.
The CSUM PML plugin is the Point-to-point messaging layer plugin named CSUM. The PML is the layer right behind MPI_SEND and friends. Specifically, MPI_SEND calls the back-end PML send function to actually effect the send (and so on). Think of the PML as the engine that drives all the MPI messaging semantics (communicator and tag matching, etc.).
CSUM is a PML that does all the normal sending and receiving, but also does checksums on the data to ensure data integrity. This is a Good Thing, but it definitely imposes a performance overhead. Most transports provide their own data reliability checking (e.g., TCP), so CSUM typically isn't worth it. But some transports can have problems with end-to-end reliability -- that's why we developed CSUM.
Normally, you should probably use the "ob1" PML. OB1 and CSUM are identical except that CSUM does the checksumming. Specifically, both OB1 and CSUM use BTL plugins underneath the covers to effect point-to-point transmission and reception. Hence, both CSUM and OB1 can use the openib BTL.
That being said, to be totally clear, while you can use multiple BTL plugins in a single run (e.g., openib, sm, and self), you can only use ONE PML at a time. So you'll use CSUM *or* OB1 -- not both. So my point before was that if CSUM was somehow mucked up on your system, you could remove the CSUM .so plugin file and then it wouldn't ever be used. But I wasn't confident that that would fix your problem.
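If you want to see which PML and BTL plugins an installation actually provides, or force a particular PML for a run, something like the following should do it (the grep patterns match ompi_info's "MCA <framework>:" lines):
ompi_info | grep "MCA pml"
ompi_info | grep "MCA btl"
mpirun --mca pml ob1 --mca btl openib,sm,self -np 12 -machinefile list.txt simpleFoam -parallel -case inletProfile/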
Think I just got it figured out...
host2:~ # which ompi_info
host2:~ # ompi_info | grep openib
MCA btl: openib (MCA v2.0, API v2.0, Component v1.3.3)
Turns out it was in the Allwmake file.
The relevant lines, as changed for my configuration, are:
--enable-shared --disable-static \
--disable-mpi-f77 --disable-mpi-f90 --disable-mpi-cxx \
# These lines enable Infiniband support
Additionally, --enable-shared and --disable-static are also the defaults.
The --with-openib line doesn't look quite right, but it probably squeaks by the tests we have in configure. Meaning: if you have OFED installed with the default /usr prefix, then Open MPI should be able to find OFED's headers and libraries with no extra help (because they're in the compiler's and linker's default search paths). So you should be able to do just --with-openib (i.e., not list any dir).
But hey, if it works... :-)
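In other words, assuming OFED under the default /usr prefix, the Infiniband-related configure options in Allwmake could presumably be reduced to something like (a sketch, not the exact Allwmake content):
--enable-shared --disable-static \
--disable-mpi-f77 --disable-mpi-f90 --disable-mpi-cxx \
--with-openib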
Thanks Jeff... this Allwmake is the script that sets up all of our OpenFOAM "3rd party" tools. So the points you make are relevant community-wide and I wonder if I shouldn't try to make those changes and get them checked into the OpenFOAM SVN repository... as the best way to communicate this.
There's no \ after the --disable-mpi-profile line, so it probably ignored your --with-openib line. But then again, if OFED is installed in compiler/linker default locations, the --with-openib option is not strictly necessary because OMPI will find that stuff by default (and therefore build support for it).
Specifically, we treat --with-<foo> options in OMPI's configure roughly as follows: with no --with-foo at all, support for foo is built if configure finds it and silently skipped otherwise; with --with-foo and no directory, support for foo is built and configure aborts if it can't find foo; with --with-foo=/some/dir, configure looks for foo under /some/dir and aborts if it isn't there.
Hope that helps...
I've actually fixed that, but you make a good point... I think simply recompiling is what solved it, not the path reference... because it did work as shown above.
OpenFOAM working on our cluster!
I did get this working, and would be happy to try to address similar problems with anybody in the future.