Bruno,
it works in parallel with foamJob, but it is very, very slow. I think the problem is linked to something else. I am not able to recompile the ThirdParty-1.6.x directory!

[111]cfs10-sanchi /shared/OpenFOAM/ThirdParty-1.6.x % ./Allclean
set: Variable name must begin with a letter.
[112]cfs10-sanchi /shared/OpenFOAM/ThirdParty-1.6.x % ./Allwmake
set: Variable name must begin with a letter.

Stephane. |
Bruno,
yes, I have run both ways in the same terminal. Here is the error message from the following command:

[128]cfs10-sanchi /shared/sanchi/interDyMFoam/ras/essai % mpirun -np 6 -hostfile machines /shared/OpenFOAM/OpenFOAM-1.6.x/bin/foamExec interDyMFoam -parallel | tee log
orted: Command not found.
--------------------------------------------------------------------------
A daemon (pid 10888) died unexpectedly with status 1 while attempting to launch so we are aborting. There may be more information reported by the environment (see above).
This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process that caused that situation.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons on the nodes shown below. Additional manual cleanup may be required - please refer to the "orte-clean" tool for assistance.
--------------------------------------------------------------------------
cfs9 - daemon did not report back when launched
cfs11 - daemon did not report back when launched
orted: Command not found.

Regards, Stephane. |
Stephane,
I've got a feeling that the problem is linked to your Linux box. If I'm not mistaken, OpenFOAM assumes that the application sh is the same as dash, hence the weird errors you got from Allwmake and Allclean. Quote:
So, I suggest trying to figure out who is seeing what... in other words:
By the way, I missed the findExec command the first time I looked at the code :( I believe it's an OpenFOAM script for finding its own applications. Best regards, Bruno |
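[Editor's aside on the sh-versus-csh point above: a minimal, hedged demonstration, not from the thread, of how the two shells' `set` syntaxes collide. A collision of this kind is one common source of messages like "set: Variable name must begin with a letter" when a script written for one shell family is interpreted by the other.]

```shell
# POSIX sh assigns variables as VAR=value; csh/tcsh uses "set VAR=value".
# Under a POSIX shell, "set" instead overwrites the positional parameters,
# so a csh-style assignment silently does the wrong thing.
sh -c 'FOO=bar; echo $FOO'      # POSIX assignment: prints "bar"
sh -c 'set FOO=bar; echo $1'    # csh-style line under sh: prints "FOO=bar"
```

Running the build scripts with an explicit interpreter, e.g. `sh ./Allwmake`, sidesteps any ambiguity about which shell ends up interpreting them.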
Hi,
the problem with Allwmake and Allclean is solved. I switched between tcsh and bash.

sanchi@cfs10:~> ls -l `which mpirun`
lrwxrwxrwx 1 sanchi cfs 7 2010-05-31 12:15 /shared/OpenFOAM/ThirdParty-1.6.x/openmpi-1.3.3/platforms/linux64GccDPOpt/bin/mpirun -> orterun
sanchi@cfs10:~> ls -l `findExec mpirun`
ls: cannot access findExec:: No such file or directory
ls: cannot access command: No such file or directory
ls: cannot access not: No such file or directory
ls: cannot access found: No such file or directory

Do you mean findExec or foamExec? Stephane. |
Quote:
Code:
ls -l /shared/OpenFOAM/ThirdParty-1.6.x/openmpi-1.3.3/platforms/linux64GccDPOpt/bin/orterun
Code:
which foamJob
Code:
echo "Executing: mpirun
Code:
echo "Executing: $mpirun

Best regards, Bruno |
Hi Bruno,
the link orterun exists!

ls -l /shared/OpenFOAM/ThirdParty-1.6.x/openmpi-1.3.3/platforms/linux64GccDPOpt/bin/orterun

gives

-rwx------ 1 sanchi cfs 102096 2009-07-15 12:35 /shared/OpenFOAM/ThirdParty-1.6.x/openmpi-1.3.3/platforms/linux64GccDPOpt/bin/orterun

As suggested, I have changed mpirun to $mpirun in the foamJob script:

[106]cfs10-sanchi /shared/sanchi/interDyMFoam/ras/essai % foamJob -s -p interDyMFoam
Parallel processing using OPENMPI with 6 processors
Executing: /shared/OpenFOAM/ThirdParty-1.6.x/openmpi-1.3.3/platforms/linux64GccDPOpt/bin/mpirun -np 6 -hostfile machines /shared/OpenFOAM/OpenFOAM-1.6.x/bin/foamExec interDyMFoam -parallel | tee log

It is strange, because the command

which mpirun

gives

/shared/OpenFOAM/ThirdParty-1.6.x/openmpi-1.3.3/platforms/linux64GccDPOpt/bin/mpirun

and yet the command below doesn't run:

mpirun -np 6 -hostfile machines /shared/OpenFOAM/OpenFOAM-1.6.x/bin/foamExec interDyMFoam -parallel | tee log
orted: Command not found.
--------------------------------------------------------------------------
A daemon (pid 25028) died unexpectedly with status 1 while attempting to launch so we are aborting. There may be more information reported by the environment (see above).
This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process that caused that situation.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons on the nodes shown below. Additional manual cleanup may be required - please refer to the "orte-clean" tool for assistance.
--------------------------------------------------------------------------
cfs9 - daemon did not report back when launched
cfs11 - daemon did not report back when launched
orted: Command not found.

Regards, Stephane. |
Greetings Stephane,
Quote:
Well, there are various possible solutions at hand... you can try these two commands to run mpirun:
Bruno |
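[Editor's note: the two commands Bruno refers to did not survive in this archive. As a hedged aside, Open MPI's `--prefix` and `-x` options target exactly this symptom: "orted: Command not found" and missing shared libraries on remote nodes. The sketch below only prints a candidate command line rather than launching it; the paths are the ones used earlier in the thread.]

```shell
# Dry run only: build and display the mpirun invocation.
# --prefix tells the remote orted daemons where Open MPI is installed;
# -x LD_LIBRARY_PATH forwards the library path to the remote nodes.
MPI_HOME=/shared/OpenFOAM/ThirdParty-1.6.x/openmpi-1.3.3/platforms/linux64GccDPOpt
CMD="$MPI_HOME/bin/mpirun --prefix $MPI_HOME -x LD_LIBRARY_PATH \
-np 6 -hostfile machines /shared/OpenFOAM/OpenFOAM-1.6.x/bin/foamExec interDyMFoam -parallel"
echo "Would execute: $CMD"
```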
Hi Bruno,
Using the commands unlink and then ln -s doesn't work at all!

What works:
1. foamJob -s -p interFoam
2. /shared/OpenFOAM/ThirdParty-1.6.x/openmpi-1.3.3/platforms/linux64GccDPOpt/bin/mpirun -np 4 -hostfile machines /shared/OpenFOAM/OpenFOAM-1.6.x/bin/foamExec interFoam -parallel | tee log

What doesn't work:
1. mpirun -np 4 -hostfile machines /shared/OpenFOAM/OpenFOAM-1.6.x/bin/foamExec interFoam -parallel | tee log
orted: Command not found.
--------------------------------------------------------------------------
A daemon (pid 17492) died unexpectedly with status 1 while attempting to launch so we are aborting. There may be more information reported by the environment (see above).
This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process that caused that situation.
--------------------------------------------------------------------------
mpirun: clean termination accomplished

2. /shared/OpenFOAM/ThirdParty-1.6.x/openmpi-1.3.3/platforms/linux64GccDPOpt/bin/mpirun -np 4 -hostfile machines interFoam -parallel | tee log
--------------------------------------------------------------------------
mpirun was unable to launch the specified application as it could not find an executable:

Executable: interFoam
Node: cfs9

while attempting to start process rank 2.
--------------------------------------------------------------------------

I would like to understand why, if possible. In the past I have used the following command without problems:

mpirun -hostfile machines -np 4 interFoam -parallel > log

Regards, Stephane. |
Hi Stephane,
Quote:
My guess is still what I said before, and "unlink and then ln -s" not working is indicative that something just isn't working properly with the linking mechanism of your Linux installation. Have you tried the copy option? Or did you copy first and then try the linking option? I should have listed them in the opposite order, because after copying, unlink is no longer applicable: mpirun would no longer be a link! Best regards, Bruno |
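[Editor's aside: Bruno's two options, relink versus copy, can be rehearsed safely in a scratch directory before touching the real installation. Everything below is a simulation with a stub orterun script, not the actual OpenFOAM paths.]

```shell
# Simulate both repairs in a throwaway directory.
BIN=$(mktemp -d)
printf '#!/bin/sh\necho "orterun invoked with: $*"\n' > "$BIN/orterun"
chmod +x "$BIN/orterun"

# Option 1: recreate mpirun as a relative symlink to orterun.
ln -sf orterun "$BIN/mpirun"
"$BIN/mpirun" -np 4            # prints: orterun invoked with: -np 4

# Option 2: if linking misbehaves, replace the link with a plain copy.
rm "$BIN/mpirun"
cp "$BIN/orterun" "$BIN/mpirun"
"$BIN/mpirun" -np 4            # same output, now through the copy
```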
Bruno,
Yes, I have run in parallel with OF-1.6.x, at least with openSUSE 11.1. Now I have openSUSE 11.2. Maybe this can help solve my problem.

[114]cfs10-sanchi /home/sanchi/test_openmpi % ssh $HOST `which mpirun`
/shared/OpenFOAM/ThirdParty-1.6.x/openmpi-1.3.3/platforms/linux64GccDPOpt/bin/mpirun: error while loading shared libraries: libopen-rte.so.0: cannot open shared object file: No such file or directory

Regards, Stephane. |
Hi Stephane,
Quote:
Like I've asked before, have you tried copying orted to mpirun? Quote:
Bruno |
Bruno,
I have copied *orted* to *mpirun*. And now even the foamJob command does not work anymore.

[111]cfs10-sanchi /shared/sanchi/interDyMFoam/ras/damBreak % foamJob -s -p interFoam
Parallel processing using OPENMPI with 4 processors
Executing: /shared/OpenFOAM/ThirdParty-1.6.x/openmpi-1.3.3/platforms/linux64GccDPOpt/bin/mpirun -np 4 -hostfile machines /shared/OpenFOAM/OpenFOAM-1.6.x/bin/foamExec interFoam -parallel | tee log
[cfs10:24965] Error: unknown option "-np"
input in flex scanner failed

Stephane. |
Hi Stephane,
Quote:
To fix it: Code:
cp /shared/OpenFOAM/ThirdParty-1.6.x/openmpi-1.3.3/platforms/linux64GccDPOpt/bin/orterun /shared/OpenFOAM/ThirdParty-1.6.x/openmpi-1.3.3/platforms/linux64GccDPOpt/bin/mpirun
Code:
rm /shared/OpenFOAM/ThirdParty-1.6.x/openmpi-1.3.3/platforms/linux64GccDPOpt/bin/mpirun

I'm going to fix the other posts, just in case someone else tries to run those commands :( Best regards, Bruno |
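[Editor's aside: a quick way to confirm which binary ended up behind `mpirun` is a byte-for-byte comparison with `cmp` against orterun and orted. The sketch below simulates the mix-up in a scratch directory with stub files; on the real system the same `cmp -s` test works against the actual bin directory.]

```shell
# Simulated mix-up: mpirun was overwritten with a copy of orted.
BIN=$(mktemp -d)
printf 'stub orterun\n' > "$BIN/orterun"
printf 'stub orted\n'   > "$BIN/orted"
cp "$BIN/orted" "$BIN/mpirun"

# cmp -s is silent and succeeds only when the files are identical.
if cmp -s "$BIN/mpirun" "$BIN/orterun"; then
    echo "mpirun is orterun (correct)"
elif cmp -s "$BIN/mpirun" "$BIN/orted"; then
    echo "mpirun is orted (the wrong binary)"
fi
```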
Hi Bruno,
I have installed version 1.7.0. I have run some tutorials (motorbike and mixer2D) in serial and they work. But I can't run any case in parallel.

[106]cfs10-sanchi /shared/sanchi/OpenFOAM/sanchi-1.6.x/Eole % mpirun --hostfile machines -np 4 MRFSimpleFoam -parallel > log
orted: Command not found.
--------------------------------------------------------------------------
A daemon (pid 17287) died unexpectedly with status 1 while attempting to launch so we are aborting. There may be more information reported by the environment (see above).
This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process that caused that situation.
--------------------------------------------------------------------------
mpirun: clean termination accomplished

And now the foamJob command does not work either! Stephane. |
Hi Stephane,
Well, you changed the OpenFOAM version, but not the Linux version ;) OpenFOAM 1.7.0 is currently almost the same as 1.6.x, except for a few tweaks and add-ons. So, try one of the steps in my previous post, but this time don't forget it's:
Bruno |
Bruno,
no, it doesn't work. Always the same error message. foamJob -s -p doesn't work either. Stephane. |
OK, that Linux box doesn't seem to want to work with OpenMPI :P
Stephane do me a little favor and post what you get when you run this command: Code:
ls -l /shared/OpenFOAM/ThirdParty-1.7.0/openmpi-1.4.1/platforms/linux64GccDPOpt/bin

And what comes out of these commands: Code:
echo MPI_ARCH_PATH=$MPI_ARCH_PATH

Bruno |
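[Editor's aside: the variables Bruno is after can also be dumped in one shot. This is just a convenience sketch; the values assigned here are the ones reported later in the thread, set locally only so the command has something to print.]

```shell
# Export the two MPI-related variables (values from this thread), then
# grep them out of the environment in a single command.
export MPI_ARCH_PATH=/shared/OpenFOAM/ThirdParty-1.7.0/platforms/linux64Gcc/openmpi-1.4.1
export OPAL_PREFIX=$MPI_ARCH_PATH
env | grep -E '^(MPI_ARCH_PATH|OPAL_PREFIX)=' | sort
```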
Hi Bruno,
The folder /shared/OpenFOAM/ThirdParty-1.7.0/openmpi-1.4.1/platforms/linux64GccDPOpt/bin doesn't exist. It is located in /shared/OpenFOAM/ThirdParty-1.7.0/platforms/linux64Gcc/openmpi-1.4.1/bin/

[116]cfs10-sanchi /home/sanchi % ls -l /shared/OpenFOAM/ThirdParty-1.7.0/platforms/linux64Gcc/openmpi-1.4.1/bin/
total 3064
lrwxrwxrwx 1 sanchi cfs     12 2010-06-30 16:56 mpic++ -> opal_wrapper
lrwxrwxrwx 1 sanchi cfs     12 2010-06-30 16:56 mpicc -> opal_wrapper
lrwxrwxrwx 1 sanchi cfs     12 2010-06-30 16:56 mpiCC -> opal_wrapper
lrwxrwxrwx 1 sanchi cfs     12 2010-06-30 16:56 mpicc-vt -> opal_wrapper
lrwxrwxrwx 1 sanchi cfs     12 2010-06-30 16:56 mpiCC-vt -> opal_wrapper
lrwxrwxrwx 1 sanchi cfs     12 2010-06-30 16:56 mpic++-vt -> opal_wrapper
lrwxrwxrwx 1 sanchi cfs     12 2010-06-30 16:56 mpicxx -> opal_wrapper
lrwxrwxrwx 1 sanchi cfs     12 2010-06-30 16:56 mpicxx-vt -> opal_wrapper
lrwxrwxrwx 1 sanchi cfs      7 2010-06-30 16:56 mpiexec -> orterun
lrwxrwxrwx 1 sanchi cfs     12 2010-06-30 16:56 mpif77 -> opal_wrapper
lrwxrwxrwx 1 sanchi cfs     12 2010-06-30 16:56 mpif77-vt -> opal_wrapper
lrwxrwxrwx 1 sanchi cfs     12 2010-06-30 16:56 mpif90 -> opal_wrapper
lrwxrwxrwx 1 sanchi cfs     12 2010-06-30 16:56 mpif90-vt -> opal_wrapper
-rwx------ 1 sanchi cfs 106795 2010-07-01 14:47 mpirun
lrwxrwxrwx 1 sanchi cfs      7 2010-06-30 16:56 mpirun.old -> orterun
lrwxrwxrwx 1 sanchi cfs     15 2010-06-30 16:56 ompi-checkpoint -> orte-checkpoint
lrwxrwxrwx 1 sanchi cfs     10 2010-06-30 16:56 ompi-clean -> orte-clean
-rwxr-xr-x 1 sanchi cfs 141953 2010-06-30 16:56 ompi_info
lrwxrwxrwx 1 sanchi cfs      8 2010-06-30 16:56 ompi-iof -> orte-iof
lrwxrwxrwx 1 sanchi cfs      7 2010-06-30 16:56 ompi-ps -> orte-ps
lrwxrwxrwx 1 sanchi cfs     12 2010-06-30 16:56 ompi-restart -> orte-restart
-rwxr-xr-x 1 sanchi cfs  18446 2010-06-30 16:56 ompi-server
-rwxr-xr-x 1 sanchi cfs  26699 2010-06-30 16:54 opal_wrapper
-rwxr-xr-x 1 sanchi cfs 267834 2010-06-30 16:56 opari
-rwxr-xr-x 1 sanchi cfs  17783 2010-06-30 16:55 orte-clean
-rwxr-xr-x 1 sanchi cfs  12105 2010-06-30 16:55 orted
-rwxr-xr-x 1 sanchi cfs  18056 2010-06-30 16:55 orte-iof
-rwxr-xr-x 1 sanchi cfs  22637 2010-06-30 16:55 orte-ps
-rwxr-xr-x 1 sanchi cfs 106795 2010-06-30 16:55 orterun
-rwxr-xr-x 1 sanchi cfs 349960 2010-06-30 16:56 otfaux
-rwxr-xr-x 1 sanchi cfs  21803 2010-06-30 16:56 otfcompress
-rwxr-xr-x 1 sanchi cfs  16337 2010-06-30 16:56 otfconfig
lrwxrwxrwx 1 sanchi cfs     11 2010-06-30 16:56 otfdecompress -> otfcompress
-rwxr-xr-x 1 sanchi cfs 185817 2010-06-30 16:56 otfdump
-rwxr-xr-x 1 sanchi cfs 173396 2010-06-30 16:56 otfinfo
-rwxr-xr-x 1 sanchi cfs 178028 2010-06-30 16:56 otfmerge
-rwxr-xr-x 1 sanchi cfs  88842 2010-06-30 16:56 vtcc
-rwxr-xr-x 1 sanchi cfs  88842 2010-06-30 16:56 vtcxx
-rwxr-xr-x 1 sanchi cfs  88842 2010-06-30 16:56 vtf77
-rwxr-xr-x 1 sanchi cfs  88842 2010-06-30 16:56 vtf90
-rwxr-xr-x 1 sanchi cfs 432965 2010-06-30 16:56 vtfilter
-rwxr-xr-x 1 sanchi cfs 581820 2010-06-30 16:56 vtunify

[104]cfs10-sanchi /home/sanchi % echo MPI_ARCH_PATH=$MPI_ARCH_PATH
MPI_ARCH_PATH=/shared/OpenFOAM/ThirdParty-1.7.0/platforms/linux64Gcc/openmpi-1.4.1
[105]cfs10-sanchi /home/sanchi % echo OPAL_PREFIX=$OPAL_PREFIX
OPAL_PREFIX=/shared/OpenFOAM/ThirdParty-1.7.0/platforms/linux64Gcc/openmpi-1.4.1
[106]cfs10-sanchi /home/sanchi % echo PATH=$PATH
PATH=/shared/OpenFOAM/ThirdParty-1.7.0/platforms/linux64Gcc/paraview-3.8.0/bin:/shared/OpenFOAM/ThirdParty-1.7.0/platforms/linux64Gcc/openmpi-1.4.1/bin:/home/sanchi/OpenFOAM/sanchi-1.7.0/applications/bin/linux64GccDPOpt:/shared/OpenFOAM/site/1.7.0/bin/linux64GccDPOpt:/shared/OpenFOAM/OpenFOAM-1.7.0/applications/bin/linux64GccDPOpt:/shared/OpenFOAM/OpenFOAM-1.7.0/wmake:/shared/OpenFOAM/OpenFOAM-1.7.0/bin:/soft/CEI/bin:/soft/intel/compiler81_fce/bin:/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin:/soft/smr/bin:/usr/bin/X11:/usr/games:/opt/kde3/bin:/home/sanchi/bin:/usr/lib64/jvm/jre/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:/soft/ParaView/ParaView3.6/bin:.:/soft/icemcfd/x86_64/linux/bin:/tmp/vos/EDGE/bin:/home/sanchi/MB-Split/bin/LINUX:/opt/mpich/bin:/soft/NTI/bin:/shared/vos/bin:/home/vos/nsmb/util/fortran2html_4.4/bin:/shared/gehri/bin:/tmp/vos/EDGE/bin:/home/sanchi/MB-Split/bin/LINUX:/soft/NTI/bin:/home/vos/nsmb/util/fortran2html_4.4/bin

Regards, Stephane. |
Hi Stephane,
Quote:
OK, as for the problem you're getting... I'm getting kinda stumped as to what is going on :( The few things that come to mind are:
Bruno |
Hi Bruno,
The path */opt/mpich/bin* in *PATH* no longer exists. But it still does not run in parallel. The *foamJob* output is:

[110]cfs10-sanchi /shared/sanchi/OpenFOAM/sanchi-1.6.x/Eole % foamJob -s -p MRFSimpleFoam
Parallel processing using OPENMPI with 4 processors
Executing: mpirun -np 4 -hostfile machines /shared/OpenFOAM/OpenFOAM-1.7.0/bin/foamExec MRFSimpleFoam -parallel | tee log
[111]cfs10-sanchi /shared/sanchi/OpenFOAM/sanchi-1.6.x/Eole %

The log file is empty. With the *parallelTest* application the result is the same: the log file is empty.

[110]cfsfs-sanchi /shared/sanchi/OpenFOAM/sanchi-1.6.x/Eole % foamJob -s -p parallelTest
Parallel processing using OPENMPI with 4 processors
Executing: mpirun -np 4 -hostfile machines /shared/OpenFOAM/OpenFOAM-1.7.0/bin/foamExec parallelTest -parallel | tee log
[111]cfsfs-sanchi /shared/sanchi/OpenFOAM/sanchi-1.6.x/Eole %

Stephane. |
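[Editor's aside: an empty log from `foamJob` suggests the launched processes die before writing anything. One hedged suspect in situations like this is that remote, non-interactive shells do not read the login files that set up PATH and LD_LIBRARY_PATH, so they see a much leaner environment than the interactive terminal. That effect can be simulated locally with a scrubbed environment; the solver name below is deliberately fake.]

```shell
# Simulate what a remote non-interactive shell might see: a minimal PATH
# without the OpenFOAM/OpenMPI directories. "no-such-solver" stands in
# for any tool that only your login environment knows how to find.
env -i PATH=/usr/bin:/bin sh -c \
    'command -v no-such-solver || echo "not found in PATH=$PATH"'
```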