CFD Online Discussion Forums

CFD Online Discussion Forums (https://www.cfd-online.com/Forums/)
-   OpenFOAM Running, Solving & CFD (https://www.cfd-online.com/Forums/openfoam-solving/)
-   -   OpenMPI bash: orted: command not found error (https://www.cfd-online.com/Forums/openfoam-solving/74806-openmpi-bash-orted-comand-not-found-error.html)

fijinx April 9, 2010 01:21

OpenMPI bash: orted: command not found error
 
I have been using OpenFOAM for around 3 months and am now trying to run simulations in parallel. I have run into a problem: when I try to run a simulation in parallel, I get the error message:
bash: orted: command not found
--------------------------------------------------------------------------
A daemon (pid 3388) died unexpectedly with status 127 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
mpirun: clean termination accomplished
A few notes:
I have the file system mounted via NFS.
I am using non-interactive login
I start with the command:
mpirun -np 6 -hostfile hosts buoyantPisoFoam -parallel
The two computers are mapped via /etc/hosts
I can connect via SSH manually no problem
I updated .bashrc on both machines to source the ...OpenFOAM-1.6/etc/bashrc
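For reference, the hosts file passed to mpirun above is a plain Open MPI hostfile; the node names match the hostnames that appear later in this thread, but the slot counts here are illustrative assumptions:

```shell
# Hypothetical contents of the "hosts" file passed to mpirun:
# one line per machine, with the number of MPI slots it contributes.
# openfoamserver1 slots=4
# openfoamserver2 slots=2
```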
Here is my decomposeParDict:
/*--------------------------------*- C++ -*----------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  1.6                                   |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    location    "system";
    object      decomposeParDict;
}
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

numberOfSubdomains 6;

method          simple;

simpleCoeffs
{
    n           ( 1 3 2 );
    delta       0.001;
}

hierarchicalCoeffs
{
    n           ( 1 1 1 );
    delta       0.001;
    order       xyz;
}

metisCoeffs
{
    processorWeights ( 1 1 1 1 1 1 );
}

manualCoeffs
{
    dataFile    "";
}

distributed     no;

roots           ( );
If anyone has an idea why this won't work, I would welcome any suggestions! Thank you.

A.Devesa April 9, 2010 06:41

Could it be that you have different OS versions on your hosts? I experienced some problems when I tried to run my code on 2 machines that were not running the same version of Linux distribution, even though my code and all the required libraries were on the NFS...

fijinx April 9, 2010 11:39

Thank you for the reply, but no, all machines have the same version of Linux (Ubuntu 9.10), installed from the same CD. Also, as a side note, I CAN run parallel processes on any single machine, just not across machines.

fijinx April 9, 2010 12:28

Update:
When I invoke mpirun using the command:
openfoam@openfoamserver1:~/OpenFOAM/openfoam-1.6/run/run/40puTEST$ /home/openfoam/OpenFOAM/ThirdParty-1.6/openmpi-1.3.3/platforms/linuxGccDPOpt/bin/mpirun -hostfile hosts -np 6 buoyantPisoFoam -parallel
I now get the error:
--------------------------------------------------------------------------
mpirun was unable to launch the specified application as it could not find an executable:

Executable: buoyantPisoFoam
Node: openfoamserver2

while attempting to start process rank 0.
--------------------------------------------------------------------------

fijinx April 9, 2010 12:53

I figured out the problem.

During non-interactive login (i.e. not manually logging on to the computer), the user-specific environment variables are NOT loaded. To fix this I just added a file .ssh/environment containing the command:

. $HOME/.bashrc

This way, when you connect via non-interactive SSH, the .bashrc script is run.
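A note of caution on this approach: sshd treats ~/.ssh/environment as a list of plain NAME=value lines, not as a script, and only reads it when "PermitUserEnvironment yes" is set in /etc/ssh/sshd_config, so whether a ". $HOME/.bashrc" line there takes effect can depend on the SSH setup. A sketch of the literal format sshd expects (the paths below are illustrative assumptions, not taken from this thread):

```shell
# ~/.ssh/environment -- read by sshd only if /etc/ssh/sshd_config contains:
#   PermitUserEnvironment yes
# Plain NAME=value lines; shell commands are not executed here.
# FOAM_INST_DIR=/home/openfoam/OpenFOAM
# PATH=/home/openfoam/OpenFOAM/OpenFOAM-1.6/bin:/usr/bin:/bin
```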

fijinx April 9, 2010 15:42

This seemed to work at first, but now it doesn't. Any ideas? I get the same error as before.

A.Devesa April 9, 2010 16:30

Well, I thought you were on a good track with sourcing your bash file, since the type of shell you open through direct interactive login and through batch login can be different.

I would try adding the line "source .bashrc" to your .cshrc or .profile, just in case you're using csh or tcsh shells when batch logging in...

fijinx April 9, 2010 16:53

Is there any other place where I can define my environment variables? /etc/profile is not run during non-interactive login. The path is set by the "rsh daemon". Does anyone know how to fix this?

fijinx April 9, 2010 17:21

Also, do all the OF nodes need to be able to talk to each other via SSH, or only to the main computer running the mpirun command?

wyldckat April 9, 2010 17:30

Greetings fijinx,

Have you tried using OpenFOAM's foamJob or runParallel scripts, instead of using mpirun directly?

Personally, I've had to deal with this issue in the past with Windows+MSys, so I edited the foamJob script and added the environment variables by hand to the mpirun options.
I've checked Open MPI's documentation, and the switch for adding environment variables seems to be "-x".

Best regards,
Bruno

fijinx April 9, 2010 17:55

Ok, I definitely got it now! I just added the ..../etc/bashrc as the FIRST line in the .bashrc file (before the "if non-interactive, do nothing" check) and it works!
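To make the fix concrete: stock Ubuntu .bashrc files return early for non-interactive shells, so anything sourced after that check never runs over non-interactive SSH. A sketch of the top of such a .bashrc (the OpenFOAM path is the one used elsewhere in this thread; the interactivity check shown is the common Ubuntu idiom, reproduced here as an assumption):

```shell
# ~/.bashrc excerpt (sketch): source the OpenFOAM environment FIRST,
# before the early return for non-interactive shells.
# . $HOME/OpenFOAM/OpenFOAM-1.6/etc/bashrc
#
# Typical Ubuntu interactivity check that follows -- everything below
# this point is skipped for non-interactive logins:
# [ -z "$PS1" ] && return
```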

phan April 9, 2010 18:47

I get the same error message - though I'm manually logged in! I'm able to access the other machine with ssh and mpi runs on a local dual core. Any ideas?

fijinx April 9, 2010 20:27

Did you set up an NFS share? You should also set up non-interactive login. If you like, I can send you the directions to set up both.

phan April 10, 2010 06:24

I figured out that I was using different versions of libopenmpi!
But I still have problems:

stephan@falcon:~/OpenFOAM/OpenFOAM-1.6.x/tutorials/multiphase/interFoam/laminar/damBreak$ mpirun -np 3 --hostfile $HOME/machines interFoam
--------------------------------------------------------------------------
mpirun was unable to launch the specified application as it could not find an executable:

Executable: interFoam
Node: condor

while attempting to start process rank 2.
--------------------------------------------------------------------------
[falcon:27393] [[2325,0],0]-[[2325,1],1] mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)


The non-interactive login already seems to work, and so does the NFS share! Which folders have to be shared? Right now I have my /home/user folder shared (but I can't see this folder on the network, though other folders I share can be seen).

Any further ideas?

Thanks!

fijinx April 10, 2010 12:16

Yeah, you need to have your /home/OpenFOAM folder shared so that the paths are identical across all computers. The easiest way is to mount it using /etc/fstab. Here are the directions I used for Ubuntu:

http://www.ubuntugeek.com/nfs-server...in-ubuntu.html

It looks like the problem you may be having is either that the paths aren't the same on both computers, or that your environment variables aren't being loaded (add the ". $HOME/OpenFOAM/OpenFOAM-1.6/etc/bashrc" line as the first line in your $HOME/.bashrc file).
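For reference, a hypothetical /etc/fstab entry on a client node might look like the line below; the server name matches the hostnames used earlier in the thread, but the export path and mount options are illustrative assumptions:

```shell
# /etc/fstab entry on each client node (hypothetical names/paths):
# mount the server's exported home at the identical local path over NFS,
# so solver and case paths are the same on every machine.
# openfoamserver1:/home/openfoam  /home/openfoam  nfs  rw,hard,intr  0  0
```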

Also, I have no clue how important it is, but you're missing the -parallel flag at the end of your command.

openfoam_user May 26, 2010 11:05

Hi James,

I have the same problem !

You said :
Ok, I definitely got it now! I just added the ..../etc/bashrc as the FIRST line in the .bashrc file (before the "if non-interactive, do nothing" check) and it works!

Can you explain it in more detail? Thanks.
What did you add exactly?
Which .bashrc file?

Best regards,

Stephane.

openfoam_user May 28, 2010 09:18

Dear OF-users,

I got the following error message when I run a case in parallel.

1. With the following command:
mpirun --hostfile machines -np 10 interFoam -parallel > log

orted: Command not found.
--------------------------------------------------------------------------
A daemon (pid 18117) died unexpectedly with status 1 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons on the nodes shown
below. Additional manual cleanup may be required - please refer to
the "orte-clean" tool for assistance.
--------------------------------------------------------------------------
cfs6 - daemon did not report back when launched
cfs7 - daemon did not report back when launched
cfs8 - daemon did not report back when launched
cfs9 - daemon did not report back when launched
[106]cfs10-sanchi /home/sanchi/sphere_air_water_essai % orted: Command not found.
orted: Command not found.
orted: Command not found.

2. And with the complete link:
/shared/OpenFOAM/ThirdParty-1.6.x/openmpi-1.3.3/platforms/linux64GccDPOpt/bin/mpirun --hostfile machines -np 10 interFoam -parallel > log
--------------------------------------------------------------------------
mpirun was unable to launch the specified application as it could not find an executable:

Executable: interFoam
Node: cfs7

while attempting to start process rank 2.
--------------------------------------------------------------------------

I don't know how to solve it!!!

Best regards,
Stephane.

wyldckat May 29, 2010 09:14

Greetings Stephane,

Have you tried this:
Code:

foamJob -s -p interFoam
The -s argument displays the run on screen while also writing the output to the file log; you can remove that argument if you only want the log file. And the foamJob script looks for the file machines on its own, so you don't need to pass it as an argument to foamJob.


If it doesn't work, then here is the overkill method to run mpirun without needing to change .bashrc on all machines, but a note of caution: this assumes that all machines use the same username and the same paths to OpenFOAM and to the case. Here it is:
Code:

mpienvopts=`echo \`env | grep WM_ | sed -e "s/=.*$//"\` | sed -e "s/ / -x /g"`
mpienvopts2=`echo \`env | grep FOAM_ | sed -e "s/=.*$//"\` | sed -e "s/ / -x /g"`
mpirun --hostfile machines -np 10 -x HOME -x PATH -x USERNAME -x LD_LIBRARY_PATH -x MPI_BUFFER_SIZE -x $mpienvopts -x $mpienvopts2 interFoam -parallel > log

The first two lines collect all OpenFOAM environment variables that start with "WM_" and with "FOAM_" and add the "-x" option before each of them. The last line executes your command with the added "-x" options, thus exporting all of your essential local environment variables to all machines via mpirun.
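The effect of those two sed expressions can be checked on a toy string, without OpenFOAM installed (the variable names below are just placeholders):

```shell
# Reproduce the transformation from the snippet above on fake input:
# strip "=value" from each entry, then join the names with " -x ".
names=`echo "WM_PROJECT=OpenFOAM WM_ARCH=linux64" | tr ' ' '\n' | sed -e "s/=.*$//"`
opts=`echo $names | sed -e "s/ / -x /g"`
echo "$opts"
# prints: WM_PROJECT -x WM_ARCH
```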

Like I said, this is an overkill way to launch mpirun. The easiest way is to simply use foamJob which will launch foamExec on its own. foamExec sources OpenFOAM's etc/bashrc, thus activating the OpenFOAM environment on the remotely launched applications.
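For intuition, the idea behind foamExec can be sketched roughly as below. This is a simplified assumption, not the actual script: the install path is hypothetical, and run_in_foam_env is just an illustrative stand-in for "run the solver after sourcing the environment".

```shell
# Simplified sketch of the foamExec idea (assumption, not the real script):
# activate the OpenFOAM environment if available, then run the given
# command so it sees the full OpenFOAM PATH on the remote node.
foamDotFile="$HOME/OpenFOAM/OpenFOAM-1.6/etc/bashrc"   # hypothetical path
if [ -f "$foamDotFile" ]; then
    . "$foamDotFile"
fi
run_in_foam_env() { "$@"; }
out=`run_in_foam_env echo environment-ready`
echo "$out"
```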

Best regards,
Bruno

openfoam_user May 31, 2010 09:47

Hi Bruno,

thanks a lot for all the explanations.

With foamJob -s -p interDyMFoam it seems to work. Why?

foamJob -s -p interDyMFoam
Parallel processing using OPENMPI with 6 processors
Executing: mpirun -np 6 -hostfile machines /shared/OpenFOAM/OpenFOAM-1.6.x/bin/foamExec interDyMFoam -parallel | tee log

If I run the command below directly, it doesn't work! Why?
mpirun -np 6 -hostfile machines /shared/OpenFOAM/OpenFOAM-1.6.x/bin/foamExec interDyMFoam -parallel | tee log

Best regards,

Stephane.

wyldckat May 31, 2010 10:21

Greetings Stephane,

Quote:

Originally Posted by openfoam_user (Post 261010)

If I run the command below directly, it doesn't work! Why?
mpirun -np 6 -hostfile machines /shared/OpenFOAM/OpenFOAM-1.6.x/bin/foamExec interDyMFoam -parallel | tee log

foamJob is magical? :) I'm just kidding. Actually, it is a bit odd... did you run both commands in the same terminal? Without the error output it is a bit difficult to deduce the reason it isn't working :(

So my guess is that you didn't run it in the same terminal as you did foamJob, or in other words, the OpenFOAM environment wasn't active in that particular terminal. My other guess is that when restarting mpirun + interDyMFoam, the case was already solved and/or needed to be reset before running again.

Best regards,
Bruno

