CFD Online Discussion Forums


Eric October 4, 2002 13:46

Using SSH instead of RSH for parallel
 
Is anyone currently using mpich compiled for ssh and completing STAR-CD runs on computers that are part of a secure network? I am trying to set up STAR-CD to run with ssh. However, all commands issued by prohpc use rsh, and the parallel portion of STAR was designed around rsh only. I have aliased rsh to ssh, so the first 5 steps of the HPC setup work. Our mpich has also been recompiled for ssh and runs other parallel codes just fine, but STAR is not working. Specifically, when I submit star-model-run.sh, the connections are refused. Will I need to reinstall STAR-CD now that the mpich settings have changed, or can I just tell it to compile the executable and point it to the new directory? Thanks.

Jiaying Xu October 5, 2002 08:03

Re: Using SSH instead of RSH for parallel
 
I have been in a similar situation before, but your environment is still not very clear to me:

Are you using ProSTAR and StarHPC on computer A and trying to submit your jobs to computer B (and run them on B)?

If this is the case, you have to 1) specify the licence path for STAR on computer B. For example, if your licence server is on A, you need to put a file ~/.flexmrc on B; refer to the STAR Installation Guide and Release Notes for the content of the file. 2) define the proper environment variables on B, such as STARDIR.

Another thing that is unclear: what does the computer (A or B) say when STAR is not working? What kind of "connection refused" do you get? Is it because you did not provide the correct password? I need more details.

Cheers,

Jiaying


Eric October 5, 2002 13:57

Re: Using SSH instead of RSH for parallel
 
Thank you for your response. I am running the job on several computers, A-D. The model directory with the .mdl, .geom and .prob files is located on A, and prohpc is run in this directory to create the subdirectories model_0001 through model_0008 (4 machines with 2 processors each) and to decompose the model. From machine A, the files and directories are then copied to the other machines, B-D, which have the appropriate licence and executables. Next, I submit the job on A by invoking the model-run.sh script. At this point the run fails with the error message "machineA connection refused". If the executable model.exe is searched for all embedded strings, it turns out that the executable is still pointing at the location of the old version of mpich. We believe this to be the cause of the error. However, where do we set the environment to change the path that this variable points to? Would we need to reinstall STAR, i.e. is this set at installation?
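The string search described above can be sketched in Python for machines where a `strings` utility is not at hand. This is only an illustration: the function names and the default keyword `mpich` are assumptions made for the example, not part of STAR-CD or mpich.

```python
import re
from pathlib import Path


def printable_strings(data: bytes, min_len: int = 4):
    """Yield runs of printable ASCII of at least min_len characters.

    This mimics the basic behaviour of the Unix `strings` utility.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")


def find_embedded_paths(exe_path: str, keyword: str = "mpich"):
    """Return every embedded string in the file that mentions keyword.

    The keyword default is an assumption for this example; pass whatever
    fragment identifies the old installation directory.
    """
    data = Path(exe_path).read_bytes()
    return [s for s in printable_strings(data) if keyword in s.lower()]
```

Calling `find_embedded_paths("model.exe")` on each node after relinking would show whether the executable still carries the old mpich location or has picked up the new one.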

steve October 6, 2002 11:26

Re: Using SSH instead of RSH for parallel
 
I haven't tried working with ssh instead of rsh myself, but my guess is:

You don't have to reinstall STAR. You just have to get it to point to the right mpich installation, both when it starlinks and when it runs. There are panels in prohpc that show you what values it thinks are set for mpi_root, mpi_arch, etc. You should change these in prohpc (and, as a last resort, edit the parallel.inf file to force them to change if you can't find the right panel). It would probably also help to set the 3 environment variables in your own .cshrc, .login or .profile, just to be absolutely sure that the executable on each node is picking them up correctly. Then go through the starlink step again to get a new executable, check the model-run.sh script to make sure that it's right, and run. If this fails, please send me an email and I will get someone to help you.
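A quick way to confirm that the variables are actually visible on a node is a small check like the one below. It is a sketch under assumptions: the upper-case names `MPI_ROOT` and `MPI_ARCH` are guessed from the "mpi_root, mpi_arch" mentioned above, the third variable is not named in the thread and is therefore left out, and the real spelling on a given installation may differ.

```python
import os

# Assumed spellings, based on the "mpi_root, mpi_arch" mentioned in the
# thread. Adjust these to whatever prohpc or parallel.inf actually uses.
ASSUMED_VARS = ("MPI_ROOT", "MPI_ARCH")


def report_mpi_env(env=None, names=ASSUMED_VARS, root_var="MPI_ROOT"):
    """Return a list of problems found; an empty list means none were seen.

    env defaults to the current process environment, so the function can
    be run as-is on each node.
    """
    env = os.environ if env is None else env
    problems = []
    for name in names:
        if not env.get(name, ""):
            problems.append(f"{name} is not set")
    root = env.get(root_var, "")
    # A set-but-stale path is the failure mode described in the thread.
    if root and not os.path.isdir(root):
        problems.append(f"{root_var} points to a missing directory: {root}")
    return problems
```

Running `print(report_mpi_env())` in a login shell on every node, before the starlink step, would catch a variable that is unset or still points at the old installation directory.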

cjtune October 11, 2002 08:13

Re: Using SSH instead of RSH for parallel
 
I had this problem in Red Hat 7.2, where rsh is turned off by default. I made rsh a symbolic link to ssh and distributed the public keys to the other nodes, and everything runs. I don't remember doing anything elaborate for this. One problem, though: it takes quite a long time to authenticate via ssh, and this shows in ProHPC. Nothing serious, just irritating.
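Whether the key distribution worked, and how long authentication takes, can be checked per node before submitting a run. The sketch below assumes an OpenSSH client on the path; the helper names and the host names in the usage note are hypothetical. `BatchMode=yes` makes ssh fail instead of prompting for a password, so a node with a missing key shows up as a failure rather than a hang.

```python
import subprocess
import time


def ssh_check_command(host, timeout_s=10):
    """Build an ssh command that runs `true` on host without prompting."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                  # never ask for a password
        "-o", f"ConnectTimeout={timeout_s}",    # give up on unreachable hosts
        host,
        "true",
    ]


def check_node(host, runner=subprocess.run):
    """Return (ok, seconds) for a passwordless login attempt to host.

    runner is injectable so the function can be exercised without a network.
    """
    start = time.monotonic()
    result = runner(ssh_check_command(host), capture_output=True, text=True)
    elapsed = time.monotonic() - start
    return result.returncode == 0, elapsed
```

Looping `check_node` over a list such as `["machineB", "machineC", "machineD"]` (placeholder names) gives a pass/fail per node plus the login time, which makes the slow authentication mentioned above measurable.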


All times are GMT -4.