Sun Grid Engine

March 3, 2008, 16:48
I was finally able to run my case
#21
nishant_hull
Senior Member
Nishant
Join Date: Mar 2009
Location: Glasgow, UK
Posts: 165
I was finally able to run my case in parallel. There was a problem with the gcc installation. Now it's working fine.

Thank you.

Nishant
__________________
Thanks and regards,

Nishant

March 12, 2008, 11:17
I thought my parallel case is
#22
nishant_hull
Senior Member
Nishant
Join Date: Mar 2009
Location: Glasgow, UK
Posts: 165
I thought my parallel case was running, but it actually was not, although I can see the job in the queue. The error file says:
ERROR: A daemon on node comp26 failed to start as expected.
As I mentioned earlier, my mpirun -hostfile machine <root> <case> -parallel command runs fine when launched directly on the cluster. That is, it works when run on the master node (for us, kittyhawk.dcs.hull.ac.uk) but fails on any other node (comp01, comp02, comp10, comp11, etc.). I also tried an MPI "hello world" program: it fails when submitted with qsub but runs fine directly on the master node kittyhawk. In addition, gcc cannot compile a program on any node except the master node kittyhawk, even though the nodes are using the right gcc (the OpenFOAM version of gcc).

Again, I am using the cluster's version of MPICH as the PE (#$ mpich -pe 4), which is installed at /usr/.....**
The default PE here is >>score<<, which we run using the mpisub command.
Do I need to use a local version of MPICH in order to run in parallel using qsub? Or is it possible to run OpenFOAM using score?
Can anybody suggest something?

regards,
Nishant

April 17, 2009, 11:21
Running OpenFOAM in parallel with SGE
#23
4xF
New Member
Frank Albina
Join Date: Mar 2009
Location: Switzerland
Posts: 14
Hi,

To run OpenFOAM in parallel with SGE, make sure the following prerequisites are satisfied:

1) Use an Open MPI version >= 1.2.0.
Earlier versions do not work with SGE.

2) Define a parallel environment, for instance "orte", with the following definition (here with 8 slots, i.e. 8 cores in parallel):
pe_name orte
slots 8
user_lists NONE
xuser_lists NONE
start_proc_args /bin/true
stop_proc_args /bin/true
allocation_rule $round_robin
control_slaves TRUE
job_is_first_task FALSE
urgency_slots min

3) Submit your job, for example on 8 cores, with:
qsub RUN.sh
where RUN.sh contains:
#!/bin/sh
#$ -V
### number of processors and parallel environment
#$ -pe orte 8
### Job name
#$ -N "mypartest"
### Start from current working directory
#$ -cwd
### Generate the hostfile
HOSTFILE=system/hostfile
awk '{print $1" cpu=1"}' ${PE_HOSTFILE} > ${PWD}/${HOSTFILE}
### Run application
SOLVER=icoFoam
${MPI_ARCH_PATH}/bin/mpirun -np ${NSLOTS} --hostfile ${PWD}/${HOSTFILE} ${FOAM_APPBIN}/${SOLVER} -parallel
exit $?

You will also find further information at:
http://www.open-mpi.org/faq/?categor...run-scheduling
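
As a rough way to check requirement 1), the version string reported by your Open MPI installation can be compared against 1.2.0. The following is only a sketch: the helper name version_ge is made up for illustration, and it compares dot-separated numeric fields only, so a suffix such as "p1" is ignored.

```shell
# version_ge A B: succeed (exit status 0) if version A >= version B.
# Hypothetical helper, not part of Open MPI or SGE.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, ".")
        nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            # "+ 0" forces a numeric comparison; missing fields count as 0
            # and a trailing suffix like "7p1" is read as 7.
            xi = x[i] + 0
            yi = y[i] + 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}
```

For example, `version_ge 1.2.7 1.2.0 && echo "new enough"` would print the message, while 1.1.5 would not pass.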

Alternatively, you can try to compile MPICH from source. I've been able to run v1.2.7p1 without any dramas. This is quite straightforward if you take a look at the Allwmake scripts in $WM_THIRD_PARTY.

Hope this helps...

January 18, 2010, 08:10
Problem with openFoam and SGE
#24
carmir
New Member
Cárdenas
Join Date: Sep 2009
Posts: 5
Hello to all,

I'm trying to run OpenFOAM on a Sun cluster with SGE. When I run the job in parallel on a single node, everything works. But when I try to run the same job across several nodes, I get the following error message:

epsilon2.o31752:
Code:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
Host key verification failed.
--------------------------------------------------------------------------
A daemon (pid 21783) died unexpectedly with status 255 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
mpirun: clean termination accomplished

To submit the job, I'm using qsub with the following script:

------------------------------------------------------------------------------
#!/bin/tcsh
# This is a simple example of a SGE batch script
#$-o /nfs/home/cardenas/Documents/OpenFOAM/Cases/Platte/Laenge120mm/Pulsierend/epsilon2 -j y
#$-N epsilon2
#$-pe batch_64_2 2
#$-S /bin/tcsh
touch $HOME/.ssh/known_hosts
cd /nfs/home/cardenas/Documents/OpenFOAM/Cases/Platte/Laenge120mm/Pulsierend/epsilon2
touch -a ./*.*
touch -a ./system/*
source /nfs/home/cardenas/OpenFOAM/OpenFOAM-1.6.x/etc/cshrc
cat $PE_HOSTFILE |awk '{ print $1 " cpu=" $2}' > $HOME/mpi/machines.LINUX.$JOB_ID
sleep 10;
mpirun --hostfile $HOME/mpi/machines.LINUX.$JOB_ID -np 2 icoFoam -parallel >log

-----------------------------------------------------------------------------------------------

It seems that something with the host keys is not working properly, but since I'm not experienced with SGE, I would appreciate any suggestions and hints. Thank you very much.

Alejandro

January 19, 2010, 03:10
#25
olesen
Senior Member
Mark Olesen
Join Date: Mar 2009
Location: http://olesenm.github.io/
Posts: 771
Quote:
Originally Posted by carmir View Post
Hello to all,

I'm trying to run openFoam on a SGE Sun Cluster. When running the job on parallel in a single node, everything works.
There are a myriad of things that could be going wrong.
The very first thing is to determine if GridEngine support has been compiled into your openmpi.

Use the command "ompi_info" to list all the backends and grep for gridengine. If it's not there, you should recompile openmpi using the --with-sge configure option (see the third-party Allwmake).
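
The check described above can be wrapped in a small helper. This is just a sketch: has_gridengine is a name made up here, and it only looks for the word "gridengine" in whatever text it is given on stdin.

```shell
# has_gridengine: succeed (exit status 0) if the text on stdin mentions
# a gridengine component. Hypothetical helper for illustration only.
has_gridengine() {
    # -q: no output, only the exit status; -i: ignore case
    grep -qi 'gridengine'
}
```

On a node where Open MPI is installed it would be used as `ompi_info | has_gridengine && echo "GridEngine support found"`. If nothing is printed, the rebuild with --with-sge mentioned above is the next thing to look at.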


Quote:
Originally Posted by carmir View Post
touch $HOME/.ssh/known_hosts
^^^ what is this? Touching a file into existence doesn't make the hosts known!

Quote:
Originally Posted by carmir View Post
cat $PE_HOSTFILE |awk '{ print $1 " cpu=" $2}' > $HOME/mpi/machines.LINUX.$JOB_ID
...
mpirun --hostfile $HOME/mpi/machines.LINUX.$JOB_ID -np 2 icoFoam -parallel >log
If you have GridEngine and the openmpi is configured to use it, you should not be using --hostfile or -np. The GridEngine already knows how many slots you have (which would be $NSLOTS in your script), and it knows the host names too. It should also take care of inheriting the environment.
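
With that in mind, a submit script for an openmpi built with GridEngine support could shrink to something like the sketch below. The helper print_job_script and its arguments are hypothetical; the PE name and slot count have to match what is defined at your own site, and the job name is only a placeholder.

```shell
# print_job_script PE SLOTS SOLVER: write a minimal SGE job script to stdout.
# It relies on the GridEngine integration, so mpirun gets neither
# --hostfile nor -np. Hypothetical helper for illustration only.
print_job_script() {
    pe=$1
    slots=$2
    solver=$3
    cat <<EOF
#!/bin/sh
#$ -V
#$ -cwd
#$ -pe ${pe} ${slots}
#$ -N mypartest
mpirun ${solver} -parallel
EOF
}
```

For instance, `print_job_script orte 8 icoFoam > RUN.sh` followed by `qsub RUN.sh` would submit an 8-slot icoFoam run, assuming a PE called "orte" exists as in the earlier post.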

Whether the final backend uses rsh, ssh, or the GridEngine builtin transport depends on what you have configured as the 'rsh_command' and 'rsh_daemon' in GridEngine.

BTW: your example is using cshell. Be certain that the queue is configured with the corresponding shell_start_mode. By default this will be 'posix_compliant' (i.e., use /bin/sh) and not 'unix_behavior' (i.e., use #! to determine the shell/program).
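
To see which shell_start_mode a queue actually uses, the queue configuration can be filtered for that one attribute. A minimal sketch, assuming the configuration is printed as "attribute value" pairs, one per line; the helper name shell_start_mode_of and the queue name all.q in the usage line are assumptions.

```shell
# shell_start_mode_of: print the value of the shell_start_mode attribute
# found in queue configuration text read from stdin.
# Hypothetical helper for illustration only.
shell_start_mode_of() {
    awk '$1 == "shell_start_mode" { print $2 }'
}
```

It would be fed from the queue configuration, for example `qconf -sq all.q | shell_start_mode_of`, with all.q replaced by the queue the job is submitted to.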

October 27, 2011, 10:03
Pending but not running
#26
schteff
New Member
Join Date: Aug 2010
Posts: 7
Hi,

I also tried to run OpenFoam in parallel with SGE.
I use the following script to submit the job:
Code:
#!/bin/csh
#$ -V
###set queue
#$ -q normal
### number of processors and parallel environment
#$ -pe OpenFOAM 4

#$ -S /bin/csh

### Job name
#$ -N "mypartest"
### Start from current working directory
#$ -cwd
 
source ./soft/OpenFOAM/OpenFOAM-2.0.0/etc/cshrc

### Run application

mpirun -np ${NSLOTS} pisoFoam -parallel
I get the following error:


Code:
xhost: Command not found.

: Command not found.

: Command not found.

: Command not found.

: Command not found.
/soft/OpenFOAM/OpenFOAM-2.0.0/etc/cshrc

: No such file or directory.


I don't know why the grid engine can't find the command.

Does anybody have an idea why it doesn't work? Or are there any settings I have to modify?

I'm thankful for any help

Stefan


Last edited by schteff; November 21, 2011 at 04:26.

May 4, 2012, 11:56
#27
rreis
New Member
Ricardo Reis
Join Date: May 2012
Posts: 2
Quote:
Originally Posted by 4xF View Post
3) Submit your job with (for example with a run on 8 cores):
qsub RUN.sh
where RUN.sh contains:
#!/bin/sh
#$ -V
### number of processors and parallel environment
#$ -pe orte 8
### Job name
#$ -N "mypartest"
### Start from current working directory
#$ -cwd
### Generate the hostfile
HOSTFILE=system/hostfile
awk '{print $1" cpu=1"}' ${PE_HOSTFILE} > ${PWD}/${HOSTFILE}
### Run application
SOLVER=icoFoam
${MPI_ARCH_PATH}/bin/mpirun -np ${NSLOTS} --hostfile ${PWD}/${HOSTFILE} ${FOAM_APPBIN}/${SOLVER} -parallel
exit $?
If you change the submit script to have

Code:
HOSTFILE=system/hostfile
awk '{print $1" cpu="$2}' ${PE_HOSTFILE} > ${PWD}/${HOSTFILE}
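it becomes more general, because the slot count for each host is taken from the second column of $PE_HOSTFILE instead of being fixed at 1. Nice hack, thanks.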
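
To illustrate what the changed awk line does: each line of $PE_HOSTFILE starts with the host name followed by the number of slots granted on that host, then the queue and a processor range. A small sketch, with pe_to_hostfile being a made-up helper name:

```shell
# pe_to_hostfile FILE: convert an SGE PE hostfile (host, slots, queue,
# processor range per line) into "host cpu=N" lines on stdout.
# Hypothetical wrapper around the awk one-liner from the post.
pe_to_hostfile() {
    awk '{ print $1 " cpu=" $2 }' "$1"
}
```

Inside a job script this would be called as `pe_to_hostfile "$PE_HOSTFILE" > system/hostfile`. A line such as "comp01 4 all.q@comp01 UNDEFINED" (host and queue names are only examples) becomes "comp01 cpu=4".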

May 4, 2012, 11:57
#28
rreis
New Member
Ricardo Reis
Join Date: May 2012
Posts: 2
Quote:
Originally Posted by schteff View Post
I use the following script to submit the job:
Code:
[...]
source ./soft/OpenFOAM/OpenFOAM-2.0.0/etc/cshrc
[...]
I get the following error:
Code:
[...]
/soft/OpenFOAM/OpenFOAM-2.0.0/etc/cshrc
: No such file or directory.

What is the full path to the OpenFOAM directory? Maybe it is just

/soft/OpenFOAM/OpenFOAM-2.0.0/etc/cshrc

without the initial "."?
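
One way to find out is to test both spellings of the path on the execution host. A minimal sketch in plain sh (the job script itself is csh); first_existing is a hypothetical helper name:

```shell
# first_existing PATH...: print the first argument that names an existing
# regular file and succeed; fail if none of them exists.
# Hypothetical helper for illustration only.
first_existing() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

Running `first_existing ./soft/OpenFOAM/OpenFOAM-2.0.0/etc/cshrc /soft/OpenFOAM/OpenFOAM-2.0.0/etc/cshrc` from the job's working directory prints whichever of the two exists there. Note that the relative form depends on the directory the job starts in (the submit directory when -cwd is used).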

August 22, 2012, 09:27
solution to "Host key verification failed"
#29
tikulju
New Member
Timo Kulju
Join Date: Aug 2009
Posts: 21
Hi!
If anybody is having problems with host keys, adding the line
Code:
StrictHostKeyChecking no
at the end of the file
Code:
/etc/ssh/ssh_config
and restarting the SSH daemon on the compute nodes should fix it. Of course, SGE has to know that you're using ssh for communication instead of rsh. This can be done by specifying
Code:
export OMPI_MCA_orte_rsh_agent=ssh
in the run script.
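
The edit to the ssh configuration can be scripted so that the line is only added once. This is a sketch under assumptions: ensure_ssh_option is a made-up helper, and it simply appends the line to whichever file it is given if that exact line is not already present.

```shell
# ensure_ssh_option FILE LINE: append LINE to FILE unless FILE already
# contains exactly that line. Hypothetical helper for illustration only.
ensure_ssh_option() {
    config_file=$1
    option_line=$2
    # -q: quiet, -x: match whole lines only, -F: treat LINE as a fixed string
    if ! grep -qxF "$option_line" "$config_file" 2>/dev/null; then
        printf '%s\n' "$option_line" >> "$config_file"
    fi
}
```

As root it could be called as `ensure_ssh_option /etc/ssh/ssh_config "StrictHostKeyChecking no"`. The same option can also go into a user's own ~/.ssh/config, which avoids changing the setting for everybody on the machine. Keep in mind that switching off host key checking weakens ssh's protection against connecting to the wrong host, so it is best kept to a trusted cluster network.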