Running CFX parallel distributed under Linux system with LoadLeveler queuing system

December 21, 2014, 03:33
#1
ahmad bakri
Hi all,


I need to run a CFX simulation on a Linux system that uses the LoadLeveler queuing system. I have to use batch mode: write a run file (a .cmd file) and then submit it with llsubmit. Any idea what the batch (run) file should look like? I have been trying scripts similar to the ones below, but with no luck. I am able to run local parallel like this:



#@ output = $(jobid).out
#@ error = $(jobid).err
#@ job_type = parallel
#@ node = 1
##@ tasks_per_node = 2
#@ total_tasks=4
##@ node_usage = shared
#@ class = medium
##@ cpu_limit=900:00:00, 900:00:00
#@ queue
NP=$(cat $LOADL_HOSTFILE | wc -l)

cfx5solve -part $NP -start-method "PVM Local Parallel" -def "NormalMeshCase1.def"




However, I am not able to run distributed parallel using the following script:

#@ output = $(jobid).out
#@ error = $(jobid).err
#@ job_type = parallel
#@ node = 2
##@ tasks_per_node = 2
#@ total_tasks=4
##@ node_usage = shared
#@ class = medium
##@ cpu_limit=900:00:00, 900:00:00
#@ queue
NP=$(cat $LOADL_PROCESSOR_LIST | wc -l)



cfx5solve -part $NP -start-method "PVM Distributed Parallel" -par-dist $LOADL_PROCESSOR_LIST -def "NormalMeshCase1.def"
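One thing that may be worth checking in the script above: as far as I know, LoadLeveler documents `$LOADL_PROCESSOR_LIST` as a blank-delimited list of host names rather than a file path, so `cat` on it would not count tasks, and `cfx5solve -par-dist` expects a comma-separated host list (optionally `host*N` for N partitions on a host). The sketch below builds such a list from `$LOADL_HOSTFILE` instead. It assumes the host file holds one host name per line, repeated once per task; the function name `hosts_to_par_dist` is made up for illustration, and the solver call is left commented out so the snippet can be tried outside a job.

```shell
#!/bin/sh
# Hypothetical helper: turn a host file (one host name per line, repeated
# once per task) into a list such as "nodeA*2,nodeB*2".
hosts_to_par_dist() {
    sort "$1" | uniq -c |
        awk '{ printf "%s%s*%s", sep, $2, $1; sep = "," } END { print "" }'
}

# Only do anything inside a LoadLeveler job where the host file is readable.
if [ -n "${LOADL_HOSTFILE:-}" ] && [ -r "$LOADL_HOSTFILE" ]; then
    # Number of partitions = number of task lines in the host file (assumption).
    NP=$(awk 'END { print NR }' "$LOADL_HOSTFILE")
    HOSTS=$(hosts_to_par_dist "$LOADL_HOSTFILE")
    echo "partitions: $NP, hosts: $HOSTS"
    # Uncomment once the host list looks right:
    # cfx5solve -part "$NP" -start-method "PVM Distributed Parallel" \
    #     -par-dist "$HOSTS" -def "NormalMeshCase1.def"
fi
```

Printing the list before calling the solver makes it easy to confirm what LoadLeveler actually allocated; `cfx5solve -help` should show the exact `-par-dist` syntax for the installed version.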


I also tried to run directly through the solver, but I got this error message:



+--------------------------------------------------------------------+
| An error has occurred in cfx5solve: |
| |
| Remote connection to kuauhccn06 (ku-auh-ccn06) exited with return |
| code 1. It gave the following output: |
| |
| connect to address 172.30.102.6 port 544: Connection refused |
| connect to address 172.30.102.6 port 544: Connection refused |
| trying normal rsh (/usr/bin/rsh) |
| ku-auh-ccn06.kustar.ac.ae: Connection refused |
| |
| Check that you have typed the hostname correctly, that you have an |
| account "ahmad.albakri" on the specified host with permission to |
| rsh from this host, and that (particularly for Windows hosts) it |
| is running an rsh daemon. You can use the following command to |
| check the connection to a UNIX machine: |
| |
| rsh ku-auh-ccn06 uname |
| |
| or the following command if it is a Windows machine: |
| |
| rsh ku-auh-ccn06 cmd /c echo working |
+--------------------------------------------------------------------+


+--------------------------------------------------------------------+
| An error has occurred in cfx5solve: |
| |
| The architecture string for host kuauhccn06 (ku-auh-ccn06) could |
| not be determined. |
+--------------------------------------------------------------------+


+-------------------------------------------------------------------

My second question: if I am not able to run directly through the solver (the interactive method), would I still have a chance of running in batch file mode?

Any help would be appreciated.

December 21, 2014, 04:19
#2
Glenn Horrocks
You might need to talk to somebody more familiar with LoadLeveler to fix this.

But one thing worth looking at is that LoadLeveler might be running the job under a different user account, and that account might not have permission to use rsh. I would check which account LoadLeveler uses when it starts a job.
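A rough way to run that check from inside a job is sketched below. It prints the account and host the script runs as, and defines a small helper that runs `uname` on each host through a given remote shell, which is the same test the cfx5solve error message suggests. The function name `check_remote` is made up for illustration, and the host name in the usage line is simply the one from the error output.

```shell
#!/bin/sh
# Hypothetical helper: try "<remote shell> <host> uname" for each host and
# report which ones accept the connection. Returns non-zero if any fail.
check_remote() {
    remote_sh=$1
    shift
    failed=0
    for host in "$@"; do
        if "$remote_sh" "$host" uname >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return $failed
}

# Show which account and machine the script is actually running under.
echo "running as: $(id -un) on $(hostname)"

# Example call inside the job script (not executed here):
# check_remote rsh ku-auh-ccn06
```

Running the same call once from a login shell and once from a submitted job should show whether the account or the rsh permission differs between the two.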