multiple parallel jobs on one machine

December 16, 2010, 11:10
Joey Bernard
I have some users here who are running into a really odd problem. We have a 600+ core cluster that they run on, and parallel MPI jobs run fine most of the time. But if a user submits two jobs that end up with some of their slots on the same nodes, the jobs behave very strangely. As a more detailed example, let's say that a user submits two 10-core jobs, A and B, and they get divided up across the nodes like this:

node1 - 5 cores for A
node2 - 5 cores for A and 5 cores for B
node3 - 5 cores for B

If something like this happens, then weird things happen to the runs. If, instead, the two jobs don't share any nodes at all, they both run fine. Any ideas of what may be happening? This is using the bundled HP-MPI, by the way.
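
To make the layout above concrete, here is a minimal sketch that flags the nodes hosting more than one job. The `allocations` dictionary and the `shared_nodes` helper are hypothetical names made up for illustration; they are not part of CFX, HP-MPI, or any scheduler, and the data simply mirrors the example above.

```python
from collections import defaultdict

# Hypothetical allocation mirroring the example: job -> {node: cores}
allocations = {
    "A": {"node1": 5, "node2": 5},
    "B": {"node2": 5, "node3": 5},
}


def shared_nodes(allocations):
    """Return {node: sorted list of jobs} for nodes hosting more than one job."""
    jobs_by_node = defaultdict(list)
    for job, nodes in allocations.items():
        for node in nodes:
            jobs_by_node[node].append(job)
    # Keep only the nodes where two or more jobs overlap
    return {
        node: sorted(jobs)
        for node, jobs in jobs_by_node.items()
        if len(jobs) > 1
    }


print(shared_nodes(allocations))  # → {'node2': ['A', 'B']}
```

A check like this only identifies which runs overlapped on a node; it says nothing about why the overlapping runs misbehave.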

Joey
All times are GMT -4.