Parallel computation

#1  Shrinivas (Guest), July 24, 2007, 14:16
Parallel computation

Hi,

I am running a flow solver in parallel using MPI. Partway through a run the calculation stops, and the output file reports:

node(xxx): "waiting too long for completion"

Can anyone suggest a solution? Has anyone encountered this problem before, and what is the remedy?

Thanks,

Shrini

#2  agg (Guest), July 25, 2007, 11:39
Re: Parallel computation

Could it be that node xxx is blocked waiting to receive data from some other node, say yyy, and yyy has never posted the matching send to xxx?

Use a debugger or insert print statements just before and after the receive statements to see where exactly it is getting stuck.
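
As a rough sketch of the print-statement approach (not taken from the solver in question): suppose each rank writes a line such as `rank 3 step 120 before recv from 2` just before a blocking receive and a matching `after recv` line once it returns. A small Python script can then report which ranks entered a receive and never left it. The log format and the name `find_stuck_receives` are hypothetical, and the script assumes each rank has at most one blocking receive pending at a time.

```python
import re

# Hypothetical log format, one event per line:
#   rank <r> step <n> before recv from <src>
#   rank <r> step <n> after recv from <src>
EVENT = re.compile(r"rank (\d+) step (\d+) (before|after) recv from (\d+)")


def find_stuck_receives(lines):
    """Return {rank: (step, source)} for receives that were entered but never completed."""
    pending = {}
    for line in lines:
        match = EVENT.search(line)
        if match is None:
            continue  # ignore unrelated solver output
        rank = int(match.group(1))
        step = int(match.group(2))
        phase = match.group(3)
        source = int(match.group(4))
        if phase == "before":
            # Assumption: a blocking receive, so only one can be pending per rank.
            pending[rank] = (step, source)
        else:
            pending.pop(rank, None)
    return pending


# Made-up sample log: rank 1 enters a receive from rank 0 and never returns.
sample_log = [
    "rank 0 step 7 before recv from 1",
    "rank 0 step 7 after recv from 1",
    "rank 1 step 7 before recv from 0",
]
for rank, (step, source) in sorted(find_stuck_receives(sample_log).items()):
    print(f"rank {rank} is stuck at step {step} waiting on rank {source}")
```

Make sure the prints are flushed right away in the solver, otherwise the last lines written before the hang may never reach the file.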

#3  Shrinivas (Guest), July 25, 2007, 11:47
Re: Parallel computation

Thanks agg,

Actually, the solver runs for roughly 10000 time steps and then stalls without responding. When I kill the job, the output file reports that node(xxx) has been waiting too long for completion. I checked the SEND/RECEIVE calls as well, and they look fine. Could this have anything to do with the load balancing algorithm?

Also, when I check the status of a particular node, it reports that both the node and the job are running, yet the solver's output file is no longer being appended to. This forces me to delete the job, and the diagnostic report then says that three of the four nodes are waiting to complete.

Thanks for the help,

br,

_shrini

#4  agg (Guest), July 26, 2007, 14:37
Re: Parallel computation

Does the problem run to completion on one processor?

Load balancing may be a problem. However, why would the problem occur only after 10000 time steps? There must be some collective communication (e.g. a time step calculation using allreduce) where all processors have to wait at the end of each time step, so a load balance problem should show up at every time step. You said you are using a flow solver. Which variables are you computing: u, v, w, p, rho?
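
To illustrate that argument with a toy model (the work times below are made-up numbers, not measurements from any solver): if every time step ends with a blocking collective, each rank idles until the slowest rank arrives, so an imbalance costs time on every single step. The function names are hypothetical, and using a MIN reduction for the global time step is an assumption about how such a solver would typically do it.

```python
def idle_time_per_step(work_seconds):
    """Idle time of each rank in one step that ends with a blocking collective.

    Every rank has to wait for the slowest one before the collective completes.
    """
    slowest = max(work_seconds)
    return [slowest - w for w in work_seconds]


def global_time_step(local_dts):
    """Value an allreduce with a MIN operation would hand back to every rank (assumed)."""
    return min(local_dts)


# Made-up per-step work for 4 ranks; the last rank owns a heavier block.
work = [1.0, 1.0, 1.0, 2.0]
print("idle seconds per rank, every step:", idle_time_per_step(work))
print("shared time step:", global_time_step([0.01, 0.02, 0.005, 0.01]))
```

In this model the idle time is the same on step 1 as on step 10000, which is why a pure load imbalance slows a run down but does not explain a stall that appears only occasionally.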

#5  Shrinivas (Guest), July 26, 2007, 15:00
Re: Parallel computation

Thanks agg,

I mean that it is not specific to 10000 time steps; the stall occurs abruptly. It also does not happen every time: I have seen it in 3 of the 15 runs of this case.

Yes, I am computing u, v, w, rho and T. It is an incompressible flow solver with structured mesh blocks and an unstructured decomposition of the blocks, i.e. adjacent blocks may be oriented differently, so the i axis of one block can coincide with the j axis of another.

Currently I am running 8 blocks on 8 processors, and load balancing is turned off.
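
To make the block orientation point concrete, here is a minimal sketch of how interface data could be reordered when the sending block's i axis lines up with the receiving block's j axis. It is only an illustration: the function name `reorient_face` and its flags are hypothetical, and the actual mapping used by the solver is not shown in this thread.

```python
def reorient_face(face, swap_axes=False, flip_rows=False, flip_cols=False):
    """Reorder a 2D interface face into the neighbouring block's index order.

    face is a list of rows in the sender's indexing.
    swap_axes transposes the face (sender's first axis becomes the receiver's second);
    flip_rows and flip_cols reverse an axis that runs in the opposite direction.
    """
    rows = [list(row) for row in face]
    if swap_axes:
        rows = [list(col) for col in zip(*rows)]
    if flip_rows:
        rows = rows[::-1]
    if flip_cols:
        rows = [row[::-1] for row in rows]
    return rows


# Made-up 2 x 3 face as seen by the sending block.
sender_face = [[1, 2, 3],
               [4, 5, 6]]
print(reorient_face(sender_face, swap_axes=True))
```

Note that a transposed face has a different shape (3 x 2 instead of 2 x 3), so the sending and receiving sides both have to agree on the message length and ordering.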

Thanks for everything,

Shrinivas
All times are GMT -4.