
Parallel run crashing

Old   December 4, 2008, 09:15
  #1
Member
 
matej forman
Join Date: Mar 2009
Location: Brno, Czech Republic
Posts: 92
Dear all,
I have a problem with a simulation crashing in parallel.

Solver: oodles (OF 1.5.x)
What works: the simulation runs in serial, and in parallel on a local quad-core machine.

What fails: a distributed parallel run using MPICH (the latest version), with SGE distributing the job across a Linux cluster of dual- and quad-core nodes, crashes after several timesteps. The crash does not depend on saving data.

The crash happens with different solvers and geometries. Other codes using the same MPICH have no problems. This is not a convergence or solver problem, but a problem with MPICH communication between the nodes.

Has anyone faced the same problem? Does MPICH need any extra tweaking for OpenFOAM?

Any hint is welcome.
Thanks,
matej

Old   May 10, 2009, 20:05
  #2
New Member
 
Chun-Ho Liu
Join Date: May 2009
Location: Hong Kong
Posts: 2
Dear Matej,

We're facing the same problem on our 8-core Xeon Linux cluster. MPICH works very well on a single CPU but fails on more than one. It seems to be a system problem, and we're working with our sys admin to solve it. Thanks for sharing.

Best regards,
Chun-Ho

Old   May 11, 2009, 02:09
  #3
Member
 
matej forman
Join Date: Mar 2009
Location: Brno, Czech Republic
Posts: 92
Hi,

We worked around the problem by running on a single machine. Meanwhile our admin upgraded the kernels on the nodes, and the problem somehow disappeared. Now we can run distributed, but with very low efficiency, which seems to be a network-related problem.

matej
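
For anyone comparing single-machine and distributed runs: "efficiency" is usually quantified as speedup divided by the number of processes. Below is a minimal sketch of that calculation, assuming you have wall-clock times for a serial run and a parallel run of the same case. The function name and the timings are hypothetical and do not come from the runs discussed in this thread.

```python
def parallel_efficiency(t_serial, t_parallel, n_procs):
    """Return (speedup, efficiency) for a run on n_procs processes.

    speedup    = serial wall-clock time / parallel wall-clock time
    efficiency = speedup / number of processes (1.0 is ideal scaling)
    """
    if t_serial <= 0 or t_parallel <= 0 or n_procs < 1:
        raise ValueError("times must be positive and n_procs must be >= 1")
    speedup = t_serial / t_parallel
    return speedup, speedup / n_procs


# Hypothetical timings in seconds, for illustration only.
speedup, efficiency = parallel_efficiency(1000.0, 400.0, 8)
print(f"speedup = {speedup:.2f}, efficiency = {efficiency:.0%}")
```

An efficiency well below 1.0 on a distributed run that scales fine on one machine would be consistent with the interconnect being the bottleneck, though it does not prove it.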

Old   May 11, 2009, 02:18
  #4
New Member
 
Chun-Ho Liu
Join Date: May 2009
Location: Hong Kong
Posts: 2
Hi Matej,

Thanks for your reply. Yes, I'm pretty sure it's system related and directly connected to the network/MPI settings. We have another OpenFOAM installation using OpenMPI on a dual-processor, quad-core machine, and we have never had such an "SAE" problem there.

Our sys admin is currently busy with some other stuff. We'll work together later this week.

Best regards,
Chun-Ho
All times are GMT -4.