CFD Online Forums (www.cfd-online.com) > CFX

Distributed parallel error in CFX 5.5.1


January 25, 2003, 10:21   #1
Distributed parallel error in CFX 5.5.1
bogesz
Guest
Dear All, I'm trying to run in distributed parallel mode on two PCs (dual P4 2.4 GHz, 2 GB RAM each, SuSE 8.1 Linux), using 8 partitions (4 on each machine). With a quite large model, at the second step the solver gives the following error time and time again. What should be done? I have already had successful calculations under the same conditions with a bigger model (more mesh elements).

OUTER LOOP ITERATION = 2 CPU SECONDS = 1.44E+03

+----------------------+------+---------+---------+------------------+
| Equation             | Rate | RMS Res | Max Res | Linear Solution  |
+----------------------+------+---------+---------+------------------+

+------------------------------------------------------------+
| ERROR #001100279 has occurred in subroutine ErrAction.     |
| Message:                                                   |
| Floating point exception: Type Unknown                     |
+------------------------------------------------------------+

+------------------------------------------------------------+
| ERROR #001100279 has occurred in subroutine ErrAction.     |
| Message:                                                   |
| Stopped in routine c_fpx_handler                           |
+------------------------------------------------------------+

An error has occurred in cfx5solve:

The CFX-5 solver has terminated without writing a results file.

End of solution stage.

This run of the CFX-5 Solver has finished.

Any help would be appreciated
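For context on the "Floating point exception" message in the log above: a common cause is a diverging iteration whose values grow until they overflow double precision. A minimal Python illustration of that mechanism (an analogy only, nothing to do with CFX internals):

```python
# Illustration only: a diverging update eventually produces values
# outside the range of double-precision floats, triggering an FPE.
import numpy as np

np.seterr(over='raise')  # turn silent overflow into a raised exception

x = np.array([1.0])
try:
    for iteration in range(2000):
        x = x * 10.0  # a diverging update: grows tenfold each step
except FloatingPointError as exc:
    print(f"overflow at iteration {iteration}: {exc}")
```

Doubles top out near 1e308, so the loop fails after roughly 300 iterations; a diverging CFD solution overflows in essentially the same way.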

January 25, 2003, 13:35   #2
Re: Distributed parallel error in CFX 5.5.1
Mike
Guest
This doesn't look like anything to do with running in parallel; the solver has just overflowed. If you already have a solution on a finer mesh, then I'd suggest interpolating that solution onto your coarser mesh (Tools > Interpolate, I think, from the Solver Manager menu). This will give you a much better initial guess, and it's much less likely that the solver will fail. Mike
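The idea behind interpolating a finer-mesh solution onto the coarser mesh can be illustrated with a hypothetical 1-D sketch (plain NumPy with made-up data, not the CFX interpolator):

```python
# 1-D sketch: sample an existing solution on a fine mesh, interpolate it
# onto a coarser mesh, and use the result as the initial guess there.
import numpy as np

fine_x = np.linspace(0.0, 1.0, 100)       # fine-mesh node positions
fine_solution = np.sin(np.pi * fine_x)    # a converged field on the fine mesh

coarse_x = np.linspace(0.0, 1.0, 21)      # coarser mesh node positions
initial_guess = np.interp(coarse_x, fine_x, fine_solution)

# The interpolated guess is already close to the true field on the
# coarse mesh, so the solver starts near the solution instead of
# from a uniform default.
error = np.max(np.abs(initial_guess - np.sin(np.pi * coarse_x)))
print(f"max deviation of interpolated guess: {error:.2e}")
```

Starting from a field that already satisfies the equations approximately keeps the early iterations bounded, which is exactly what prevents the overflow described above.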

January 25, 2003, 17:26   #3
still Distributed parallel error in CFX 5.5.1
bogesz
Guest
Still the same... any idea how to solve this? Thanks. What I don't understand is that this model is almost the same as the other one, only with minor geometry modifications, and with the other one I had no problems at all.

====================================================

OUTER LOOP ITERATION = 2 CPU SECONDS = 1.29E+03

+----------------------+------+---------+---------+------------------+
| Equation             | Rate | RMS Res | Max Res | Linear Solution  |
+----------------------+------+---------+---------+------------------+

Parallel run: Received message from slave
-----------------------------------------
Slave partition : 5
Slave routine   : ErrAction
Master location : RCVBUF,MSGTAG=1033
Message label   : 001100279
Message follows below - :

+------------------------------------------------------------+
| ERROR #001100279 has occurred in subroutine ErrAction.     |
| Message:                                                   |
| Floating point exception: Type Unknown                     |
+------------------------------------------------------------+

Parallel run: Received message from slave
-----------------------------------------
Slave partition : 5
Slave routine   : ErrAction
Master location : RCVBUF,MSGTAG=1033
Message label   : 001100279
Message follows below - :

+------------------------------------------------------------+
| ERROR #001100279 has occurred in subroutine ErrAction.     |
| Message:                                                   |
| Stopped in routine c_fpx_handler                           |
+------------------------------------------------------------+

An error has occurred in cfx5solve:

The CFX-5 solver has terminated without writing a results file.

End of solution stage.

This run of the CFX-5 Solver has finished.

January 26, 2003, 12:05   #4
Re: still Distributed parallel error in CFX 5.5.1
bogesz
Guest
OK, I found it... Tell me who's stupid, me or CFX: I had to slightly modify the geometry. I had thin surfaces, and Build generates two entries (one for each side) for thin surfaces, as one can check in Post. But how come, after my modifications (which had nothing to do with the actual thin surfaces), it associated two completely different surfaces with the second entry of the original thin surface? ... surfaces which were actually exterior walls by default.

anyway...

January 26, 2003, 16:19   #5
Re: Distributed parallel error in CFX 5.5.1
Pascale Fonteijn
Guest
Can you explain why you use more partitions (8) than you have processors (4)?

Pascale

January 27, 2003, 06:46   #6
Re: Distributed parallel error in CFX 5.5.1
Bob
Guest
Yeah, I didn't follow that part either. I always thought you had one partition per CPU? Is this incorrect thinking?

January 27, 2003, 19:22   #7
Re: Distributed parallel error in CFX 5.5.1
bogesz
Guest
Physically, 1 CPU appears as 2 logical CPUs. I don't understand it either (this is our experience on both SuSE Linux and WinXP; does anyone?), but we found it runs a bit faster with 2 partitions per physical processor.
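The "1 physical = 2 logical" behaviour described here is hyper-threading, and it is easy to check what the OS sees. For example, Python's os.cpu_count() reports logical CPUs (on the SuSE machines in this thread, `grep -c processor /proc/cpuinfo` would show the same number); the physical count may be half of it:

```python
# Hyper-threading makes one physical P4 appear as two logical CPUs to
# the OS, which is why two partitions per physical processor can each
# be scheduled on their own logical CPU.
import os

logical = os.cpu_count()  # counts *logical* CPUs, not physical packages
print(f"logical CPUs visible to the OS: {logical}")
```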

And a question about that: under Windows NT 4, with a P4 1.8 GHz and 2 GB RAM, the solver says "the problem does not fit in memory" when I start a model with 2.8 million mesh elements in SERIAL. Starting in LOCAL PARALLEL with 2 partitions, though, it runs well and doesn't exceed the physical memory limit (using 1.8 GB out of 2). Any explanation?

Lots of thanks,

Bog
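One plausible explanation for the memory question above (an assumption, not from CFX documentation): a 32-bit Windows NT process is normally limited to 2 GB of address space, and a serial run must hold the entire problem in one process, while two local-parallel partitions are separate processes that each hold only about half the mesh plus some overlap. A back-of-envelope sketch, with an assumed per-element storage figure:

```python
# Rough memory budget (assumed numbers, not CFX's actual memory model):
# serial needs the whole problem in one process's address space; each
# parallel partition holds only its share plus some overlap overhead.
elements = 2_800_000
bytes_per_element = 600        # assumed solver storage per mesh element
overlap_factor = 1.10          # assumed 10% overhead for partition overlap

serial_bytes = elements * bytes_per_element
per_partition_bytes = serial_bytes * overlap_factor / 2  # 2 partitions

print(f"serial process needs       ~{serial_bytes / 2**30:.2f} GB")
print(f"each of 2 partitions needs ~{per_partition_bytes / 2**30:.2f} GB")
```

With these assumed figures the serial process needs one allocation of about 1.6 GB, close to the per-process ceiling, while each partition process stays under 1 GB, which would match the observed behaviour.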
