Problems running a customized solver in parallel |
#1
Member
Rohith
Join Date: Oct 2012
Location: Germany
Posts: 57
Rep Power: 14
Hi all,

I have been trying to run a newly developed solver in parallel, and it always aborts with the same message, which seems to mean that the second process breaks down. I have checked my MPI installation: it is fine and also runs some of the standard tutorials in parallel. The decomposeParDict file looks fine, and the geometry is very simple (just a square); I have also tried different decomposition schemes for splitting the geometry. Can somebody clarify where these errors actually come from?

Note: I am running this on a normal desktop with 4 parallel processes.

Thanks in advance,
Rohith
Code:
Courant Number mean: 0 max: 0
Interface Courant Number mean: 0 max: 0
Time = 0.01

MULES: Solving for alpha1
MULES: Solving for alpha1
Liquid phase volume fraction = 0.7475  Min(alpha1) = 0  Min(alpha2) = 0
MULES: Solving for alpha1
MULES: Solving for alpha1
Liquid phase volume fraction = 0.7475  Min(alpha1) = 0  Min(alpha2) = 0
diagonal:  Solving for rho, Initial residual = 0, Final residual = 0, No Iterations 0
diagonal:  Solving for rhoCp, Initial residual = 0, Final residual = 0, No Iterations 0
diagonal:  Solving for rhoHs, Initial residual = 0, Final residual = 0, No Iterations 0
GAMG:  Solving for T, Initial residual = 1, Final residual = 0.0007007297, No Iterations 1
Correcting alpha3, mean residual = 2.9046116e-09, max residual = 0.0010659991
GAMG:  Solving for T, Initial residual = 1.3759125e-05, Final residual = 3.8207349e-08, No Iterations 2
Correcting alpha3, mean residual = 2.233146e-09, max residual = 0.00081956968
[rohith-ESPRIMO-P700:10520] *** An error occurred in MPI_Recv
[rohith-ESPRIMO-P700:10520] *** on communicator MPI_COMM_WORLD
[rohith-ESPRIMO-P700:10520] *** MPI_ERR_TRUNCATE: message truncated
[rohith-ESPRIMO-P700:10520] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
--------------------------------------------------------------------------
mpirun has exited due to process rank 2 with PID 10520 on
node rohith-ESPRIMO-P700 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[rohith-ESPRIMO-P700:10517] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[rohith-ESPRIMO-P700:10517] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
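For reference, a minimal decomposeParDict for a 4-way split of a simple square domain could look like the sketch below; the simple method and the (2 2 1) split are only assumptions for illustration, not necessarily the settings actually used in this case:
Code:
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    object      decomposeParDict;
}

numberOfSubdomains  4;

method              simple;

simpleCoeffs
{
    n               (2 2 1);    // 2 x 2 x 1 split gives 4 subdomains
    delta           0.001;
}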
#2
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128
Greetings Rohith,
I've moved your post from the other thread: http://www.cfd-online.com/Forums/ope...ntroldict.html - because it wasn't a similar problem.

The error you're getting means that the receiving end in one of the processes ran into a problem because the message was truncated. That usually points to either not enough memory being available for the data transfer to be performed safely, or possibly an error in the network connection.

Without more information about the customizations you've made, it's almost impossible to diagnose the problem. All I can say is that at work we had a similar problem some time ago: we weren't using enough "const &" variables to keep a local copy of scalar fields; instead, we always called the method that calculated and returned the whole field, which was... well... bad programming, since it recalculated the whole field for the whole mesh many times, just to give us one result for a single cell.
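As a hypothetical sketch of that pattern (the method name calcField() and the loop are made up purely for illustration, they are not from the solver discussed here):
Code:
// Wasteful: calcField() rebuilds the whole volScalarField on every loop pass,
// only for one cell value to be read from it
forAll(mesh.C(), cellI)
{
    const scalar value = model.calcField()()[cellI];
    // ... use value ...
}

// Cheaper: evaluate once, keep a local copy (or a const reference), and index into it
const volScalarField cachedField(model.calcField());
forAll(mesh.C(), cellI)
{
    const scalar value = cachedField[cellI];
    // ... use value ...
}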
Best regards, Bruno
#3
Member
Thamali
Join Date: Jul 2013
Posts: 67
Rep Power: 13
Hi,
I have a similar problem in a solver developed using "simpleIBFoam" in foam-extend-4.0. This happens only when the following term is added to UEqn.H:
Code:
-fvc::div(mu*dev2GradUTranspose)
where dev2GradUTranspose is defined as:
Code:
volTensorField dev2GradUTranspose = dev2(fvc::grad(U)().T());
so that the momentum equation becomes:
Code:
tmp<fvVectorMatrix> UEqn
(
    fvm::div(phi,U)
  - fvm::laplacian(mu,U)
  - fvc::div(mu*dev2GradUTranspose)
);

UEqn().relax();

solve(UEqn() == -fvc::grad(p));
The run then aborts with:
Code:
[b-cn0105:506766] *** An error occurred in MPI_Recv
[b-cn0105:506766] *** reported by process [784990209,0]
[b-cn0105:506766] *** on communicator MPI_COMM_WORLD
[b-cn0105:506766] *** MPI_ERR_TRUNCATE: message truncated
[b-cn0105:506766] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[b-cn0105:506766] ***    and potentially your MPI job)
I have been stuck on this for a long time. Please see whether someone can help!

Thanks in advance,
Thamali
#4
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128
Quick answer: My guess is that you should not use "dev2" independently of the rest of the equation. In other words, "dev2GradUTranspose" should not be used like that; you should instead code the term directly, like this:
Code:
tmp<fvVectorMatrix> UEqn
(
    fvm::div(phi,U)
  - fvm::laplacian(mu,U)
  - fvc::div(mu*dev2(fvc::grad(U)().T()))
);
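Since the change goes into UEqn.H, the solver has to be recompiled before the parallel case is rerun. A minimal sketch of that workflow, where "yourSolver" and the 4 processes are only placeholders:
Code:
# rebuild the customised solver (run inside its source directory)
wmake

# re-run the case in parallel from the case directory
# (remove any old processor* directories first if the case was already decomposed)
decomposePar
mpirun -np 4 yourSolver -parallel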