davide_c March 23, 2012 09:47

Correct way to program a nonlinear cycle to run it in parallel
Hi everybody

I developed my own solver for a rather unusual coupled electrostatics + CFD problem, and I'm now hitting errors when I run the solver in parallel with MPI.

The errors I get (with both the 1KK- and the 10K-element meshes) look like this:


[moon02-05:8604] *** An error occurred in MPI_Recv
[moon02-05:8604] *** on communicator MPI_COMM_WORLD
[moon02-05:8604] *** MPI_ERR_TRUNCATE: message truncated
[moon02-05:8604] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
mpirun has exited due to process rank 0 with PID 8604 on
node moon02-05 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
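
For readers unfamiliar with this message: MPI reports MPI_ERR_TRUNCATE when a receive is matched with a message that is larger than the buffer posted for it, while a message shorter than the buffer is accepted. The sketch below is a toy model in plain C++ and makes no MPI calls; the names `Status` and `replayExchange` are hypothetical and exist only for this illustration. It replays a sequence of sends against a sequence of receives, matched in order, and shows how a sender and receiver that disagree on the order of messages can produce a truncation.

```cpp
#include <cstddef>
#include <vector>

// Toy model only (no MPI involved): possible outcomes of a receive sequence.
enum class Status { Ok, Truncated, WouldBlock };

// Replays messages of the given sizes against receives with the given
// buffer capacities. Messages are matched to receives in order.
//  - a message larger than the posted buffer  -> Truncated
//  - a receive with no message left to match  -> WouldBlock (a hang in practice)
//  - a message shorter than the buffer is fine, as in MPI
Status replayExchange(const std::vector<std::size_t>& sentSizes,
                      const std::vector<std::size_t>& recvCapacities)
{
    std::size_t next = 0;  // index of the next unmatched message

    for (std::size_t capacity : recvCapacities)
    {
        if (next == sentSizes.size())
        {
            return Status::WouldBlock;
        }
        if (sentSizes[next] > capacity)
        {
            return Status::Truncated;
        }
        ++next;
    }
    return Status::Ok;
}
```

In this model, `replayExchange({100, 1}, {100, 1})` succeeds, while `replayExchange({1, 100}, {100, 1})` reports a truncation on the second receive: the two sides exchange the same messages but no longer agree on their order.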

I also checked whether a standard solver shows the same problem: I ran interFoam on the standard parallel tutorial case damBreakingFine (around 8K elements) and on a finer one (800K elements), and both run perfectly fine. Given all this, I think my code may be missing something that empties the buffers.

The code starts with a Newton cycle that is quite slow to converge (30-40 iterations in the first steps), so every time the equation for the potential variation is solved, messages have to be buffered and passed between the processes.

Does anybody know a way to empty those buffers after the solution (if this is the issue, which I am not 100% sure about)? Or has anyone had a similar experience and found out what was going on or how to work around it?

thanks in advance guys ;););)


Chris Lucas March 26, 2012 03:35


If I remember correctly, this error means that a function in one of the subdomains you are solving needs information from another one.

E.g.: you have a BC with a swirl, and this BC needs the face center. The problem is that if you divided your domain into subdomains (processors), only one subdomain has the needed face center. Therefore, when the other processors try to get the face center, they can't find it and they crash.
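
As a general illustration of how subdomains can fall out of step (an assumption made for this sketch, not a diagnosis of the solver in this thread): if each processor decides from its own local data whether to run another iteration, the processors end up taking part in a different number of exchanges. The toy model below is plain C++ with no MPI or OpenFOAM calls; `countExchanges` and the halving of the residual are hypothetical. It compares a decision based on the local residual with one based on the maximum over all ranks, which in OpenFOAM would typically be obtained with a reduction such as `gMax`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy model only: residual[r] is the local residual on rank r.
// Returns how many exchanges each rank takes part in before it stops.
// The residual is simply halved every iteration (an arbitrary choice).
std::vector<int> countExchanges(std::vector<double> residual,
                                double tol,
                                bool useGlobalResidual)
{
    std::vector<int> exchanges(residual.size(), 0);
    if (residual.empty())
    {
        return exchanges;
    }

    for (int iter = 0; iter < 100; ++iter)
    {
        // Stand-in for a parallel max-reduction over all ranks
        const double globalMax =
            *std::max_element(residual.begin(), residual.end());

        bool anyActive = false;
        for (std::size_t r = 0; r < residual.size(); ++r)
        {
            // Each rank tests either its own value or the reduced one
            const double tested = useGlobalResidual ? globalMax : residual[r];
            if (tested > tol)
            {
                ++exchanges[r];
                anyActive = true;
            }
        }
        if (!anyActive)
        {
            break;
        }

        for (double& res : residual)
        {
            res *= 0.5;
        }
    }
    return exchanges;
}
```

With local residuals `{1.0, 0.1}` and a tolerance of `0.04`, the local test gives 5 exchanges on the first rank but only 2 on the second, so the first rank would keep sending to a neighbour that has already moved on. With the reduced value, both ranks perform 5.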

Best Regards,

davide_c March 26, 2012 05:34

Christian, many thanks for the tip, but I am not using anything like what you suggest. All the BCs are uniform (at least up to the point where my code crashes), and the only RHS of the equation is computed from other fields (always cell-wise, so it should not matter) and afterwards corrected by the correctBoundaryConditions() method. The part of my code where it normally crashes looks like this:


Exp = P * exp((phi_0 - phi) / refPhi);
phi.storePrevIter();

for (int nonOrth = 0; nonOrth <= nNonOrthCorr; nonOrth++)
{
    d_phi.storePrevIter();
    Info << "\t\t.";

    if
    (
        solve
        (
          - fvm::laplacian(A, d_phi)
          + fvm::Sp(q*Exp/refPhi, d_phi)
          + q*Exp
        ).nIterations() == 0
    )
    {
        // (the body of this if is missing from the post)
    }

    d_phi.relax();
}

phi += d_phi;
phi.relax();

and the variation d_phi has only uniform zero-value or zero-gradient conditions on the physical boundaries.
Do you think I need to pass/correct in some way not just the BCs but the volume fields too?

Chris Lucas March 26, 2012 10:06


Please check, but I guess the problem is in "phi.correctBoundaryConditions();". If you remove this line of code, the solver should run.

Best Regards,

davide_c March 27, 2012 09:24

Christian, thank you again, but unfortunately the problem is not in that line: I added it, following a suggestion, while trying to fix the problem.

I also checked whether the GAMG solver was by any chance causing the problem, but it is not (PCG and PBiCG crash the same way, one cycle step later).

I think I have tried almost everything I can change...

