CFD Online Discussion Forums (https://www.cfd-online.com/Forums/)
MPI Error in Custom Utility (https://www.cfd-online.com/Forums/openfoam-programming-development/241071-mpi-error-custom-utility.html)

jackdoubleyou February 7, 2022 06:43

MPI Error in Custom Utility
 
Hi everyone,


I've written a bespoke utility that identifies droplets and other structures in a VOF field, and I want to parallelise it so that it can be incorporated into a multiphase solver. The parallelisation is almost complete, but I've run into some edge cases that I cannot solve.


Without going into endless detail, the parallelisation works by passing droplet IDs across processor boundaries using a volScalarField that tracks which droplet each cell belongs to. The utility then writes out a connectivity file for each processor, so that droplets crossing processor boundaries can be reconstructed in a post-processing step. My current issue only seems to occur when a processor domain contains no droplets requiring identification.
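
For reference, the id field is just a dimensionless volScalarField on the mesh, with 0 meaning "no droplet". The declaration below is an illustrative sketch rather than a copy-paste from my utility (the IOobject flags and the runTime name are assumptions):

Code:

// Per-cell droplet ID field: 0 = no droplet, >0 = droplet index.
// Illustrative sketch; names and IOobject flags are assumptions,
// not copied from my utility.
volScalarField id
(
    IOobject
    (
        "id",
        runTime.timeName(),
        mesh,
        IOobject::NO_READ,
        IOobject::AUTO_WRITE
    ),
    mesh,
    dimensionedScalar("id", dimless, 0.0)
);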


The following code loops over all processor patches and sets each patch face value equal to the adjacent cell-centre value whenever that cell value is greater than 0.



Code:

forAll(mesh.boundaryMesh(), patchi)
{
    if (isA<processorPolyPatch>(mesh.boundaryMesh()[patchi]))
    {
        forAll(id.boundaryField()[patchi], facei)
        {
            // Cell adjacent to this processor patch face
            label adjacentCell = mesh.boundaryMesh()[patchi].faceCells()[facei];

            // Copy the droplet ID onto the patch face so it can be
            // exchanged with the neighbouring processor
            if (id[adjacentCell] > 0)
            {
                id.boundaryFieldRef()[patchi][facei] = id[adjacentCell];
            }
        }

        // Exchange the patch values with the neighbouring processor
        id.boundaryFieldRef()[patchi].initEvaluate(Pstream::commsTypes::nonBlocking);
        id.boundaryFieldRef()[patchi].evaluate(Pstream::commsTypes::nonBlocking);
    }
}

The utility crashes at the initEvaluate and evaluate calls with an MPI_Wait error:

Code:

[proteus:25213] *** An error occurred in MPI_Wait
[proteus:25213] *** reported by process [1815216129,7]
[proteus:25213] *** on communicator MPI_COMM_WORLD
[proteus:25213] *** MPI_ERR_TRUNCATE: message truncated
[proteus:25213] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[proteus:25213] ***    and potentially your MPI job)

I suspect that the processor with no droplets reaches this section of the code well before the others, and that this somehow causes the crash, although I can't see why it should.
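
For comparison, my understanding is that when evaluate() is called on the whole boundary field at once, OpenFOAM does the non-blocking exchange in two separate sweeps over the patches, with a wait on the outstanding requests in between, rather than per patch as in my loop above. A rough sketch of that pattern applied to my id field (paraphrased from my reading of GeometricBoundaryField.C, so treat it as a sketch, not verbatim library source):

Code:

// Sweep 1: start the non-blocking exchange on every processor patch
label nReq = Pstream::nRequests();

forAll(mesh.boundaryMesh(), patchi)
{
    if (isA<processorPolyPatch>(mesh.boundaryMesh()[patchi]))
    {
        id.boundaryFieldRef()[patchi].initEvaluate(Pstream::commsTypes::nonBlocking);
    }
}

// Wait for all outstanding sends/receives started above
Pstream::waitRequests(nReq);

// Sweep 2: complete the evaluation once all exchanges have finished
forAll(mesh.boundaryMesh(), patchi)
{
    if (isA<processorPolyPatch>(mesh.boundaryMesh()[patchi]))
    {
        id.boundaryFieldRef()[patchi].evaluate(Pstream::commsTypes::nonBlocking);
    }
}

I haven't been able to confirm whether collapsing this into a single per-patch initEvaluate/evaluate pair, as in my loop, is actually what breaks things, but it seemed worth mentioning.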


In an attempt to synchronise the processors before this loop, I added the following reduce call, but this also throws an error.


Code:

// Dummy reduction intended purely as a synchronisation point:
// every rank contributes its rank number, so the reduce cannot
// complete until all ranks have reached this line
label tmp = Pstream::myProcNo();
reduce(tmp, maxOp<label>());

Code:

[proteus:25771] Read -1, expected 86400, errno = 14
[proteus:25771] *** An error occurred in MPI_Recv
[proteus:25771] *** reported by process [4918845867728568321,7]
[proteus:25771] *** on communicator MPI_COMM_WORLD
[proteus:25771] *** MPI_ERR_TRUNCATE: message truncated
[proteus:25771] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[proteus:25771] ***    and potentially your MPI job)

Any thoughts as to what might be happening? I haven't been able to find anything useful based on the error messages, and I'm fairly new to the parallel side of OpenFOAM. I'm using OpenFOAM v5.0, compiled on SLED 15 SP3 with GCC v7.5.0 and OpenMPI v3.1.1.

