CFD Online Discussion Forums (https://www.cfd-online.com/Forums/)
-   OpenFOAM Community Contributions (https://www.cfd-online.com/Forums/openfoam-community-contributions/)
-   -   [cfMesh] cfmesh in parallel (MPI) (https://www.cfd-online.com/Forums/openfoam-community-contributions/172295-cfmesh-parallel-mpi.html)

Q.E.D. May 27, 2016 14:43

cfmesh in parallel (MPI)
 
Hello everyone!

I'm currently trying to run cfMesh (v1.1.1, cartesianMesh) in parallel (MPI), but an error occurs that I haven't managed to resolve so far. Maybe someone can help me?

The error I get is the following:

[node033:18375] *** An error occurred in MPI_Bsend
[node033:18375] *** reported by process [46912131891201,11]
[node033:18375] *** on communicator MPI_COMM_WORLD
[node033:18375] *** MPI_ERR_BUFFER: invalid buffer pointer
[node033:18375] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[node033:18375] *** and potentially your MPI job)

A similar error has already been reported (see: https://sourceforge.net/p/cfmesh/tickets/2/), but I cannot find the solution.

Thank you in advance for your support.
Arthur

franjo_j June 26, 2016 11:39

Quote:

Originally Posted by Q.E.D. (Post 602162)
Hello everyone!

I'm currently trying to run cfMesh (v1.1.1, cartesianMesh) in parallel (MPI), but an error occurs that I haven't managed to resolve so far. Maybe someone can help me?

The error I get is the following:

[node033:18375] *** An error occurred in MPI_Bsend
[node033:18375] *** reported by process [46912131891201,11]
[node033:18375] *** on communicator MPI_COMM_WORLD
[node033:18375] *** MPI_ERR_BUFFER: invalid buffer pointer
[node033:18375] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[node033:18375] *** and potentially your MPI job)

A similar error has already been reported (see: https://sourceforge.net/p/cfmesh/tickets/2/), but I cannot find the solution.

Thank you in advance for your support.
Arthur

Hi,

cfMesh runs in parallel all the time; I guess you are referring to MPI parallelisation.

It is not possible to understand much from what you posted here. Please provide a log file and an example that reproduces the problem (http://sscce.org/).

Regards,

Franjo

tom.opt January 27, 2020 06:26

Hi,
I'm having a similar issue. I am using OpenFOAM v1912 and trying to generate an aircraft mesh. I'm working on a cluster, so I need to run it in parallel using MPI. When I use a small number of refinement levels it works; however, when I increase the refinement levels, MPI crashes.
Here is a working meshDict:

surfaceFile "ac.stl";

maxCellSize 0.2;

objectRefinements
{
    ac3
    {
        type     box;
        cellSize 0.1;
        centre   (3.93106 0.998578 -0.613427);
        lengthX  14;
        lengthY  6;
        lengthZ  6;
    }
}

surfaceMeshRefinement
{
    TE
    {
        additionalRefinementLevels 4;
        surfaceFile "TE.stl";
    }
    nose
    {
        additionalRefinementLevels 3;
        surfaceFile "nose.stl";
    }
    tails
    {
        additionalRefinementLevels 3;
        surfaceFile "tails.stl";
    }
    wing
    {
        additionalRefinementLevels 2;
        surfaceFile "wing.stl";
    }
}



Here is a crashing meshDict:

surfaceFile "ac.stl";

maxCellSize 1;

objectRefinements
{
    ac1
    {
        type     box;
        cellSize 0.5;
        centre   (20 0.998578 -0.613427);
        lengthX  80;
        lengthY  30;
        lengthZ  30;
    }
    ac2
    {
        type     box;
        cellSize 0.2;
        centre   (10 0.998578 -0.613427);
        lengthX  50;
        lengthY  15;
        lengthZ  15;
    }
    ac3
    {
        type     box;
        cellSize 0.1;
        centre   (3.93106 0.998578 -0.613427);
        lengthX  14;
        lengthY  6;
        lengthZ  6;
    }
    ac4
    {
        type     box;
        cellSize 0.05;
        centre   (4.2 0.998578 -0.613427);
        lengthX  8.9;
        lengthY  2.5;
        lengthZ  3;
    }
}

Has there been a fix?

franjo_j January 27, 2020 07:33

Quote:

Originally Posted by tom.opt (Post 755704)
Hi,
I'm having a similar issue. I am using OpenFOAM v1912 and trying to generate an aircraft mesh. I'm working on a cluster, so I need to run it in parallel using MPI. When I use a small number of refinement levels it works; however, when I increase the refinement levels, MPI crashes.
Here is a working meshDict:

surfaceFile "ac.stl";

maxCellSize 0.2;

objectRefinements
{
    ac3
    {
        type     box;
        cellSize 0.1;
        centre   (3.93106 0.998578 -0.613427);
        lengthX  14;
        lengthY  6;
        lengthZ  6;
    }
}

surfaceMeshRefinement
{
    TE
    {
        additionalRefinementLevels 4;
        surfaceFile "TE.stl";
    }
    nose
    {
        additionalRefinementLevels 3;
        surfaceFile "nose.stl";
    }
    tails
    {
        additionalRefinementLevels 3;
        surfaceFile "tails.stl";
    }
    wing
    {
        additionalRefinementLevels 2;
        surfaceFile "wing.stl";
    }
}



Here is a crashing meshDict:

surfaceFile "ac.stl";

maxCellSize 1;

objectRefinements
{
    ac1
    {
        type     box;
        cellSize 0.5;
        centre   (20 0.998578 -0.613427);
        lengthX  80;
        lengthY  30;
        lengthZ  30;
    }
    ac2
    {
        type     box;
        cellSize 0.2;
        centre   (10 0.998578 -0.613427);
        lengthX  50;
        lengthY  15;
        lengthZ  15;
    }
    ac3
    {
        type     box;
        cellSize 0.1;
        centre   (3.93106 0.998578 -0.613427);
        lengthX  14;
        lengthY  6;
        lengthZ  6;
    }
    ac4
    {
        type     box;
        cellSize 0.05;
        centre   (4.2 0.998578 -0.613427);
        lengthX  8.9;
        lengthY  2.5;
        lengthZ  3;
    }
}

Has there been a fix?


If the problem were due to your meshDict, the mesher would not work even without MPI.


I assume the problem comes from a limited MPI buffer size that is not large enough to handle all messages. You can increase the buffer size by setting the environment variable MPI_BUFFER_SIZE, and keep increasing it until the run starts working.


Alternatively, you may adjust the buffer size by setting a variable in your $WM_PROJECT_DIR/etc/controlDict. Have a look here: https://www.openfoam.com/releases/op.../usability.php
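As a concrete sketch of the two approaches (the numeric value here is illustrative, not from the thread):

```shell
# Option 1: set the MPI buffer size via the environment variable
# read by OpenFOAM's Pstream layer (value in bytes).
export MPI_BUFFER_SIZE=200000000

# Option 2: the equivalent switch in $WM_PROJECT_DIR/etc/controlDict:
#
#     OptimisationSwitches
#     {
#         mpiBufferSize   200000000;
#     }
```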


Franjo

tom.opt January 27, 2020 09:26

Quote:

Originally Posted by franjo_j (Post 755713)
If the problem were due to your meshDict, the mesher would not work even without MPI.

I assume the problem comes from a limited MPI buffer size that is not large enough to handle all messages. You can increase the buffer size by setting the environment variable MPI_BUFFER_SIZE, and keep increasing it until the run starts working.

Alternatively, you may adjust the buffer size by setting a variable in your $WM_PROJECT_DIR/etc/controlDict. Have a look here: https://www.openfoam.com/releases/op.../usability.php


Franjo




Thanks.
I updated the value of the buffer in /OpenFOAM/OpenFOAM-v1912/etc/controlDict.

I went up to 900000000.

I reran the program, but it still crashes.

Is there a recommended number of cores that I should use per million cells of mesh size? I'm planning to generate something on the order of 100 million cells, and I'm using 64 cores.

franjo_j January 27, 2020 09:41

Quote:

Originally Posted by tom.opt (Post 755727)
Thanks.
I updated the value of the buffer in /OpenFOAM/OpenFOAM-v1912/etc/controlDict.

I reran the program, but it still crashes.

Is there a recommended number of cores that I should use per million cells of mesh size? I'm planning to generate something on the order of 100 million cells, and I'm using 64 cores.


cfMesh uses shared-memory parallelisation (SMP) by default, and MPI is used optionally. MPI is available only for cartesianMesh.

For example, if there are 64 cores available on a single node, there is no need to use MPI. The code will not run any faster because it uses all cores by default.

Using MPI makes sense in two cases:
1. The desired number of cores is distributed over a given number of nodes.
2. There is not enough memory on a single node.

When using MPI, the number of MPI processes should equal the number of nodes, not the number of cores; all cores on each node are used by default.
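A launch along these lines might look like the following sketch (the hostfile, node names, and counts are hypothetical, shown only to illustrate the one-rank-per-node rule):

```shell
# Four 16-core nodes: one MPI rank per node, not one per core;
# cfMesh's shared-memory threads then use the cores within each node.
# (Open MPI syntax; node names are made up.)
cat > hostfile <<'EOF'
node01 slots=1
node02 slots=1
node03 slots=1
node04 slots=1
EOF

mpirun -np 4 --hostfile hostfile cartesianMesh -parallel
```

With this setup, numberOfSubdomains in decomposeParDict should also be 4.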

tom.opt January 28, 2020 04:21

Quote:

Originally Posted by franjo_j (Post 755729)
cfMesh uses shared-memory parallelisation (SMP) by default, and MPI is used optionally. MPI is available only for cartesianMesh.

For example, if there are 64 cores available on a single node, there is no need to use MPI. The code will not run any faster because it uses all cores by default.

Using MPI makes sense in two cases:
1. The desired number of cores is distributed over a given number of nodes.
2. There is not enough memory on a single node.

When using MPI, the number of MPI processes should equal the number of nodes, not the number of cores; all cores on each node are used by default.

Thank you very much

My HPC architecture has 16 cores per node, so once I adjusted for that (i.e. set the number of domains to 4 in decomposeParDict) and also increased the buffer size as previously advised, I managed to get it to run smoothly.
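For reference, the corresponding decomposeParDict entry for that setup might look like this minimal sketch (the decomposition method shown is an assumption, not stated in the thread):

```
numberOfSubdomains 4;

method scotch;
```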

