Problem with mpirun with OpenFOAM
I have recently run into a strange problem with mpirun in OpenFOAM 1.6. I am running a 3D LES of flow over a cylinder, with 17 nodes (16 cells) in the spanwise (z) direction.
With a very coarse mesh (7,500 cells per plane) in the xy plane, I can decompose the domain into 8 portions in the z-direction and run without any problem. With a fine mesh (120,000 cells per plane) in the xy plane, I again decompose the domain into 8 portions in the z-direction. After I execute mpirun -np 8 pisoFoam -parallel, it complains as follows:
Code:
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Create mesh for time = 0

[shuang:8030] *** An error occurred in MPI_Bsend
[shuang:8030] *** on communicator MPI_COMM_WORLD
[shuang:8030] *** MPI_ERR_BUFFER: invalid buffer pointer
[shuang:8030] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 8030 on
node shuang exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[shuang:08029] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[shuang:08029] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Then I decomposed it into 2 portions along each of the x, y and z directions (2 2 2), and it runs without this problem. If I want to run the simulation with 8 portions along the z-direction, what should I fix? Has anyone else encountered this problem?
Hi,
I have the same problem. I'm running a simpleFoam simulation on an 8-core workstation, with the mesh decomposed into 8 parts in the x-direction. The error doesn't appear every time. I read in another forum that there was (or is) a bug in Open MPI that should be fixed, something about restarting checkpoints. It would be great if someone could help.
You need to raise the MPI buffer size. This can be done by running something like:
Code:
MPI_BUFFER_SIZE=150000000
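To spell that out a little, here is a minimal sketch of how the setting might be applied before launching the solver. It assumes a POSIX shell and that the solver picks up MPI_BUFFER_SIZE from the environment, as a standard OpenFOAM setup does. The value is the one suggested above, not a tuned recommendation, and the processor0 check is just an illustrative guard so the run only starts from an already decomposed case.

```shell
#!/bin/sh
# Value taken from the suggestion above; it is an assumption, adjust to your case.
MPI_BUFFER_SIZE=150000000
# Export so that child processes (mpirun and the solver) inherit it.
export MPI_BUFFER_SIZE

# Sanity check: a child shell should report the new value.
sh -c 'echo "MPI_BUFFER_SIZE seen by child process: $MPI_BUFFER_SIZE"'

# Launch only if the tools are on the PATH and the case has been decomposed
# (decomposePar creates processor0, processor1, ... directories).
if command -v mpirun >/dev/null 2>&1 \
   && command -v pisoFoam >/dev/null 2>&1 \
   && [ -d processor0 ]; then
    mpirun -np 8 pisoFoam -parallel
fi
```

On a single workstation the exported variable is inherited by all ranks. If the job spans several machines with Open MPI, the variable may also have to be forwarded to the remote nodes, for example with mpirun's -x MPI_BUFFER_SIZE option.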
Quote:
It works well after I increased the buffer size.