problems using snappyHexMesh 2.1.0 on a supercomputer
Hi foamers,
I'm using snappyHexMesh on a supercomputer, but something is going wrong that I can't solve. Can anyone give me some hints? (OpenFOAM-2.1.0 and Red Hat 5.3.) Here is the log file for snappyHexMesh: Code:
srun.sz: job 5712297 queued and waiting for resources |
Greetings Sunxing,
This is a possible source of major problems: Code:
Overall mesh bounding box : (562162.0001 3202162 0) (566161.9999 3206162 5000)
In addition, if possible, upgrading to OpenFOAM 2.1.1 or 2.1.x might fix some problems that existed in snappyHexMesh back in 2.1.0. Best regards, Bruno |
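Those bounding box coordinates place the domain hundreds of kilometres from the origin, which can cause floating point precision trouble in snappyHexMesh. One possible workaround, sketched below with a placeholder surface name and an offset chosen only as an example, is to translate the geometry closer to the origin before meshing and shift the blockMeshDict vertices by the same amount: Code:
# offset values are only an example - pick ones that bring the domain close to (0 0 0)
surfaceTransformPoints -translate '(-564162 -3204162 0)' constant/triSurface/terrain.stl constant/triSurface/terrainCentred.stl
# apply the same shift to the blockMeshDict vertices and to any refinement box coordinates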
Quote:
Thanks for your reply. I forgot to mention that this error only happens when running in parallel. If I run it without parallel there is nothing wrong, but it is very time consuming. Regards, sunxing |
Hi Sunxing,
Parallel? But the output text you provided clearly states that only one processor was being used: Quote:
In addition, on the line "Exec:" it does not show the "-parallel" option, therefore it's very unlikely this was executed in parallel. Best regards, Bruno |
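For reference, a parallel snappyHexMesh run is normally launched along these lines; the processor count, the mpirun/srun wrapper and the decomposition settings below are only placeholders for whatever this cluster's queue system expects: Code:
decomposePar                                    # numberOfSubdomains and method are read from system/decomposeParDict
mpirun -np 8 snappyHexMesh -overwrite -parallel
reconstructParMesh -constant                    # merge the processor meshes back into a single mesh if needed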
Hi Bruno,
Sorry, I attached the wrong log file. It should be: Code:
/*---------------------------------------------------------------------------*\
sunxing |
Hi Sunxing,
OK, I've done a bit of searching here in the forum, and one recurring situation with this kind of error message is that it is usually related to the MPI toolbox or to how OpenFOAM is connecting to it.
Best regards, Bruno |
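One quick check that complements the suggestions above is to see which MPI library the parallel layer of the installed OpenFOAM 2.1.0 is actually linked against; the path below assumes the standard OpenFOAM directory layout on this cluster: Code:
# shows the libmpi that OpenFOAM's Pstream layer resolves to at run time
ldd $FOAM_LIBBIN/$FOAM_MPI/libPstream.so | grep -i mpi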
2 Attachment(s)
Hi Bruno,
Thanks very much for your patience and the full answer. Following your advice, I have tested the same case with 8 processors and it failed again, but another case with fewer grid cells succeeded with 128 processors. Here are the snappyHexMeshDict and log file: Best regards, sunxing |
Hi Sunxing,
You forgot to attach the file "include/snappyPara", which apparently has the settings for a few important variables I was looking for and which might be the source of the problem. By the way, do you know which MPI toolbox is being used? The following commands should give you some feedback: Code:
echo $FOAM_MPI
I did some more searching for the latest error message you got, namely: Code:
In: PMI_Abort(1, Fatal error in MPI_Recv:
Perhaps the system administrators of the supercomputer you're using have updated the MPI toolbox and not rebuilt OpenFOAM 2.1.0 with that new version of MPI? Best regards, Bruno
--------------------- edit: I went looking for where the crash is likely to be occurring and, according to the last output message given to you by snappyHexMesh, this seems to be the problem: https://github.com/OpenFOAM/OpenFOAM...nement.C#L1835 Quote: |
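If it does turn out that the MPI toolbox was changed underneath the installation, the usual remedy is to rebuild OpenFOAM's MPI interface layer against the new toolbox; this is only a rough sketch, needs write access to the installation (so likely a job for the administrators), and assumes WM_MPLIB is set to match the new MPI: Code:
# rebuild only the Pstream (MPI interface) layer against the newly installed MPI
source $WM_PROJECT_DIR/etc/bashrc
cd $WM_PROJECT_DIR/src/Pstream
./Allwmake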
1 Attachment(s)
Hi Bruno,
I don't know much about the MPI thing. I tried the commands you offered. If I source OpenFOAM-2.1.0, then the feedback is: Code:
[envision_lijun@sn02 sector_0]$ echo $FOAM_MPI
Code:
[envision_lijun@sn02 sector_0]$ echo $FOAM_MPI
Maybe there are some errors in the MPI, but I don't know how to fix them, nor do I have permission to change anything about the MPI installation. So maybe I should look for another method to generate the grids. However, I want to know: do the MPI errors have bad effects on my calculation progress? Best regards, sunxing |
Hi Sunxing,
OK, you're limiting the maximum local cells to 10 million and the global maximum to 30 million. This can lead to unexpected mesh refinements, which could explain why this error is occurring. In addition, I took another look at the output in one of your posts and there is this: Quote:
Try increasing the maximum limit to 60-90 million cells (see the snappyHexMeshDict sketch at the end of this post); the final mesh seems to only end up with roughly half of that. As for your recent post: Quote:
Code:
mpiexec --version
Problem is that when "FOAM_MPI=mpi", this means that the MPI located at "/opt/mpi" is the one meant to be used. More specifically, you can try running: Code:
echo $MPI_ARCH_PATH
Quote:
Quote:
Quote:
Long answer: It depends. MPI-related errors can be due to a variety of reasons, including incorrect use or a situation that was not anticipated in a specific MPI version. For example, what comes to mind is that some MPI libraries might not take into account what should be done if an array of size 0 is sent through the MPI channels; it doesn't make much sense to send a zero-length array via MPI, but this could work in some MPI toolboxes and not in others. Therefore, without more details, I can only suppose that you'll have problems with this MPI toolbox in OpenFOAM 2.1.0 only if the solver runs into a situation identical to the one you're triggering with snappyHexMesh. In other words, the solver can only work well in parallel if the mesh is good enough to use in parallel. Best regards, Bruno |
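For completeness, the cell limits mentioned at the start of this post live in the castellatedMeshControls section of snappyHexMeshDict; the values in this sketch only illustrate the suggested order of magnitude and are not tuned for this case: Code:
castellatedMeshControls
{
    // per-processor cell limit during the refinement stage
    maxLocalCells   10000000;
    // overall cell limit during refinement (raised from 30 million as suggested above)
    maxGlobalCells  90000000;
    // ... remaining castellatedMeshControls entries unchanged
}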