'Segmentation fault: 11' on running application in parallel
I'm having big trouble with 'Segmentation fault: 11', and I'd like to know if anyone else is experiencing something similar. I've searched the forum for answers but came up empty-handed, so here goes:
In a nutshell: I can run my case decomposed into 8 parts, but I get nasty segfaults when I try with 16 parts. Decomposition is done using the 'simple' method. The attached runLogs show the output of both the 8-processor run and the 16-processor run, including the error printed to the terminal.
Further info: My jobs are sent to a high-performance computing center, where the vast majority of nodes have 8 processors. I thought this could have something to do with it, as decomposePar needs the host names of all the machines that the case is distributed to if it is not run on one multiprocessor machine. What confuses me is that I've previously been able to run the exact same case on 16 processors, at that same computing center, without error.
My second thought was that decomposePar splits the mesh into even parts by cutting straight through the cells it intersects, creating tiny cells in the decomposed domains, which would make the Courant number explode in those cells (the Courant number is inversely proportional to cell volume, so it would essentially be like dividing by zero, hence the segfault). However, I do not expect this to be the case, as inspection indicates that each cell's volume is preserved.
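To make that reasoning concrete, here is a minimal sketch of the cell Courant number in the flux-based form that, as far as I know, OpenFOAM solvers report: Co = 0.5 * dt * sum(|phi_f|) / V. The function name, the fluxes and the time step are made up for illustration, and the face fluxes are held fixed while the volume shrinks, which is a simplification.

```python
def courant_number(face_fluxes, cell_volume, dt):
    """Cell Courant number: Co = 0.5 * dt * sum(|phi_f|) / V."""
    if cell_volume <= 0.0:
        raise ValueError("cell volume must be positive")
    return 0.5 * dt * sum(abs(phi) for phi in face_fluxes) / cell_volume


# Hypothetical cell with six faces: 1 m^3/s in through one face, out through the opposite one.
fluxes = [-1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
dt = 0.01  # assumed time step in seconds

co_full = courant_number(fluxes, 1.0, dt)    # intact 1 m^3 cell
co_tiny = courant_number(fluxes, 1.0e-6, dt) # same fluxes, sliver of a cell

print(co_full, co_tiny)
```

With the fluxes fixed, shrinking the volume by a factor of 10^6 scales the Courant number by the same factor. The result is very large but still finite, so I would expect a sliver cell to show up as a huge Courant number in the log before anything else.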
I'm running OF 2.2.0.
Yes, it is a problem with the OF cutting algorithm. There is no exact solution to this problem. However, what you can do is play around with the decomposition methods (scotch, simple, hierarchical, ...); that may solve your problem.
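Switching methods is a one-line change in system/decomposeParDict. A minimal sketch for 16 subdomains follows; the (4 2 2) split and the delta value are only illustrative assumptions and need adapting to the geometry of the case.

```
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    object      decomposeParDict;
}

numberOfSubdomains 16;

method          scotch;    // alternatives to try: simple, hierarchical

// Read only when method is simple; the entries of n must multiply to 16
simpleCoeffs
{
    n           (4 2 2);   // assumed split in x, y, z
    delta       0.001;
}

// Read only when method is hierarchical
hierarchicalCoeffs
{
    n           (4 2 2);   // assumed split in x, y, z
    delta       0.001;
    order       xyz;
}
```

scotch needs no coefficients of its own, so it is usually the quickest one to try first. After changing the method, remove the old processor directories and run decomposePar again before resubmitting the job.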
Thanks for the quick response! I see you've had a similar problem (guess I didn't search the forum thoroughly enough).
How did you arrive at this conclusion? And do you have any tips and tricks on how to decompose successfully? I'm simulating on a pretty fine mesh, so a trial-and-error approach would take forever, I'm afraid.