parallel performance of chtMultiRegionFoam
I'm currently working on a multi-region problem using the chtMultiRegionFoam solver.
I observed strange behaviour when running in parallel.
On a single machine everything seems fine,
but when running the job over the network the time to solve increases drastically.
To check whether it was my setup, I ran the /tutorials/heatTransfer/chtMultiRegionSimpleFoam tutorial and got the same problem:
I changed the decomposeParDict in all regions and the Allrun script to run it on 2 cores. (The same happens on 4 cores when going over the network.)
Time to solve on 1 core (single machine): 155 s
Time to solve on 2 cores (single machine): 145 s (scales better when increasing the model size)
Time to solve on 2 cores (GBit-connected machines over the network): 3817 s
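For reference, the per-region decomposeParDict I used is essentially the tutorial file with only numberOfSubdomains changed; a sketch (header shortened, not the exact tutorial file):

```
// system/<region>/decomposeParDict -- sketch, based on the tutorial setup
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    object      decomposeParDict;
}

numberOfSubdomains  2;       // changed from the tutorial default

method              scotch;  // graph-based, needs no geometric coefficients
```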
I also checked the following:
* Notes on scalability: Large test case for running OpenFoam in parallel (http://www.cfd-online.com/Forums/ope...tml#post230087)
* And a whole list of links to threads on this subject: Notes about running OpenFOAM in parallel (http://www.cfd-online.com/Forums/blo...-parallel.html)
* This one also comes to mind: How to run concurrent MPI jobs within a node or set of nodes - post #9 (http://www.cfd-online.com/Forums/har...tml#post356954)
* Check the threads at http://www.cfd-online.com/Forums/hardware/ for more ideas as well.
NFS is working, and I made sure the right network interface is used (-mca btl_tcp_if_include eth0).
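For completeness, the command I use to launch the distributed run looks roughly like this (the hostfile name and log redirection are my own conventions, adjust for your setup):

```
# Sketch: run the decomposed case across two GBit-connected machines,
# forcing Open MPI to use eth0 for its TCP traffic.
mpirun -np 2 --hostfile machines \
       -mca btl_tcp_if_include eth0 \
       chtMultiRegionSimpleFoam -parallel > log.chtMultiRegionSimpleFoam 2>&1
```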
I cross-checked with the icoFoam solver and saw no such issue.
Is the inter-process communication in chtMultiRegionFoam so expensive that it does not make sense to run it over a network?
Is it possible to optimize the decomposition so that the model runs well distributed?
Thanks in advance.
Could someone please confirm or comment on this?
Is the only way out using conjugateHeatFoam from 1.6-ext?
I am also having trouble with parallel runs of chtMultiRegionFoam; a slight difference in my case is that I am using the solver with the view-factor radiation model. When I run on 2 processors of the same machine/node, the performance is worse than on 1 processor.
My mesh is only 500,000 cells (0.5 million), but still.
I suspect it is because of the way the domain is decomposed.
Can you tell me which method you are using for domain decomposition in the decomposeParDict file?
And how can we decompose the domain according to the regions we already have, e.g. so that the fluid region goes to one processor and the solid region to another?
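As far as I know, decomposePar reads a separate decomposeParDict per region (system/<region>/decomposeParDict, run with decomposePar -allRegions in 2.x), so one crude way to pin a whole region to a single processor is the manual method with a cell-to-processor file in which every cell of that region is assigned the same processor ID. A sketch (the dataFile name is my own choice, and I have not verified this against the solver's coupling behaviour):

```
// system/solid/decomposeParDict -- hypothetical sketch only
numberOfSubdomains  2;    // must match the other regions

method              manual;

manualCoeffs
{
    // File listing one processor ID per cell of the solid region;
    // writing 0 for every cell pins the whole solid to processor 0.
    dataFile    "cellToProc";
}
```

Whether this helps performance is unclear; it trades load balance for fewer cross-region processor boundaries.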
I used scotch, as prepared in the tutorial, in OF 2.1.1.
For the other questions I don't have an answer either, but I would like one.
Has there been any progress on this matter since?