CFD Online Discussion Forums


bostandoust July 25, 2003 20:52

Parallel Processing
 
Hi, I have a cluster consisting of 9 Pentium 4 (2.4 GHz) machines with a 100 Mbit/sec network. I developed a 2D parallel Navier-Stokes solver with PETSc and METIS. For some reason, I could not gain performance. I want to know: is it possible to gain performance with implicit methods on a 100 Mbit/sec network? I wonder if you could help me in this matter (for example, by pointing me to some papers on it). This is my email: mbostandoust@yahoo.com Bye

andy July 27, 2003 11:49

Re: Parallel Processing
 
Yes it is (probably) possible although you have not said what algorithm you are using and what size grids you are using. Both have a big influence on efficiency. To get an appreciation of likely performance look at the results of NAS Parallel Benchmarks for similarish machines. Some points:

* big grids parallelise better than small grids because of a larger volume to surface area ratio. You stand no chance of running a small 2D grid efficiently on such a machine (However, 15 years ago on a 9 Inmos transputer system one could achieve over 90% efficiency for implicit ADI line sweeps for the flow in 2D ducts on modest sized grids. Progress?)

* to get the best efficiency one needs to tune the number and size of messages for your hardware and this usually requires (mild) algorithmic changes. By using an off-the-shelf solver you are limiting your options somewhat but PETSc is a big package and may have such parameters (I have never used it).

* as delivered, Ethernet is usually not optimised for low latency. Check with your NIC manufacturer for parameters to improve the performance in this respect. Less than 40usec is something to aim for. Cheap switches and cheap NICs can be a performance problem. Some NICs perform poorly with certain motherboards (checking out your PC hardware performance/compatibility is another disappointing aspect of current computing).

* if you are really keen (desperate?) one can improve latency further by using OS-bypass software. This is reported to get down to 10usec or so (when it works well) but I have no direct experience.
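To make the points above concrete, here is a rough back-of-the-envelope sketch (not from the original posts) of how grid size and latency affect parallel efficiency. It assumes a square n x n grid split into square blocks (3 x 3 for the 9 machines), a simple "latency + size / bandwidth" cost per message, and one halo exchange with four neighbours per sweep. The function names, the 1 usec per-cell compute cost and the 100 usec "untuned" latency are illustrative assumptions, not measurements.

```python
def message_time(n_values, latency_s, bandwidth_bytes_per_s, bytes_per_value=8):
    """Cost of one message under a simple latency + size/bandwidth model."""
    return latency_s + n_values * bytes_per_value / bandwidth_bytes_per_s


def parallel_efficiency(n, p_side, t_cell_s, latency_s, bandwidth_bytes_per_s):
    """Estimated efficiency for an n x n grid on p_side x p_side processors.

    Compute cost scales with the block "volume" (cells per block), while
    communication cost scales with the block "surface" (cells per edge).
    """
    local = n / p_side                       # cells along one block edge
    compute = local * local * t_cell_s       # work per sweep on one block
    comm = 4 * message_time(local, latency_s, bandwidth_bytes_per_s)
    return compute / (compute + comm)


if __name__ == "__main__":
    bandwidth = 100e6 / 8                    # 100 Mbit/sec in bytes per second
    t_cell = 1e-6                            # assumed compute time per cell
    for latency in (100e-6, 40e-6, 10e-6):   # assumed untuned, tuned, OS-bypass
        for n in (30, 300, 3000):
            eff = parallel_efficiency(n, 3, t_cell, latency, bandwidth)
            print(f"latency={latency * 1e6:5.0f} usec  n={n:5d}  efficiency={eff:.2f}")
```

Under this model the small grid is dominated by message latency, while the large grid is dominated by compute, which is the volume-to-surface argument in numbers. A real implicit solver exchanges many more messages per time step, so treat the output as a trend rather than a prediction.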


bostandoust July 27, 2003 13:25

Re: Parallel Processing
 
Hi, thanks for your comments. I developed the code with finite elements, using SUPG/PSPG equal-order elements (like collocated methods in finite volume). The most time-consuming part of the code is solving the system of equations in parallel. I tried different solvers but reached the point where I should either solve the system with a direct method or precondition it with LU and use GMRES or BiCGSTAB for the iterative solution (because the system of equations is unsymmetric). Currently there exist two good parallel direct solvers, SuperLU_DIST and MUMPS. I used SuperLU_DIST and could not obtain performance in this regard. Next week I will try MUMPS. I would like to hear more comments or suggestions in this regard. Bye
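For readers who want to see the "precondition with LU, then GMRES" idea in a few lines, here is a small serial sketch using SciPy rather than PETSc. It is not the poster's code: the matrix is a stand-in 1D convection-diffusion operator chosen only because it is unsymmetric, the preconditioner is an incomplete LU (`spilu`) rather than a full factorisation, and the names `convection_diffusion_1d` and `solve_ilu_gmres` are made up for illustration.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla


def convection_diffusion_1d(n, peclet=0.5):
    """Unsymmetric tridiagonal matrix from a central-difference
    convection-diffusion stencil (a stand-in test problem)."""
    lower = -(1.0 + peclet) * np.ones(n - 1)
    main = 2.0 * np.ones(n)
    upper = -(1.0 - peclet) * np.ones(n - 1)
    return sp.diags([lower, main, upper], offsets=[-1, 0, 1], format="csc")


def solve_ilu_gmres(A, b, drop_tol=1e-4, fill_factor=10):
    """Solve A x = b with restarted GMRES, preconditioned by incomplete LU."""
    ilu = spla.spilu(A, drop_tol=drop_tol, fill_factor=fill_factor)
    # Wrap the ILU solve so GMRES can apply it as an approximate inverse of A.
    M = spla.LinearOperator(A.shape, matvec=ilu.solve, dtype=A.dtype)
    x, info = spla.gmres(A, b, M=M, restart=30, maxiter=200)
    return x, info                           # info == 0 means GMRES converged


if __name__ == "__main__":
    A = convection_diffusion_1d(200)
    b = np.ones(200)
    x, info = solve_ilu_gmres(A, b)
    print("info:", info, "residual norm:", np.linalg.norm(A @ x - b))
```

`scipy.sparse.linalg.bicgstab` accepts the same `M` argument, so the two Krylov methods can be swapped for comparison. In parallel the hard part is the preconditioner, since a global LU or ILU needs a lot of communication, which is where the network discussed above comes in.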

andy July 27, 2003 14:35

Re: Parallel Processing
 
What performance are you referring to? Parallel efficiency or something else?

I have no experience of the parallel efficiency of off-the-shelf solvers but would suggest talking to the authors who are often keen for the results of their labours to be useful to others.

