CFD Online Discussion Forums

CFD Online Discussion Forums (https://www.cfd-online.com/Forums/)
-   OpenFOAM Running, Solving & CFD (https://www.cfd-online.com/Forums/openfoam-solving/)
-   -   Performance on Cluster (https://www.cfd-online.com/Forums/openfoam-solving/57848-performence-cluster.html)

bastil March 3, 2009 13:10

Dear forum,

I have some jobs running on our Opteron Myrinet cluster. Convergence is fine, but the jobs are damn slow. I see nearly no speedup compared to runs on Ethernet workstations, and I am wondering whether OpenMPI is really using Myrinet. I have built it with MX support. We average about 5 min per iteration (about 10-15 pressure steps per iteration, which is fine), whereas FLUENT needs about 30 seconds per iteration on the same number of CPUs and the same mesh.

cnsidero March 3, 2009 16:04

Are you sure the processes were distributed to the cluster nodes and are not all running on the node you launched mpirun from?

If this is a Linux cluster, log into one of the remote nodes you explicitly told the processes to run on and check the running processes using top or ps. If you are using queuing/scheduling software (PBS, SGE, etc.), find where it sent the processes and do the same check there.
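
The ps check described above could be scripted roughly as follows. This is only a sketch: the solver name `simpleFoam` is an assumption (the thread never names the solver), and the helper functions are hypothetical names introduced for illustration. It relies on `ps -eo comm=`, which on Linux prints one command name per line without a header.

```python
import subprocess


def count_solver_processes(ps_output: str, solver_name: str) -> int:
    """Count lines in `ps -eo comm=` style output that equal solver_name."""
    return sum(1 for line in ps_output.splitlines() if line.strip() == solver_name)


def local_process_count(solver_name: str) -> int:
    """Run ps on the current node and count processes named solver_name."""
    result = subprocess.run(
        ["ps", "-eo", "comm="],  # all processes, command name only, no header
        capture_output=True,
        text=True,
        check=True,
    )
    return count_solver_processes(result.stdout, solver_name)


if __name__ == "__main__":
    # "simpleFoam" is an assumed solver name; replace it with the one in use.
    print(local_process_count("simpleFoam"))
```

Run on each node in turn, the count should match the number of ranks you expect there. If every rank shows up on the launch node and none on the others, the job was not distributed.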

bastil March 3, 2009 16:58

Hi Chris,

Yes, I did that. They run as they should. The only thing I am not doing so far is distributing the data to the local nodes; instead I run the case from an NFS share. However, this should only influence the time needed to write backup data.
The difference from FLUENT is OpenMPI vs. HP-MPI.

Regards.

eugene March 4, 2009 10:16

OpenMPI will not have Myrinet support by default. You will have to recompile OpenMPI with Myrinet support for it to work properly. Or just use HP-MPI, that works too (although you will have to buy a licence).

bastil March 4, 2009 10:41

Eugene,

I know this; I added Myrinet support to OpenMPI. If I run on our workstations (no Myrinet), I get an error about missing Myrinet modules. I do not get this error on our cluster, where Myrinet is present. However, performance is poor and I get no feedback (apart from the absence of the error message) on whether Myrinet is used, but I suppose it is not.

bastil March 5, 2009 04:59

Here are two iterations from the log:

Time = 71

DILUPBiCG: Solving for Ux, Initial residual = 0.000235829, Final residual = 4.2349e-06, No Iterations 2
DILUPBiCG: Solving for Uy, Initial residual = 0.00219142, Final residual = 6.32653e-05, No Iterations 2
DILUPBiCG: Solving for Uz, Initial residual = 0.00128352, Final residual = 1.65462e-05, No Iterations 2
GAMG: Solving for p, Initial residual = 0.00667156, Final residual = 5.49445e-06, No Iterations 9
time step continuity errors : sum local = 3.54209e-06, global = -1.48371e-07, cumulative = -0.00037855
DILUPBiCG: Solving for epsilon, Initial residual = 0.067249, Final residual = 1.9244e-10, No Iterations 1
bounding epsilon, min: -100901 max: 1.417e+09 average: 32636.2
DILUPBiCG: Solving for k, Initial residual = 2.33946e-06, Final residual = 2.33946e-06, No Iterations 0
ExecutionTime = 19028.4 s ClockTime = 19082 s

Time = 72

DILUPBiCG: Solving for Ux, Initial residual = 0.000234464, Final residual = 4.19113e-06, No Iterations 2
DILUPBiCG: Solving for Uy, Initial residual = 0.00216742, Final residual = 6.50005e-05, No Iterations 2
DILUPBiCG: Solving for Uz, Initial residual = 0.00127756, Final residual = 1.62209e-05, No Iterations 2
GAMG: Solving for p, Initial residual = 0.00666254, Final residual = 5.60993e-06, No Iterations 9
time step continuity errors : sum local = 3.61005e-06, global = -1.35679e-07, cumulative = -0.000378685
DILUPBiCG: Solving for epsilon, Initial residual = 0.0692427, Final residual = 2.30982e-10, No Iterations 1
bounding epsilon, min: -46613.7 max: 1.40629e+09 average: 32421.4
DILUPBiCG: Solving for k, Initial residual = 2.45671e-06, Final residual = 2.45671e-06, No Iterations 0
ExecutionTime = 19214.4 s ClockTime = 19268 s

This is more than 3 minutes for one iteration. Is this OK for a case with about 26 million cells running on 32 Opteron 2220 CPUs with Myrinet interconnect? I feel it is much too slow.

Regards

BastiL
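
The "more than 3 minutes" figure comes from the difference between consecutive ExecutionTime lines in the log above. A minimal sketch of that arithmetic is shown below; the function name `seconds_per_iteration` is hypothetical, and the sample text contains only the two ExecutionTime lines quoted in the post.

```python
import re

# Matches the solver log lines of the form "ExecutionTime = 19028.4 s ..."
EXEC_TIME = re.compile(r"ExecutionTime = ([0-9.eE+-]+) s")


def seconds_per_iteration(log_text: str) -> list[float]:
    """Return the CPU-time difference between consecutive ExecutionTime entries."""
    times = [float(match.group(1)) for match in EXEC_TIME.finditer(log_text)]
    return [later - earlier for earlier, later in zip(times, times[1:])]


# The two ExecutionTime lines from the log excerpt (Time = 71 and Time = 72).
sample = """ExecutionTime = 19028.4 s ClockTime = 19082 s
ExecutionTime = 19214.4 s ClockTime = 19268 s"""

if __name__ == "__main__":
    for delta in seconds_per_iteration(sample):
        print(f"{delta:.1f} s per iteration ({delta / 60:.2f} min)")
```

For the two entries shown, the difference is about 186 s, i.e. a little over 3 minutes per iteration. Running the same function over a whole log file would show whether the cost per iteration is steady or drifts over time.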

bastil March 5, 2009 05:22

OK, this problem was caused by unsuitable solver settings.

Regards

josp March 5, 2009 17:17

BastiL, would you mind sharing what you had to change in the solver settings?

Regards

bastil March 5, 2009 17:38

Yes, I had nIterFinestLevel for the preconditioner set to a high value.


All times are GMT -4.