CFD Online Discussion Forums


Javier Larrondo October 28, 2007 23:30

Parallel runs slower with MTU=9000 than MTU=1500
 
Hi,

I have been trying to build a small cluster from two dual-core Pentium D PCs. I've installed SUSE SLES 10, and the NICs are Gigabit Ethernet. After two weeks of struggling with the network configuration, I'm finally able to run some benchmarks.

The problem is that with Jumbo Frames enabled on the NICs (MTU=9000), my test case runs slower than with the standard NIC setting (MTU=1500). I've also noticed that the MTU=9000 run doesn't use as much CPU as the standard run.
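One way to confirm that the MTU=9000 setting is really in effect end to end (both NICs and anything in between) is to send a ping that is not allowed to fragment and is sized to fill exactly one jumbo frame. The sketch below is only an illustration: it assumes Linux with the iputils `ping` (which accepts `-M do`), IPv4 without header options, and the interface and host names are hypothetical placeholders.

```python
import subprocess
from pathlib import Path

IP_HEADER = 20    # bytes, IPv4 header without options (assumption)
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def interface_mtu(iface):
    """Read the configured MTU of a local interface from Linux sysfs."""
    return int(Path(f"/sys/class/net/{iface}/mtu").read_text())


def ping_command(host, mtu, count=3):
    """Build an iputils ping command that forbids fragmentation (-M do)."""
    return ["ping", "-M", "do", "-c", str(count),
            "-s", str(max_ping_payload(mtu)), host]


def jumbo_path_ok(host, mtu=9000):
    """Return True if a full-size, unfragmented ping reaches the host."""
    result = subprocess.run(ping_command(host, mtu),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0
```

For example, `interface_mtu("eth0")` should report 9000 on each node, and `jumbo_path_ok("node2")` should be true from the other node; `eth0` and `node2` are placeholder names, not taken from the setup described here. If the ping fails while a 1472-byte payload succeeds, some part of the path is still limited to MTU=1500.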

Does anyone have experience with this?

Any comments would be helpful. I need to improve this so that I can request extra funding for my research project and build a bigger Beowulf cluster.

---- REFERENCE INFO ---------

Case: 464,000 hex cells, 3D, PBNS, RNG k-epsilon, multiphase mixture model (2 phases), multiple reference frames, unsteady.

MTU=9000 (OPTION) Performance Timer for 40 iterations on 4 compute nodes
Average wall-clock time per iteration: 13.969 sec
Global reductions per iteration: 223 ops
Global reductions time per iteration: 0.000 sec (0.0%)
Message count per iteration: 854 messages
Data transfer per iteration: 30.742 MB
LE solves per iteration: 7 solves
LE wall-clock time per iteration: 5.445 sec
LE global solves per iteration: 2 solves
LE global wall-clock time per iteration: 0.085 sec (0.6%)
AMG cycles per iteration: 8 cycles
Relaxation sweeps per iteration: 316 sweeps
Relaxation exchanges per iteration: 76 exchanges
Time-step updates per iteration: 0.05 updates
Time-step wall-clock time per iteration: 0.015 sec (0.1%)
Total wall-clock time: 558.759 sec
Total CPU time: 1477.740 sec

MTU=1500 (OPTION) Performance Timer for 40 iterations on 4 compute nodes
Average wall-clock time per iteration: 7.700 sec
Global reductions per iteration: 223 ops
Global reductions time per iteration: 0.000 sec (0.0%)
Message count per iteration: 854 messages
Data transfer per iteration: 30.742 MB
LE solves per iteration: 7 solves
LE wall-clock time per iteration: 0.605 sec (7.9%)
LE global solves per iteration: 2 solves
LE global wall-clock time per iteration: 0.003 sec (0.0%)
AMG cycles per iteration: 8 cycles
Relaxation sweeps per iteration: 316 sweeps
Relaxation exchanges per iteration: 76 exchanges
Time-step updates per iteration: 0.05 updates
Time-step wall-clock time per iteration: 0.016 sec (0.2%)
Total wall-clock time: 308.003 sec
Total CPU time: 949.780 sec
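
To make the comparison easier to read, the two reports can be reduced to a few ratios. The sketch below only copies figures from the Performance Timer output above. It assumes that "Total CPU time" is summed over the 4 processes, so dividing it by 4 times the wall-clock time gives a rough average CPU utilisation; that interpretation is an assumption, not something the report states.

```python
# Figures copied from the two Performance Timer reports above.
RUNS = {
    9000: {"wall_per_iter": 13.969, "le_wall_per_iter": 5.445,
           "total_wall": 558.759, "total_cpu": 1477.740},
    1500: {"wall_per_iter": 7.700, "le_wall_per_iter": 0.605,
           "total_wall": 308.003, "total_cpu": 949.780},
}
PROCESSES = 4  # "4 compute nodes" in the report header


def ratio(key, slow=9000, fast=1500):
    """How many times larger a figure is in the slow run than in the fast run."""
    return RUNS[slow][key] / RUNS[fast][key]


def cpu_utilisation(mtu, processes=PROCESSES):
    """Average fraction of wall-clock time each process spent on the CPU
    (assumes Total CPU time is summed over all processes)."""
    run = RUNS[mtu]
    return run["total_cpu"] / (run["total_wall"] * processes)


def le_share_of_slowdown(slow=9000, fast=1500):
    """Fraction of the extra time per iteration that falls in the LE solves."""
    extra_le = RUNS[slow]["le_wall_per_iter"] - RUNS[fast]["le_wall_per_iter"]
    extra_total = RUNS[slow]["wall_per_iter"] - RUNS[fast]["wall_per_iter"]
    return extra_le / extra_total


if __name__ == "__main__":
    print(f"wall-clock per iteration ratio: {ratio('wall_per_iter'):.2f}")
    print(f"LE wall-clock per iteration ratio: {ratio('le_wall_per_iter'):.2f}")
    for mtu in (9000, 1500):
        print(f"MTU={mtu}: CPU utilisation {cpu_utilisation(mtu):.0%}")
    print(f"LE share of the slowdown: {le_share_of_slowdown():.0%}")
```

With these numbers the MTU=9000 run takes roughly 1.8 times as long per iteration, the LE solves take about 9 times as long, and most of the extra time per iteration falls in the LE solves. Message count and data transfer per iteration are identical in both runs.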

Cheers,

Javier

