CFD Online Discussion Forums

George March 25, 2005 07:00

cluster - parallel speedup

I'm looking for experience from anyone who is running CFD computations on a cluster of PCs connected by fast Ethernet. Can you say what the smallest problem size is at which splitting a simulation between nodes still gives a speedup?

On a 64-bit shared-memory computing server, computing speed increases almost linearly as the simulation is split among processors. On a cluster, though, when the problem is relatively small, network traffic soon becomes the bottleneck and prevents further speedup as the simulation is split across more processors.

Do you have experience with this, and can you tell us what the limit is on your system?



Yan XIONG March 25, 2005 12:31

Re: cluster - parallel speedup
The key quantity is the ratio of the time spent communicating to the time spent computing on a single PC. If this ratio is relatively large (e.g. >1), your speedup will be very low. I used to use 6 PC nodes for computing, and my speedup was about 5. In other words, the efficiency was about 5/6, or roughly 83%.
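To make the arithmetic concrete, here is a minimal sketch. It assumes a very simple model that is not from the post itself: each node spends T_1/p computing plus a communication time that is r times its own compute time, so the speedup is p / (1 + r). The function names are hypothetical.

```python
def efficiency(speedup, nodes):
    """Parallel efficiency: achieved speedup divided by the number of nodes."""
    return speedup / nodes


def speedup_from_ratio(nodes, comm_to_comp_ratio):
    """Speedup under the assumed model T_p = (T_1 / p) * (1 + r)."""
    return nodes / (1.0 + comm_to_comp_ratio)


def implied_ratio(speedup, nodes):
    """Invert the same assumed model to get r from a measured speedup."""
    return nodes / speedup - 1.0


# The 6-node case quoted above: speedup of about 5.
print(efficiency(5, 6))            # the 5/6 efficiency
print(implied_ratio(5, 6))         # communication/computation ratio this implies
print(speedup_from_ratio(6, 1.0))  # what a ratio of 1 would give on 6 nodes
```

Under this model a speedup of 5 on 6 nodes corresponds to a ratio of roughly 0.2, and a ratio of 1 would cut the speedup to half the node count.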

Chen Xiaoming March 26, 2005 22:54

Re: cluster - parallel speedup
This depends on what kind of algorithm you are using and what type of platform you are working on. Certainly you have to run the tests yourself. There are several metrics. I suggest you read Chapter 7 of Parallel Programming in C with MPI and OpenMP by Quinn. You can measure the execution speed and communication speed of your machine, then do the calculation.

You should worry more about whether your code scales in terms of processors and memory. You don't really want to run small problems on parallel computers, do you?
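The "measure, then do the calculation" step might look something like the sketch below. It is only an illustration under stated assumptions: a latency-plus-bandwidth cost per message, perfectly divisible compute work, and a message count and size that do not change with the number of nodes. All parameter names and the values in `params` are placeholders, not measurements from any real cluster or figures from Quinn's book.

```python
def step_time(nodes, t_compute_serial, messages_per_step, bytes_per_message,
              latency_s, bandwidth_bytes_per_s):
    """Estimated wall time per time step on `nodes` nodes (assumed model)."""
    compute = t_compute_serial / nodes
    if nodes == 1:
        return compute  # a single node sends nothing over the network
    # Each message costs a fixed latency plus its transfer time.
    per_message = latency_s + bytes_per_message / bandwidth_bytes_per_s
    return compute + messages_per_step * per_message


def speedup(nodes, **measured):
    """Speedup relative to one node under the same assumed model."""
    return step_time(1, **measured) / step_time(nodes, **measured)


# Placeholder inputs; replace them with numbers measured on your own machine.
params = dict(
    t_compute_serial=1.0,         # seconds per step on one node
    messages_per_step=20,         # messages each node sends per step
    bytes_per_message=50_000,     # payload per message
    latency_s=100e-6,             # per-message latency
    bandwidth_bytes_per_s=12.5e6  # 100 Mbit/s nominal
)

for p in (1, 2, 4, 8, 16):
    print(p, round(speedup(p, **params), 2))
```

The node count at which the printed speedup stops growing is a rough estimate of the useful limit for that problem size; a real code should be timed directly to confirm it.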

andy March 29, 2005 11:32

Re: cluster - parallel speedup
For our reasonably typical incompressible LES code (a mixture of explicitly and implicitly solved equations) and a grid size big enough to be useful (32^3 I think) the maximum number of usable PC nodes with fast ethernet was about 4. That is, using more than 4 made the simulation slower. Using 4 nodes gave about a doubling in performance relative to 1 processor. We got almost the same speed on 2 processors.

With gigabit ethernet the maximum number of usable nodes would appear to be around 50-100 for grid sizes of 20^3 - 30^3 on each processor.
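One way to see why the grid size per processor matters is to compare the cells a node must exchange with the cells it computes. The sketch below assumes a cubic n^3 block per processor with a one-cell-deep halo on each of its six faces, which is an illustrative assumption and not a description of the LES code mentioned above.

```python
def halo_to_interior_ratio(n):
    """Ratio of face-halo cells to interior cells for an n^3 block.

    Assumes a one-cell-deep halo on all six faces (edges and corners ignored).
    """
    interior = n ** 3
    halo = 6 * n ** 2
    return halo / interior


for n in (20, 30):
    print(n, halo_to_interior_ratio(n))
```

The ratio is 6/n, so it grows as the block per processor shrinks, and communication takes up a larger share of each step.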

Unless you are performing purely explicit simulations (e.g. particle codes, some compressible codes,...) fast ethernet is no longer a viable interconnect with current processors. It used to be OK for small clusters of Pentium IIIs though.
