CFD Online Discussion Forums - To Jonas: Linux Cluster

Speedup depends strongly on problem size. In my experience you get good speedup with Fluent down to about 50,000 cells per CPU. With more CPUs and fewer cells per CPU, communication overhead starts to become a problem. Scaling also depends on which models you run - things like sliding meshes, discrete phase etc. can degrade scaling. Very large problems often scale a bit worse as well. To summarize: a job with 50,000 cells doesn't parallelize very well, a job with 500,000 cells runs well on up to 10 CPUs, and a job with 5,000,000 cells runs well on up to about 70 CPUs.
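As a rough illustration of the rule of thumb above, here is a minimal Python sketch. The function name and the use of 50,000 cells per CPU as a hard floor are my own assumptions for illustration; real scaling depends on the models, the network and the case.

```python
# Assumption: ~50,000 cells per CPU is the floor for good speedup,
# taken from the rule of thumb quoted above. It is not a hard limit.
MIN_CELLS_PER_CPU = 50_000


def max_useful_cpus(total_cells, min_cells_per_cpu=MIN_CELLS_PER_CPU):
    """Upper bound on the CPU count before each CPU gets fewer
    than min_cells_per_cpu cells (never less than 1 CPU)."""
    return max(1, total_cells // min_cells_per_cpu)


for cells in (50_000, 500_000, 5_000_000):
    print(f"{cells:>9,} cells -> at most {max_useful_cpus(cells)} CPUs")
```

Note that for 5,000,000 cells this simple bound gives 100 CPUs, which is higher than the roughly 70 CPUs quoted above. That gap reflects very large cases scaling somewhat worse, so treat the result as an upper bound only.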

I think these numbers are quite typical for most commercial codes with mature parallelizations. With our in-house code (an explicit structured Runge-Kutta code, which is easy to parallelize) scaling is much better though - a 1 million cell case runs well on 100 CPUs.

Our cluster is 1.5 years old now and has PIII 1 GHz CPUs. With faster CPUs the scaling problem becomes worse - a faster CPU needs more communication bandwidth to keep it busy. We rarely use more than 50 CPUs for one Fluent simulation. A typical Fluent simulation has about 1 million cells and runs on 15-20 CPUs.
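A quick check of what that typical job means in cells per CPU, as a small Python sketch (the helper name is hypothetical, just for illustration):

```python
def cells_per_cpu(total_cells, cpus):
    """Average number of cells each CPU handles, assuming an even partition."""
    return total_cells / cpus


# Typical case from the text: about 1 million cells on 15-20 CPUs.
for cpus in (15, 20):
    print(f"{cpus} CPUs -> about {cells_per_cpu(1_000_000, cpus):,.0f} cells per CPU")
```

So the typical job sits at or a bit above the 50,000 cells per CPU mentioned earlier, assuming the mesh is partitioned evenly.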

The cluster is used by 15 active CFD users in our CFD department, so running a 150 CPU job on your own requires some diplomatic skills ;-) After we switched to Linux clusters we removed all queue systems and load-balancing tools - they were too expensive and created a lot of administrative overhead. With the low cost per CPU for Linux clusters it makes much more sense to simply buy more CPUs when needed, instead of forcing people to wait in a queue system. Everyone here is very happy to avoid the hassle of a queue system - it has worked great for our department and average CPU usage has been very high (>70%). For inter-departmental clusters things might be different though. To avoid diplomatic problems we have bought separate clusters for each department that uses CFD.

About faster networks - I haven't checked prices lately, but I think they are still quite expensive. That a faster network will double the cost per CPU sounds reasonable. We haven't tested any faster networks ourselves. However, I have looked at a few benchmarks from others. Fluent and HP have tested Myrinet. The results can be found on Fluent's web site and are interesting, although a couple of years old by now. Scali (see the sponsor list) has also benchmarked Fluent on a cluster with SCI interconnect.

My impression from this is that a non-standard faster network can only be justified if you want to run very large cases (say 10 million cells or more) or if you for some reason want to run small cases extremely fast (convergence in minutes) - this could be needed for automatic optimization routines or similar. For normal cases, where you don't have more than a few million cells and are happy to have a converged solution in a few hours or at worst overnight, standard 100 Mbit Fast Ethernet is the way to go, I think. I also like the concept of using standard off-the-shelf components as much as possible - it makes administration and future upgrades much easier.