www.cfd-online.com
Home > Forums > Main CFD Forum

To Jonas: Linux Cluster


June 20, 2002, 02:52   #1
To Jonas: Linux Cluster
Mark Render (Guest)
Hi Jonas,

I just saw you in that Fluent newsletter in front of your 150 node cluster.

We have a similar setup here, but with only 3x8 nodes. We didn't increase the number of nodes per cluster because we thought that with a standard Ethernet connection the efficiency in terms of computing time might decrease considerably. What is your experience with speed-up vs. the number of nodes used for one job?

Have you considered switching to a faster interconnect like Gigabit Ethernet or Myrinet? Or have you even tested one?

One year ago, the price for a fast interconnect within the cluster was the same as for the cluster nodes themselves, so we decided to stick with standard 100 Mbit Ethernet. But times may have changed since then.

Regards,

Mark


June 20, 2002, 04:19   #2
Re: To Jonas: Linux Cluster
Jonas Larsson (Guest)
The speedup is very dependent on problem size. In my experience you get good speedup with Fluent down to about 50,000 cells per CPU. With more CPUs and fewer cells per CPU, communication overhead starts to become a problem. Scaling of course depends on which models you run - things like sliding meshes, discrete phase etc. can degrade scaling. Very large problems also often scale a bit worse. To summarize: a job with 50,000 cells doesn't parallelize very well, a job with 500,000 cells runs well on up to 10 CPUs, and a job with 5,000,000 cells runs well on up to about 70 CPUs (scaling is often a bit worse for very large problems).
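The rule of thumb above can be turned into a rough sizing helper. This is just an illustrative sketch - the 50,000 cells/CPU constant and the function name come from the numbers in this post, not from any Fluent documentation:

```python
# Rough cluster-sizing helper based on the rule of thumb above:
# Fluent-style solvers scale well down to roughly 50,000 cells per CPU.
MIN_CELLS_PER_CPU = 50_000  # below this, communication overhead dominates

def max_useful_cpus(total_cells: int) -> int:
    """Largest CPU count that still keeps ~50k cells on each CPU."""
    return max(1, total_cells // MIN_CELLS_PER_CPU)

if __name__ == "__main__":
    for cells in (50_000, 500_000, 5_000_000):
        print(f"{cells:>9,} cells -> up to ~{max_useful_cpus(cells)} CPUs")
```

Note that for very large cases the simple division overestimates: the linear rule gives 100 CPUs for 5 million cells, while in practice scaling flattens out around 70.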

I think that these numbers are quite typical for most commercial codes with mature parallelizations. With our in-house code (an explicit structured Runge-Kutta code, which is easy to parallelize) scaling is much better though - a 1-million-cell case runs well on 100 CPUs.

Our cluster is 1.5 years old now and has PIII 1 GHz CPUs. With faster CPUs the scaling problem becomes worse - a faster CPU needs more communication bandwidth to keep it busy. We rarely use more than 50 CPUs for one Fluent simulation. A typical Fluent simulation has about 1 million cells and runs on 15-20 CPUs.

The cluster is used by 15 active CFD users at our CFD department, so running a 150 CPU job on your own requires some diplomatic skills ;-) After we switched to Linux clusters we removed all queue-system and load-balancing software - it was too expensive and created a lot of administrative overhead. With the low cost per CPU of Linux clusters it makes much more sense to simply buy more CPUs when needed, instead of forcing people to wait in a queue. Everyone here is very happy to avoid the hassle of a queue system - it has worked great for our department, and average CPU usage has been very high (>70%). For inter-departmental clusters things might be different, though. To avoid diplomatic problems we have bought separate clusters for each department that uses CFD.

About faster networks - I haven't checked prices lately, but I think they are still quite expensive. That a faster network would double the cost per CPU sounds reasonable. We haven't tested any faster networks. However, I have looked at a few benchmarks from others. Fluent and HP have tested Myrinet. The results can be found on Fluent's web site and are interesting, although a couple of years old by now. Scali (see the sponsor list) has also benchmarked Fluent on a cluster with an SCI interconnect.

My impression from this is that a non-standard faster network can only be justified if you want to run very large cases (say 10 million cells or more) or if you for some reason want to run small cases extremely fast (convergence in minutes) - this could be needed for automatic optimization routines or similar. For normal cases, where you don't have more than a few million cells and are happy to have a converged solution in a few hours or at worst overnight, standard 100 Mbit Fast Ethernet is the way to go, I think. I also like the concept of using standard off-the-shelf components as much as possible - it makes administration and future upgrades much easier.
