CFD Online Discussion Forums

OpenFOAM Installation: Best cluster configuration

ziad December 14, 2006 18:26

So the title sounds generic enough...

I am in the process of setting up a cluster for running OpenFOAM in parallel and resources are limited (surprise, surprise!), so I would be very grateful for cost-effective ideas and suggestions.

Basically I am looking for the best bang for my buck, and the cluster should be easily upgradable when the need arises. Multicore technology seems very interesting, but how compatible is it with OF 1.3? I vaguely remember a thread discussing problems running/compiling OF on such platforms...

Thanks for taking a shot... in the dark where I am right now! ;)


msrinath80 December 14, 2006 19:33

Invest in dual Opterons; and by that I mean two physical Opteron CPUs on the same board, not dual-cores. Dual cores are memory-bandwidth limited (maybe not for all problems, but the limitation is still there). I am no interconnect expert, so perhaps someone else has better ideas. Even Gigabit might cut it for you once OpenFOAM is released with GAMMA support. I hear that there are significant improvements in latency reduction.

ziad December 14, 2006 22:54

Just checked out GAMMA at Ciaccio's homepage. Seems pretty interesting, especially since it is free and does not require any exotic hardware. I guess I should read up on it in more detail.

Any news on the release date of OF with GAMMA support? Is there a beta version available to play with? I expect to be ready for testing within 8 weeks...

jens_klostermann December 15, 2006 03:39

Hi Ziad,

We have a 10-node 2x Opteron 280 cluster (4 cores per node) with InfiniBand.

Some experiences with AMD Opterons:

- Dual core vs. single core: one dual-core CPU is about 1.5 times as fast as the equivalent single-core CPU, but this depends on the case (memory usage).
- If you check prices, a dual-core system also costs about 1.5 times as much as a single-core system. For the same total number of cores, the dual-core system may be a little cheaper because you need fewer peripheral devices (e.g. mainboards, hard disks, cases, communication cards such as InfiniBand).
- Some more advantages of dual core: it takes less space (which was critical for us) and uses less energy (lower running costs).
- Ethernet vs. InfiniBand: Ethernet scales worse if you use more than 3 nodes (12 cores), but it really depends on the case. I am really interested in what performance increase GAMMA support will bring. For now we would recommend a high-speed interconnect if you buy more than 12 cores.
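
The trade-off in the list above can be put into rough numbers. This is only a sketch using the ratios quoted in the post (1.5x the speed for 1.5x the price) and the 4-cores-per-node layout; the helper names are made up for illustration.

```shell
#!/bin/sh
# Relative performance per unit cost: speedup divided by price ratio.
perf_per_cost() {
    awk -v s="$1" -v p="$2" 'BEGIN { printf "%.2f\n", s / p }'
}

# Nodes needed to reach a target core count, rounded up.
# Arguments: total cores wanted, cores per node.
nodes_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

perf_per_cost 1.5 1.5   # dual-core vs. single-core CPU, ratios from the post
nodes_needed 12 4       # 12 cores on nodes with two dual-core CPUs
nodes_needed 12 2       # 12 cores on nodes with two single-core CPUs
```

On these figures the CPUs themselves come out even, so any saving comes from the node count: half as many mainboards, disks, cases and interconnect cards for the same number of cores.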

I don't know much about the new quad-core Xeon, but going by the Fluent benchmark, the 2x dual-core Opterons are still a little better than the quad-core Xeon.

Some more OpenFOAM benchmarking @

So these are our experiences in a nutshell.


ziad December 15, 2006 13:41

Hi Jens,

Thanks for all the invaluable info. I guess I have the right starting blocks now. Hopefully I will have a chance to test GAMMA and maybe even post the results when available.


mattijs December 15, 2006 14:21

You can already use MPI/GAMMA (the MPI version that uses GAMMA). You'll have to install it and recompile the mpi Pstream:

cd $FOAM_SRC/Pstream/mpi
wmake libso

You will have to change the compilation flags to make it pick up the MPI/GAMMA mpi.h and the mpi libraries.

The compilation flags come from
where XXX depends on the installation (linuxAMD64Gcc4 for me)

- define a new WM_MPLIB in ~/.OpenFOAM-1.3/bashrc
- copy an existing rule file (e.g. mplibLAM) to mplibMPIGAMMA and edit it correspondingly
- build the Pstream/mpi library as above.
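
The rule-file step above could be scripted roughly as follows. This is a sketch, not a tested recipe: the rules directory is not preserved in this thread, so it is passed in as an argument, and `clone_mplib_rule` is a hypothetical helper rather than anything shipped with OpenFOAM. The value `MPIGAMMA` for `WM_MPLIB` is likewise an assumption based on the rule file name.

```shell
#!/bin/sh
# Copy the existing LAM rule file to a new MPI/GAMMA rule file.
# Argument: the wmake rules directory for your platform (path not given here).
clone_mplib_rule() {
    rules_dir="$1"
    src="$rules_dir/mplibLAM"
    dst="$rules_dir/mplibMPIGAMMA"
    if [ ! -f "$src" ]; then
        echo "missing $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "$dst already exists, not overwriting" >&2
        return 1
    fi
    cp "$src" "$dst"
}

# Remaining steps, done by hand:
#   1. edit mplibMPIGAMMA so the flags point at the MPI/GAMMA mpi.h and libraries
#   2. set WM_MPLIB in ~/.OpenFOAM-1.3/bashrc (assumed value: MPIGAMMA), re-source it
#   3. cd $FOAM_SRC/Pstream/mpi && wmake libso
```

The helper refuses to overwrite an existing mplibMPIGAMMA, so an already edited rule file is not lost if the script is run twice.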

ziad December 16, 2006 19:31

Very useful and easier than expected. Thank you very much, Mattijs.

What is the status of pure GAMMA support though? I am not at the point of testing yet but many of my runs will be on relatively small meshes so it might make a difference.


mattijs December 18, 2006 05:23

Pure GAMMA support: being worked on by Giuseppe and me as we speak. We are having start-up troubles :-(

Haven't heard about any problems with MPI/GAMMA though. And going to pure GAMMA gives only a small additional benefit compared to going from e.g. LAM to MPI/GAMMA.

ziad December 18, 2006 18:26

That's good enough for me. I've got enough on my plate for a little while anyway.


mattijs July 26, 2007 04:29

The GAMMA project page has just released a comparison of some public domain MPI implementations with OpenFOAM. Follow the Performance->OpenFOAM-1.4 link.

(GAMMA+OpenFOAM1.4 is now very stable)

I put the aforementioned Pstream building instructions on the Wiki.

grtabor September 20, 2007 07:04

(Reviving a defunct thread here - seemed better than starting a new one with the same title).

I'm sorting out the purchase of a cluster for CFD and OF. I'm looking at getting 6-8 two-processor nodes. The choice now for each processor is single, dual or quad-core. Looking at price and availability, I see one can get a fast single-core (3 GHz) or a dual-core with a slower clock speed (2.3 GHz) for roughly the same cost (that is also what is available). I assume the quad-core processors have lower clock speeds still. My question is: is it better to get a fast single-core processor or a quad-core with a slower clock speed? Would it even be better to get 4 nodes with twin quad-core processors, rather than 8 with single- or dual-core processors? Does anyone have any experience with these new quad-core chips, or any feeling about which way to go?


ziad September 20, 2007 08:53

Hi Gavin,

No experience with the quad-cores, although they look quite interesting as far as space savings are concerned. I am slightly partial to dual-core with a large cache because one can always fall back on a single core with lots of cache, which would typically run faster than a single-core with half the cache (unless of course the single-core is available with the same total cache as the dual-core). So rule of thumb, when comparing equivalent dual- and single-core, look at the cache!

Another issue would be the connectivity between the individual processors. Multiple cores offer true cross-linking between the cores at chip level. This is very efficient since one doesn't have to access the bus (read: bottleneck). See the AMD website; I haven't had time to look at Intel ;)

It might be worthwhile in your case to compare 2-4 quads on one motherboard (Tyan?) to single- or dual-cores spread over several motherboards. That will give you a very good idea of how good your connectivity is between motherboards.

I promised some results back when this thread was started but then life (and management) decided otherwise, so the proposed machine is still "being considered". Hopefully something will give within a couple of months.


msrinath80 September 20, 2007 09:11

"So rule of thumb, when comparing equivalent dual- and single-core, look at the cache!"

I beg to differ. The rule of thumb for CFD calculations should be memory bandwidth per core. Cache is secondary. I've done tests of OpenFOAM scale-up on quad-core and dual-core processors (both Intel and AMD). Check the Running/Solving CFD section for the quad-core and Superlinear speedup threads. Quick summary from my results: AMD still surpasses Intel in the memory bandwidth area.
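
To make the "memory bandwidth per core" figure concrete, here is a minimal sketch. The bandwidth numbers are placeholders for illustration, not measurements from these tests, and `bw_per_core` is a made-up helper name.

```shell
#!/bin/sh
# Memory bandwidth left for each core when every core on a socket is busy.
# Arguments: socket memory bandwidth in GB/s, number of cores sharing it.
bw_per_core() {
    awk -v bw="$1" -v n="$2" 'BEGIN { printf "%.2f\n", bw / n }'
}

# Placeholder figures, NOT measurements: the same 10 GB/s socket
bw_per_core 10 2   # shared by two cores
bw_per_core 10 4   # shared by four cores
```

If the socket bandwidth stays the same while the core count doubles, each core gets half as much, which is why adding cores does not by itself speed up a bandwidth-bound solver.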

gschaider September 20, 2007 10:58

Srinath is right. That is the reason why, when Intel publishes Fluent benchmarks, it shows only one number, while AMD shows different case sizes (AMD is faster for the larger cases, Intel for the smaller ones).

For an example of the current multi-core Xeons (I hear that the next generation should be better), look at the benchmark Eric published in the "OF 1.4 on MacOS X" thread. They're not really an advertisement for Intel multi-cores (at least for CFD).

And one last thing: the Opterons usually do better in floating-point comparisons than in integer ones (don't know if that might be important for CFD ;-) )

ziad September 20, 2007 19:22

I did mention that the cache point is for "when comparing equivalent dual- and single-core". Guess I should have been more specific. Anyway...

connclark September 21, 2007 12:36

I would say that the size of the cache is more important when comparing one dual-core CPU to another dual-core CPU than when comparing to a single-core. The dual-core with the larger cache would likely have less bus contention. However, with CFD/FEA the working memory set is typically so large that a larger cache doesn't get you much more performance.
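
A back-of-the-envelope estimate shows the scale of the mismatch. This is a sketch with a made-up case (the cell count and the number of values stored per cell are assumptions, not taken from this thread), and `working_set_mib` is a hypothetical helper.

```shell
#!/bin/sh
# Rough working-set size in MiB (rounded down):
# cells * double-precision values stored per cell * 8 bytes.
working_set_mib() {
    echo $(( $1 * $2 * 8 / 1048576 ))
}

# Placeholder case: 1,000,000 cells with 20 doubles stored per cell
working_set_mib 1000000 20
```

Even this modest case comes to well over a hundred MiB, far beyond a cache of a few megabytes, so most of the data has to stream from main memory whichever cache size you pick.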

All times are GMT -4.