CFD Online Discussion Forums - Recommended Cluster Configuration

Recommended Cluster Configuration

We have been running CFD on off-the-shelf Linux clusters built from desktop PCs with Ethernet connectivity since 2000. Currently we have more than 700 PCs in our cluster. We have decided to look at switching to a more modern rack-mounted blade cluster with a faster interconnect.

What we are looking at now is a cluster setup using:

Mellanox InfiniBand FDR 56 Gb/s 36-port switches
Dual-CPU blades with two Xeon E5-2690 v3 CPUs and 2*4 = 8 RAM sticks
Perhaps a GPU section in the rack (seems to be very good for our in-house solver, while benefit for commercial codes is questionable)

Does this sound like a good cluster setup? Any other suggestions or things that we must think about?

Our application is in aerospace (compressible CFD). We run an in-house explicit Runge-Kutta RANS/LES solver, CFX, Fluent and soon also Numeca.
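For anyone weighing the 36-port switches listed above, here is a rough sketch of how many switches a given node count would need. It assumes a non-blocking two-level fat tree (leaf and spine), which is an assumption of this sketch and not something stated in the post. The function name `two_level_fat_tree` and the 100-node example are hypothetical. It only counts ports and says nothing about cabling or routing.

```python
import math

SWITCH_PORTS = 36  # port count of the switches named in the post


def two_level_fat_tree(nodes, ports=SWITCH_PORTS):
    """Estimate (leaf, spine) switch counts for a non-blocking two-level fat tree.

    Assumed topology: each leaf switch uses half its ports for nodes
    and the other half for uplinks to spine switches.
    """
    if nodes <= ports:
        return 1, 0                      # a single switch is enough, no spine layer
    down = ports // 2                    # node-facing ports per leaf switch
    max_nodes = down * ports             # largest cluster two levels can serve
    if nodes > max_nodes:
        raise ValueError(
            f"two levels of {ports}-port switches top out at {max_nodes} nodes"
        )
    leaves = math.ceil(nodes / down)
    uplinks = leaves * down              # one uplink per node-facing port
    spines = math.ceil(uplinks / ports)  # spine ports needed to terminate all uplinks
    return leaves, spines


if __name__ == "__main__":
    # Hypothetical node count, purely for illustration.
    leaves, spines = two_level_fat_tree(100)
    print(f"100 nodes: {leaves} leaf switches, {spines} spine switches")
```

Under these assumptions anything up to 36 blades fits on a single switch, and two levels cover at most 18 * 36 = 648 blades. Allowing some oversubscription on the uplinks would reduce the spine count, at the cost of bandwidth between leaf switches.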

Hi Jonas,
Impressive configuration. What are you using as a job scheduler?
What is the size of the jobs you are running, and do they require InfiniBand connectivity?
Your setup is good. What is the size of the RAM sticks?

Small commodity cluster configuration

I wonder how the recommendations would change for smaller clusters, particularly tight-budget systems.

I am looking at building a small cluster-in-a-box for OpenFOAM multiphase work in the ~5-50M cell range. I have access to cheap & intelligent labor in the form of a postdoc, so we should be able to cut capital costs with some tinkering. A commercially available 4-node cluster-in-a-box using the 2-channel Haswells looks promising, but I'm looking for more.

Is anyone aware of where one might find hardware to build such a multiple-motherboard system? If not, I'll gut a 2nd-hand rackmount system. I would also be receptive to feedback on my proposed configuration. It seems almost too simple to be possible.

My current favorite concept is a Warewulf diskless, stateless GigE stack of consumer mATX motherboards, each with an i7-5820K CPU and 16 GB RAM (non-ECC). The cost should be in the range of $1200 per node with CPU trays and cooling. I plan to overclock the RAM, but may even underclock the CPU. Cooling would probably be by custom water loops for compactness, but I haven't ruled out the huge $50 heat-pipe air coolers. I also haven't quite decided how to handle the power switching for the nodes, but hope to use a single power supply with ATX splitters per 4 nodes, and have one board turn off the power. I'll start with 4 nodes and may scale to perhaps 16 nodes with the next (14nm) generation of CPUs. I doubt I'd scale past that because the capital costs would outweigh the benefit for my needs.
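As a back-of-the-envelope check on the numbers above, here is a small sketch that tabulates cost, cells per core, and RAM headroom for the 4-node and 16-node builds across the 5-50M cell range. The per-node cost, RAM, mesh sizes, and node counts come from this post. The 6 cores per node is the i7-5820K's core count. The memory-per-cell figure is only a placeholder assumption and should be replaced with a measurement from a real case. The function name `size_cluster` is made up for this example.

```python
# Figures taken from the post.
COST_PER_NODE_USD = 1200     # quoted estimate, including CPU tray and cooling
RAM_PER_NODE_GB = 16
MESH_SIZES_MCELLS = (5, 50)  # target range, in millions of cells
NODE_COUNTS = (4, 16)        # initial build and possible expansion

# Assumptions, not from the post.
CORES_PER_NODE = 6           # the i7-5820K is a 6-core part
GB_PER_MCELL = 1.0           # placeholder rule of thumb; measure on a real case


def size_cluster(nodes, mesh_mcells):
    """Return cost, cells per core, and RAM estimate for one configuration."""
    cores = nodes * CORES_PER_NODE
    return {
        "nodes": nodes,
        "mesh_mcells": mesh_mcells,
        "cost_usd": nodes * COST_PER_NODE_USD,
        "cells_per_core": mesh_mcells * 1_000_000 // cores,
        "ram_needed_gb": mesh_mcells * GB_PER_MCELL,
        "ram_available_gb": nodes * RAM_PER_NODE_GB,
    }


if __name__ == "__main__":
    for nodes in NODE_COUNTS:
        for mesh in MESH_SIZES_MCELLS:
            r = size_cluster(nodes, mesh)
            fits = r["ram_needed_gb"] <= r["ram_available_gb"]
            print(
                f"{r['nodes']:2d} nodes, {r['mesh_mcells']:2d}M cells: "
                f"${r['cost_usd']}, {r['cells_per_core']} cells/core, "
                f"{'fits' if fits else 'does not fit'} in RAM"
            )
```

This leaves out interconnect cost entirely. With GigE between nodes, the cells-per-core figure matters because smaller partitions spend a larger share of each time step on communication.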

Regardless of what the cluster looks like, it will be a steep initial learning curve for me, because I'm currently using a single 3930K with 16 GB of overclocked RAM. It takes days to get useful results (e.g. many days for 4 minutes of a 2M cell separator in interFoam).

I use a 192-core, 12-node cluster with 40 Gb/s InfiniBand. Compute nodes are 2x Xeon E5-2667 v2 (8 cores, 3.3 GHz, 25 MB cache, turbo enabled); 64 GB DDR3-1866 ECC; SAS HDD

My experience is:
- go for GHz
- get a perfect tune (improved our performance a lot!)

Sorry for my ignorance, but what is a "perfect tune"? Performance tuning of the InfiniBand?