CFD Online Discussion Forums - Unforeseen problems in scaling up a cluster built with desktop parts?


The power consumption is around 130 W per machine, so it's pretty much right at 2 kW total.

I have not run a generic benchmark, but it is about 15% faster per core for our simulations (~50 million unstructured hex cell incompressible isothermal RANS with Star-CCM+ and OpenFOAM) than the supercomputer that we purchased time on. The supercomputer has dual Xeon 5680s per node with QDR InfiniBand. I think we are faster because of better memory bandwidth per core. The Xeons are limited to 1333 MHz RAM and we are running ours at 1600 MHz. I think there are also some inefficiencies in the multi-socket NUMA architecture, as not all memory accesses will be local to a socket.
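A rough way to sanity-check the memory bandwidth argument is to compare theoretical peak DDR3 bandwidth per core. This is only a sketch: the post does not name the desktop chip, so the dual-channel, 4-core configuration below is an assumption, and the Xeon figures (3 channels, 6 cores per socket) should be checked against Intel's spec sheet. Real sustained bandwidth (e.g. as measured by STREAM) will be lower than these peaks.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s for one socket.

    Each DDR3 channel is 64 bits (8 bytes) wide, so the peak is
    channels * 8 bytes * transfers per second.
    """
    return channels * bus_bytes * mega_transfers_per_s * 1e6 / 1e9


def per_core_gbs(channels, mega_transfers_per_s, cores):
    """Peak bandwidth divided evenly across the cores of one socket."""
    return peak_bandwidth_gbs(channels, mega_transfers_per_s) / cores


# ASSUMPTION: desktop part is dual-channel with 4 cores (not stated in the thread).
desktop = per_core_gbs(channels=2, mega_transfers_per_s=1600, cores=4)

# ASSUMPTION: each Xeon socket has 3 memory channels and 6 cores.
xeon = per_core_gbs(channels=3, mega_transfers_per_s=1333, cores=6)

ratio = desktop / xeon

print(f"desktop: {desktop:.2f} GB/s per core")
print(f"xeon:    {xeon:.2f} GB/s per core")
print(f"ratio:   {ratio:.2f}")
```

Under those assumptions the desktop node has roughly 20% more peak bandwidth per core, which is at least in the same ballpark as the ~15% per-core speedup reported above. It says nothing about NUMA effects, which would need an actual measurement.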

I overclocked the chips from 3.4 GHz to 4.0 GHz and only saw about a 3% increase in solver speed. I think we would see some real gains from overclocking the RAM, but the stuff we bought is too low quality to even boot beyond 1600 MHz.
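The overclocking result can be turned into a rough estimate of how much of the runtime actually scales with core clock. The sketch below inverts Amdahl's law, which is a modelling assumption on my part, not something measured in the thread: it treats the runtime as a clock-bound fraction f plus a remainder (memory, network, everything else) that does not speed up at all.

```python
def clock_bound_fraction(clock_ratio, observed_speedup):
    """Solve Amdahl's law for the fraction f that scales with clock.

    S = 1 / ((1 - f) + f / k)   =>   f = (1 - 1/S) / (1 - 1/k)
    where k is the clock ratio and S is the observed speedup.
    """
    return (1.0 - 1.0 / observed_speedup) / (1.0 - 1.0 / clock_ratio)


k = 4.0 / 3.4   # clock ratio from the post (3.4 GHz -> 4.0 GHz)
s = 1.03        # ~3% faster solver, as reported in the post

f = clock_bound_fraction(k, s)

print(f"clock increase: {100 * (k - 1):.1f}%")
print(f"estimated clock-bound fraction of runtime: {f:.2f}")
```

With these numbers, only about a fifth of the runtime behaves as if it were clock-bound, which is consistent with the solver being limited mostly by something other than core speed.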

I'll run HPL.

Quote:

Originally Posted by kyle (Post 308464)

Unless you are doing something funky like saving the entire flow field history at a high time resolution, then your simulation speed is not going to be limited by filesystem I/O.

I would suggest otherwise. Our machine is bottlenecked every time we write a flow solution. With a machine as large as 60+ nodes, you will have to wait for I/O unless you have some very expensive and carefully considered hardware. The good news is that steady-state solutions don't write often; the bad news is that if you run unsteady and want to post-process the time-accurate data, you will likely become pretty frustrated without some pretty quick hardware.

Well yeah, that is exactly the situation I described as when I/O throughput is important...

I would say that this situation does not represent the majority of users. It only applies to people who 1) are running transient simulations, 2) care about the flow-field history, and 3) cannot post-process on the fly. Sure, those three things apply to a lot of users, but probably not most. Certainly not in industry.
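To put rough numbers on when solution writes start to matter, here is a back-of-the-envelope sketch for the ~50 million cell case mentioned earlier. Everything except the cell count is an assumption: six double-precision scalars per cell (three velocity components, pressure, two turbulence quantities), no mesh or metadata overhead, and two illustrative storage throughputs that are not measurements from either poster's machine.

```python
def snapshot_bytes(n_cells, n_scalars, bytes_per_value=8):
    """Raw size of one flow-field snapshot, ignoring mesh and file overhead."""
    return n_cells * n_scalars * bytes_per_value


def write_seconds(n_bytes, throughput_mb_s):
    """Time to write n_bytes at a sustained throughput given in MB/s."""
    return n_bytes / (throughput_mb_s * 1e6)


n_cells = 50_000_000   # from the thread
n_scalars = 6          # ASSUMPTION: Ux, Uy, Uz, p and two turbulence fields

size = snapshot_bytes(n_cells, n_scalars)
print(f"snapshot size: {size / 1e9:.1f} GB")

# ASSUMPTION: hypothetical sustained throughputs, for illustration only.
for label, mb_s in [("slow shared storage, 100 MB/s", 100),
                    ("fast parallel storage, 1000 MB/s", 1000)]:
    print(f"{label}: {write_seconds(size, mb_s):.1f} s per write")
```

For a steady run that writes a handful of times, tens of seconds per write is noise. For a transient run that saves every few time steps, the same cost is paid hundreds or thousands of times, which is the case where the two posts above actually agree that I/O hardware matters.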