CFD Online Discussion Forums

CFD Online Discussion Forums (https://www.cfd-online.com/Forums/)
-   Hardware (https://www.cfd-online.com/Forums/hardware/)
-   -   Specs for a compute and head nodes (https://www.cfd-online.com/Forums/hardware/224715-specs-compute-head-nodes.html)

rhythm March 2, 2020 05:24

Specs for a compute and head nodes
 
Hi folks,

So I have been tasked with upgrading a cluster at the company that is coming to the end of its life.
To give a bit of background on the current setup: the cluster is based on a Beowulf structure with 14 compute workstations and 1 head-node workstation, and these are primarily used to run OpenFOAM simulations (pre-processing and computing, but not post-processing). Because the current configuration is rather cumbersome to manage, the idea is to rebuild everything using the OpenHPC package with stateless compute nodes, which will hopefully make the system easier to manage.

A budget of £10k has been assigned for this, which is obviously not enough for a full replacement, so we have decided to upgrade incrementally over the next couple of years: this year the head node will be replaced and a new compute node will be added (with possibly a few older ones being retired).

Based on the info in the benchmark sticky post, I think we will be going with the following compute node:
  • Tower chassis, 8x 3.5" hotswap drive bays, 1280W Platinum redundant power supplies;
  • X11DSI-NT, dual 10GbE LAN, dedicated IPMI & remote KVM, 16 DIMM slots, onboard graphics;
  • 2x AMD EPYC Rome 7452, 32 cores / 64 threads, 2.35GHz, 128MB cache, 155W;
  • 2x Supermicro 4U active CPU heatsinks for AMD EPYC;
  • 16x 16GB DDR4-2933 ECC registered DIMMs.

Two questions that I have about this configuration:
- Would going with 155W CPUs rather than 180W or 225W models impact performance significantly? Most often the machine will be used to run 10-20 million cell cases.
- Is 256GB too much or too little memory? (I know that there's never too much memory, but DDR4 registered ECC is damn pricey :/ )


Regarding the head node, I wasn't able to find any conclusive advice. Here are the questions I currently have:
- What recommendations would anyone have for the head node configuration?
- How much RAM would one need on it to avoid a bottleneck?
- Any other important things to consider so that the system works with as few bottlenecks as possible?

Thanks,
Ben

flotus1 March 2, 2020 13:31

Quote:

- Would going with 155Watt CPUs rather than 180W or 225W impact the performance significantly? Most often the machine will be used to run 10-20million cell cases.
Performance might differ a bit, but not nearly enough to justify the price increase for the higher TDP models.

Quote:

- Is 256GB too much or too little of memory? (I know that there's never too much memory, but DDR4 registered ECC is damn pricey :/ )
Too much or too little depends on what you need it for. But unless you are running extremely large models, 256GB is a decent amount of memory for a CFD compute node.
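As a rough sanity check on the 256GB question, a common rule of thumb for OpenFOAM is on the order of 1 kB of RAM per cell for simple incompressible solvers (the exact figure varies a lot with the solver and number of fields, so treat this as an estimate, not a measurement):

```python
# Rough RAM sizing for an OpenFOAM case.
# Assumption: ~1 kB of RAM per cell, a common rule of thumb for simple
# incompressible solvers; complex physics can need several times more.
def ram_needed_gb(cells, kb_per_cell=1.0):
    """Estimate RAM in GB for a given cell count."""
    return cells * kb_per_cell * 1024 / 1e9  # kB -> bytes -> GB (decimal)

for cells in (10e6, 20e6):
    print(f"{cells / 1e6:.0f}M cells -> ~{ram_needed_gb(cells):.0f} GB")
```

So even with generous headroom for decomposition overhead and the OS, 10-20 million cell cases sit comfortably within 256GB.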
Btw, 2nd gen Epyc supports up to DDR4-3200. Your configuration only comes with DDR4-2933.
Also, note that 2nd gen Epyc CPUs are already available at much less than original retail prices: https://www.ebay.com/itm/100-0000000...r/233351393565
Also also, compute nodes with Epyc CPUs can be bought relatively cheap here: https://www.provantage.com/hpe-p16693-b21~7HPE962M.htm
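The DDR4-2933 vs DDR4-3200 point matters more than it might look, because CFD solvers are usually memory-bandwidth bound. A quick back-of-the-envelope comparison of theoretical peak bandwidth per socket (8 channels, 8 bytes per 64-bit transfer; sustained numbers will be lower):

```python
# Theoretical peak memory bandwidth per Epyc Rome socket:
# channels * transfer rate (MT/s) * 8 bytes per 64-bit transfer.
def peak_bw_gbs(mt_per_s, channels=8):
    """Peak bandwidth in GB/s for a given DDR4 transfer rate."""
    return channels * mt_per_s * 1e6 * 8 / 1e9

for speed in (2933, 3200):
    print(f"DDR4-{speed}: {peak_bw_gbs(speed):.1f} GB/s per socket")
```

That is roughly a 9% difference in peak bandwidth, often for little or no extra cost, and for bandwidth-bound codes it tends to translate almost directly into solver throughput.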

Quote:

- What recommendations would anyone have for the head node configuration?
Depends entirely on what you want the head node to do. Just manage access and queuing to the compute nodes? Handle a unified storage space for all compute nodes? Pre- and post-processing?
Quote:

- How much RAM one would need to have on it so that there was no bottleneck?
Same here
Quote:

- Any other important things to consider so that the system worked with as little of bottlenecks as possible?
Ditto ;)
Just for managing access and holding a central storage system, you can probably get away with turning one of your older workstations into a head node to save some money, provided you are OK with the system retaining some of its DIY vibe and you can cram enough storage into it.

rhythm March 10, 2020 10:22

First of all - thanks for your reply, it's really helpful to get some new input.

Quote:

Btw, 2nd gen Epyc supports up to DDR4-3200. Your configuration only comes with DDR4-2933.
Great, I'll try to configure it with DDR4-3200.

Quote:

Also, note that 2nd gen Epyc CPUs are already available at much less than original retail prices: https://www.ebay.com/itm/100-0000000...r/233351393565
I've heard rumours that CPUs found on eBay are sometimes engineering samples, and therefore potentially a bit dodgy. I wonder if there is any truth to that? If so, is there a way to verify this before purchasing from marketplaces like eBay?

Nonetheless, the difference in price is large enough to potentially justify the risk. Thanks for that.

Quote:

Also also, compute nodes with Epyc CPUs can be bought relatively cheap here: https://www.provantage.com/hpe-p16693-b21~7HPE962M.htm
Great, thanks!

Quote:

Just manage access and queuing to the compute nodes? Handle a unified storage space for all compute nodes? Pre- and post-processing?
My intent is that the head node will do all of the management tasks:
- Compute node access and job queuing;
- Storage space (probably 4 SATA drives in RAID 1);
- DHCP server to assign IP addresses to the compute nodes;
- Everything related to network booting the compute nodes;
etc...

No Pre/Post-processing or any other computationally intensive tasks will be done on it.
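For the DHCP + network-boot roles, something like a minimal dnsmasq configuration can cover both in a single service. The interface name, address range, and boot-file path below are placeholders for illustration, not a tested config:

```
# /etc/dnsmasq.conf -- DHCP + PXE boot for the compute nodes (sketch)
interface=eth1                    # cluster-facing NIC (placeholder name)
dhcp-range=10.0.0.10,10.0.0.250,12h
dhcp-boot=pxelinux.0              # boot file handed to PXE clients
enable-tftp
tftp-root=/var/lib/tftpboot
```

In practice the OpenHPC/Warewulf stack sets most of this up for you; the sketch is mainly to illustrate what the head node ends up doing for stateless nodes.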

I guess whether a new head node gets bought will depend on how much cash is left after we make our final decision on the compute node configuration. However, I am keen on upgrading as much as I can within this budget, as the cluster is bloody slow at the moment, so I'll take any chance to improve its performance :D

Again, thanks for your reply!

hammersxo August 14, 2020 08:55

Quote:

Originally Posted by flotus1 (Post 760175)
Performance might differ a bit, but not nearly enough to justify the price increase for the higher TDP models. [...]

Thanks for the detailed explanation, flotus.

