CFD Online Discussion Forums

Freewill1 July 5, 2020 02:38

CFD workstation configuration calling for help 2
 
1 Attachment(s)
Hi,
I previously posted a thread here:

CFD workstation configuration calling for help

After reading Alex's constructive suggestions and surveying the options over the past few days, I think my main goal is clearer:
to achieve good overall performance and good scalability for CFD-related jobs, i.e.,
  • OpenFOAM running up to 10~100M FV nodes
  • Algorithm testing and verification for my own CFD code, which I am developing with good parallel performance in mind.

For both jobs, I will run a lot of parallel code to solve large-scale linear systems, and I don't want hardware limitations to get in the way.

Therefore, I am wondering if it is possible to reach my key goals with a well-scaling 64-core, 256 GB RAM machine (or machines).
If so, I don't need to spend much of my budget on the expensive and uneconomical hardware mentioned in the previous thread ($15000~$20000, which costs a lot but is of little use).

So, I think it's rational to pursue good parallelism (scalability).

Given that the per-socket theoretical memory bandwidth (204.8 GB/s for EPYC Rome) acts as the memory wall, it seems that the number of cores per RAM channel is the key to achieving this scalability for CFD.

According to other people's tests (Benchmarking Epyc, Ryzen, and Xeon: Tyranny of Memory), one should not put more than 2~3 cores per channel in a machine.

Attachment 78862

Bearing this in mind, a good hardware configuration for CFD should include as many memory channels as possible for a given budget.

Thus, taking one CPU in one socket as the basic computing unit, and taking EPYC Rome CPUs as an example, the above test suggests that the ideal number of cores per unit, to squeeze out their potential before hitting the 204.8 GB/s memory wall, is:

8 channels x (2~3) cores/channel = 16~24 cores
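
As a rough check of this arithmetic, here is a minimal Python sketch. It assumes DDR4-3200 memory with a 64-bit (8-byte) data bus per channel, which is where the 204.8 GB/s per-socket figure comes from. The function names are made up for illustration, and the numbers are theoretical peaks, not measured bandwidth.

```python
# Theoretical per-socket memory bandwidth and the core-count window it implies.
# Assumptions: DDR4-3200, 8 channels per socket, 8 bytes per transfer per channel.

CHANNELS_PER_SOCKET = 8
TRANSFERS_PER_SECOND = 3200e6   # DDR4-3200 means 3200 mega-transfers per second
BYTES_PER_TRANSFER = 8          # 64-bit data bus per channel


def peak_bandwidth_gbs(channels=CHANNELS_PER_SOCKET):
    """Theoretical peak bandwidth per socket in GB/s (1 GB = 1e9 bytes)."""
    return channels * TRANSFERS_PER_SECOND * BYTES_PER_TRANSFER / 1e9


def core_window(channels=CHANNELS_PER_SOCKET, per_channel=(2, 3)):
    """Cores per socket for the given (low, high) cores-per-channel range."""
    low, high = per_channel
    return channels * low, channels * high


def bandwidth_per_core_gbs(cores, channels=CHANNELS_PER_SOCKET):
    """Peak bandwidth share of one core if all cores stream memory at once."""
    return peak_bandwidth_gbs(channels) / cores


if __name__ == "__main__":
    print("peak GB/s per socket:", peak_bandwidth_gbs())
    print("cores per socket at 2~3 cores/channel:", core_window())
    for cores in (8, 16, 24, 32):
        share = bandwidth_per_core_gbs(cores)
        print(f"{cores:2d} cores -> {share:.1f} GB/s per core")
```

The per-core share drops from 12.8 GB/s at 16 cores to 6.4 GB/s at 32 cores, which is the trade-off the 2~3 cores/channel rule of thumb is trying to capture.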

If so, computing units with CPUs of more than 24 cores are not ideal for the desired scalability, and one has to pay much more for useless extra cores (e.g., the 32-core EPYC 7452).

On the other hand, computing units with CPUs of fewer than 16 cores (e.g., the 8-core EPYC 7262) can also be ruled out because of their low density (only 8 cores occupying a socket on the motherboard).

Besides, an extra benefit of low-core-count CPUs is that they are cheaper (both in price and price/core). For example, for each dual-socket node, the prices of key components in my country are:
-CPU:
  • EPYC 7262 ( 8c/3.20GHz)x2: $ 420x2 = $840
  • EPYC 7302 (16c/3.00GHz)x2: $ 970x2 = $1,840
  • EPYC 7402 (24c/2.80GHz)x2: $1,310x2 = $2,620
  • EPYC 7452 (32c/2.35GHz)x2: $1,770x2 = $3,540
  • EPYC 7371 (32c/3.60GHz)x2: $ 950x2 = $1,900 (with 3.6GHz all core turbo capability, and cheaper)
-RAM:
  • 3200MHz, RECC, 16GBx16 = 256GB (for one dual-socket node): ~$170x16 = $2,700
  • 3200MHz, RECC, 8GBx32 = 256GB (for two dual-socket nodes): ~$105x32 = $3,360

-Motherboard: Supermicro H11DSi Rev2.0: ~$560

There are several options to build a ≥64-core machine or machines:
  • option 1: one-node configuration (4 cores/channel): EPYC 7452 (32 cores each)x2 CPUx1 node + 16GBx16 RAM + H11DSix1: ~$6800
  • option 2: one-node configuration (4 cores/channel): EPYC 7371 (32 cores each)x2 CPUx1 node + 16GBx16 RAM + H11DSix1: ~$5200
  • option 3: two-node configuration (3 cores/channel): EPYC 7402 (24 cores each)x2 CPUx2 node + 8GBx16x2 RAM + H11DSix2: ~$9800 (NOTE: 96 cores in total)
  • option 4: two-node configuration (2 cores/channel): EPYC 7302 (16 cores each)x2 CPUx2 node + 8GBx16x2 RAM + H11DSix2: ~$9600
  • option 5: four-node configuration (1 core/channel): EPYC 7262 (8 cores each)x2 CPUx4 node + 8GBx16x4 RAM + H11DSix4: ~$12500 (NOTE: I am a little reluctant to put in 512GB just to fill all 64 memory slots and feed all 8 channels of each EPYC CPU)
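
To compare the options side by side, here is a small Python sketch that recomputes total cores, cores per memory channel, and component cost from the unit prices listed above. It counts only CPU, RAM, and motherboard, assumes 16 DIMMs per node, and takes core counts exactly as listed in this post, so the sums are a sanity check and may not match the rounded totals quoted for each option. The names `OPTIONS` and `summarize` are made up for illustration.

```python
# Cores per memory channel and component cost for the five options.
# Unit prices (USD) and core counts are taken as listed in the post.
# Only CPU, RAM and motherboard are counted; everything else is ignored.

CHANNELS_PER_SOCKET = 8
SOCKETS_PER_NODE = 2
DIMMS_PER_NODE = 16
BOARD_PRICE = 560  # Supermicro H11DSi, one per node

# name: (CPU model, cores per CPU, CPU unit price, nodes, DIMM unit price)
OPTIONS = {
    "option 1": ("EPYC 7452", 32, 1770, 1, 170),
    "option 2": ("EPYC 7371", 32, 950, 1, 170),
    "option 3": ("EPYC 7402", 24, 1310, 2, 105),
    "option 4": ("EPYC 7302", 16, 970, 2, 105),
    "option 5": ("EPYC 7262", 8, 420, 4, 105),
}


def summarize(cores_per_cpu, cpu_price, nodes, dimm_price):
    """Return (total cores, cores per channel, component cost in USD)."""
    total_cores = cores_per_cpu * SOCKETS_PER_NODE * nodes
    cores_per_channel = cores_per_cpu / CHANNELS_PER_SOCKET
    node_cost = (SOCKETS_PER_NODE * cpu_price
                 + DIMMS_PER_NODE * dimm_price
                 + BOARD_PRICE)
    return total_cores, cores_per_channel, nodes * node_cost


if __name__ == "__main__":
    for name, (model, cores, cpu_price, nodes, dimm_price) in OPTIONS.items():
        total, per_channel, cost = summarize(cores, cpu_price, nodes, dimm_price)
        print(f"{name}: {model}, {total} cores, "
              f"{per_channel:g} cores/channel, ${cost}, ${cost / total:.0f}/core")
```

Dividing cost by total cores gives a quick price/core figure for each option, but it says nothing about how well the extra cores scale, which is the actual question here.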

Questions:
- What configuration can achieve the best scalability, putting aside the cost?
- In view of the rule of thumb that favors no more than 2~3 cores/channel in a machine (or machines), should I stick to option 3, 4, or even 5? Again, my concern is the slow node-to-node interconnect and the potentially cumbersome hardware/software setup.
- Although options 1 and 2 are not good for scalability, their advantages seem obvious:
  • avoids the node-to-node interconnect (10Gbps InfiniBand stuff, several OS installs... sounds tedious and repetitive for a two/four-node mini cluster)
  • needs only one case instead of two (also tedious and repetitive otherwise)
  • cheaper as a whole (although budget is not my main concern)
Any suggestions?

flotus1 July 5, 2020 15:04

I feel like you went into this with the wrong optimization goal.
Scaling is nice and all to highlight the importance of machine balance. But what you probably want is maximum absolute performance.

Quote:

- What configuration can achieve the best scalability, putting aside the cost?
You are not paying for per-core licenses with OpenFOAM or your own code. So you are not optimizing for maximum performance per core, but for maximum overall performance.
This means that you can safely ignore any Epyc CPU with fewer than 24 cores.
The cost of a compute node is not only CPU+board+RAM. Once you factor in the rest of the components, a large number of low core count CPUs becomes less cost-efficient.
BTW: Epyc 1st gen is a no-go with your budget and requirements. Especially the 7371 (16 cores, not 32), which features quite an inflated retail price.

It's either the 7352 (24C) or 7452 (32C). The former if you want to save budget in order to afford more than one machine. The latter if you want to max out the performance of a single machine.
Lower core counts are for people with very strict budget limits, or with very high per-core license costs.

Quote:

In view of the rule of thumb that favors no more than 2~3 cores/channel in a machine (or machines), should I stick to option 3, 4, or even 5? Again, my concern is the slow node-to-node interconnect and the potentially cumbersome hardware/software setup.
I had hoped that I could clear up the node-interconnect issue. Or non-issue to be more precise.
Hardware and software setup is something you will have to decide for yourself, based on your level of expertise and willingness to spend time on the compute setup, instead of doing something productive.
There is no value in having twice the compute performance, if you have to spend most of your time learning how to set it up, instead of working on your code or actually doing simulations.

Freewill1 July 7, 2020 04:29

Thanks, Alex.

I am now clearer about what I need: I am looking for a 64-core machine (or machines) with good scalability for CFD use.

Taking overall performance, CPU-memory balance, and cost-efficiency into consideration, perhaps a dual-socket machine is the one for me.

This is my hardware list with queried prices:
  • CPU: EPYC 7532 x 2 (32x2 cores, each with 256MB L3 cache, double that of the 7452 etc.): $1550x2 = $3100
  • RAM: 3200MHz, 2R reg ECC, 32GBx16 = 512GB: $183x16 = $2930
  • Motherboard: Supermicro H11DSi dual-socket: ~$563x1 = $563
  • SSD: Samsung 970 EVO Plus 1TB: $280x1 = $280
Total price: ~$6870
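
For reference, the balance of this build can be worked out with a few lines of Python. This is only a sketch using the quoted prices and the theoretical 204.8 GB/s per-socket peak mentioned earlier in the thread; `build_summary` is a made-up helper name, and real sustained bandwidth will be lower than the peak.

```python
# Balance and cost figures for the dual-socket EPYC 7532 build listed above.
# Prices are the quoted ones; 204.8 GB/s is the theoretical per-socket peak.

SOCKETS = 2
CORES_PER_SOCKET = 32
CHANNELS_PER_SOCKET = 8
RAM_GB = 32 * 16               # sixteen 32GB DIMMs
PEAK_GBS_PER_SOCKET = 204.8

PRICES_USD = {
    "cpu": 1550 * 2,
    "ram": 183 * 16,
    "motherboard": 563,
    "ssd": 280,
}


def build_summary():
    """Return per-core balance figures and cost for the whole machine."""
    cores = SOCKETS * CORES_PER_SOCKET
    total = sum(PRICES_USD.values())
    return {
        "cores": cores,
        "cores_per_channel": CORES_PER_SOCKET / CHANNELS_PER_SOCKET,
        "ram_gb_per_core": RAM_GB / cores,
        "peak_gbs_per_core": SOCKETS * PEAK_GBS_PER_SOCKET / cores,
        "total_usd": total,
        "usd_per_core": total / cores,
    }


if __name__ == "__main__":
    for key, value in build_summary().items():
        print(f"{key}: {value}")
```

This works out to 4 cores per channel, 8 GB of RAM per core, and a peak of 6.4 GB/s per core, so the build trades the 2~3 cores/channel rule of thumb for a single machine.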

flotus1 July 7, 2020 05:16

The Epyc 7532 is definitely the fastest 32-core CPU you can get for your application. I usually don't recommend it, because AMD's 1k unit price is 3350$, and there is barely any retail availability. That's just too much of a premium for a slight performance increase over the cheapest 32-core Rome CPU.
If you can get one for only 1550$, it is a no-brainer.

Edit: as always, make sure to get a revision 2 board. Getting these CPUs to run on rev. 1 would require some hacks. I only read from other people how it could be done, never went there myself.

Edit2: Probably many people, but definitely myself, would be very interested to know where you found these CPUs for such a low price. Please share

Freewill1 July 7, 2020 23:07

Quote:

Originally Posted by flotus1 (Post 777063)
Edit2: Probably many people, but definitely myself, would be very interested to know where you found these CPUs for such a low price. Please share

I checked the prices on Taobao (the Chinese equivalent of eBay) with three sellers, and they replied that the Epyc 7532 is available for ~$1550 - $2110:

NOTE: the prices shown at the URLs are not the actual prices. I inquired about the prices via customer service, who said they can provide the CPU if needed, at least in China.

flotus1 July 8, 2020 03:59

Something seems fishy about these.
The first one, for example, has an image of AMD's spec sheet that shows the specifications of a different CPU.
And they claim these CPUs are new. Now I have seen plenty of retail/OEM Epyc Rome CPUs way below their initial price already. But those were used CPUs.
If it's too good to be true, it probably is. Why would they sell AMD's most expensive 32-core CPU at a lower price than AMD's cheapest 32-core CPU?
Not saying you should not do it, it's your money and your decision whether you want to trust these sellers. But I certainly would not buy that.

Freewill1 July 8, 2020 21:17

Yes, I will be careful about these sources of supply. They usually put irrelevant pictures in their online stores.

