
Xeon Gold workstation config.


Old   February 6, 2018, 06:19
Default Xeon Gold workstation config.
  #1
Senior Member
 
Blanco
Join Date: Mar 2009
Location: Torino, Italy
Posts: 193
Rep Power: 17
Blanco is on a distinguished road
Hello everyone,

I'm looking for some suggestions for a new CFD-3D workstation. I must use Xeon Gold processors, therefore there's no room for AMD Epyc here... Anyway, the workstation will be used for CFD-3D; consider an unlimited number of paid licenses available. The cell count will vary between 1e6 and 10e6, and will rarely reach 40e6. Simulations, however, will involve combustion with detailed chemistry (quite heavy, hundreds of species), RANS but also LES/DDES in the near future. We also simulate sprays, moving meshes, etc. Now, as far as the CPU is concerned, I'm considering the following alternatives:

Opt 1: 2x Intel Xeon 6136 3.0 2666MHz 12C -- reference cost -- 24 cores
Opt 2: 2x Intel Xeon 6148 2.4 2666MHz 20C -- cost +16% -- 40 cores
Opt 3: 2x Intel Xeon 6154 3.0 2666MHz 18C -- cost +29% -- 36 cores


All these share the same L3 cache, 24.75 MB. Now, Opt2 has the highest number of cores, but its base frequency is the lowest among the three, and I suppose this is the reason for the lower cost compared to Opt3. Opt3 has 50% more cores than Opt1 and costs 29% more... it seems reasonable to me. Therefore I would go with Opt3, with the aim of getting the higher core count. Do you have any suggestions?

There could be another option actually:

Opt4: 2x Intel Xeon 6152 2.1 2666MHz 22C -- cost +31% -- 44 cores

The L3 in this case is larger, 30 MB, but the base frequency is very low, therefore I would not consider this option. Am I wrong?
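Just to compare the raw numbers, something like the following can be used as a first rough screening (a minimal sketch in Python; it only uses the relative costs and base clocks listed above, per CPU, and ignores turbo behaviour, cache and memory bandwidth):

Code:
# Quick comparison of the options above: aggregate base clock per CPU
# (cores x base frequency) and relative cost per core. Costs are the
# relative figures quoted in this post (Opt1 = 1.00).
options = {
    "Opt1: Xeon Gold 6136": {"cores": 12, "ghz": 3.0, "rel_cost": 1.00},
    "Opt2: Xeon Gold 6148": {"cores": 20, "ghz": 2.4, "rel_cost": 1.16},
    "Opt3: Xeon Gold 6154": {"cores": 18, "ghz": 3.0, "rel_cost": 1.29},
    "Opt4: Xeon Gold 6152": {"cores": 22, "ghz": 2.1, "rel_cost": 1.31},
}

for name, o in options.items():
    aggregate_ghz = o["cores"] * o["ghz"]       # "theoretical" aggregate clock per CPU
    cost_per_core = o["rel_cost"] / o["cores"]  # relative cost per core
    print(f"{name}: {aggregate_ghz:5.1f} GHz aggregate, "
          f"{cost_per_core:.4f} relative cost/core")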

Other spec for wk config:
- Linux OS (SUSE/Ubuntu LTS)
- 96 GB (12x 8 GB) DDR4 2666 MHz ECC Reg RAM (dual-rank DIMMs if possible)
- NVIDIA Quadro P1000 4GB
- 10GbE LAN
- 2x 1TB 7200 RPM SATA Enterprise or SAS (15k possibly)

OT: has anyone ever used an SSD for CFD-3D? If yes, M.2 or PCIe? Did you see any clear advantage?

Do you know any supplier of 4-socket workstation? If yes please PM.

Thanks!

Regards

Last edited by Blanco; February 6, 2018 at 10:36.
Blanco is offline   Reply With Quote

Old   February 6, 2018, 08:50
Default
  #2
Member
 
Knut Erik T. Giljarhus
Join Date: Mar 2009
Location: Norway
Posts: 35
Rep Power: 22
eric will become famous soon enough
If you see my post here,
OpenFOAM benchmarks on various hardware
there are some benchmarks of the 6148 processor. As you will see, scaling is poor past 16 cores. So even though you didn't mention it, I would say the 6130 processor is the best compromise among the Intel processors.
eric is offline   Reply With Quote

Old   February 6, 2018, 09:40
Default
  #3
Senior Member
 
Blanco
Join Date: Mar 2009
Location: Torino, Italy
Posts: 193
Rep Power: 17
Blanco is on a distinguished road
Quote:
Originally Posted by eric View Post
If you see my post here,
OpenFOAM benchmarks on various hardware
there are some benchmarks of the 6148 processor. As you will see, scaling is poor past 16 cores. So even though you didn't mention it, I would say the 6130 processor is the best compromise among the Intel processors.
Hi Erik,

thanks for the post, I've just read your benchmark results. Maybe I'm getting something wrong, but the parallel efficiency seems poor even when not all the cores were used (<20), which seems quite strange in my experience. Did you have all six memory channels populated? What were the other workstation specs?

Thanks!
Blanco is offline   Reply With Quote

Old   February 6, 2018, 10:16
Default
  #4
Senior Member
 
Blanco
Join Date: Mar 2009
Location: Torino, Italy
Posts: 193
Rep Power: 17
Blanco is on a distinguished road
Sorry, yours are single-machine simulation times, that's why the parallel efficiency is well below 1 when increasing the number of cores above 1, right?

I would add:

- The Gold 6130 has 16C but a base frequency of 2.1 GHz, which seems quite low to me; its L3 per core is aligned with the options above. Its theoretical aggregate clock is 33.6 GHz, which is the lowest among the options above (36/48/54/46).

- Could 2x Gold 6144 3.5 GHz 8C or 2x Gold 6146 3.2 GHz 12C be reasonable alternatives, even with a limited number of cores? Their theoretical aggregate clocks are however low: 28 and 38.4 GHz.

I'm still in doubt whether a higher number of cores is the best choice or not, considering that I will "always" run using all the cores available.

Last edited by Blanco; February 6, 2018 at 11:56.
Blanco is offline   Reply With Quote

Old   February 7, 2018, 06:40
Default
  #5
Member
 
Knut Erik T. Giljarhus
Join Date: Mar 2009
Location: Norway
Posts: 35
Rep Power: 22
eric will become famous soon enough
It's a single-socket machine, yes. The memory is 6 x 16 GB 2666 MHz. But as I also showed in the other thread, even on a dual socket machine you would not see a parallel efficiency of 1 due to the memory being a bottleneck.

The Gold 6142 is the same as the 6130 only with a higher base frequency. The question I guess is whether a 50% increase in price is worth it. The same for the 6144/6146, I do not think the increase in frequency (and price!) is worth it as long as you are able to utilize all the cores on the 6130. The cache difference is minimal. It would be nice to have some more benchmarks, though.
eric is offline   Reply With Quote

Old   February 8, 2018, 10:06
Default
  #6
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura aboutflotus1 has a spectacular aura about
Puget Systems put together an article that tried to make some sense of the mess that is Intel's Skylake-SP lineup: https://www.pugetsystems.com/labs/hp...rs-Guide-1077/
They somehow managed to get non-AVX all-core turbo frequencies, which are much more relevant than base clock or single-core turbo. Towards the end, they recommend a handful of processors for memory-bound applications with larger cache per core. These are the ones to pick for maximum performance, at least when paying for licenses. Performance per dollar is a different topic entirely. I would not recommend CPUs with more than ~16 cores even if you are not on a per-core licensing scheme.

In terms of storage, I think the time for 15k HDDs is up for most applications. Expensive, loud, and still bad for smaller chunks of data. Get an SSD instead. SATA/SAS or PCIe depends on your budget; anything is faster than spinning disks. HDDs are for long-term storage of larger data after you have finished running and post-processing your results. Here 7200 rpm or even 5400 rpm drives are good enough.
Blanco and Tobi like this.
flotus1 is offline   Reply With Quote

Old   February 8, 2018, 11:39
Default
  #7
Senior Member
 
Blanco
Join Date: Mar 2009
Location: Torino, Italy
Posts: 193
Rep Power: 17
Blanco is on a distinguished road
Quote:
Originally Posted by flotus1 View Post
Puget Systems put together an article that tried to make some sense of the mess that is Intel's Skylake-SP lineup: https://www.pugetsystems.com/labs/hp...rs-Guide-1077/
They somehow managed to get non-AVX all-core turbo frequencies, which are much more relevant than base clock or single-core turbo. Towards the end, they recommend a handful of processors for memory-bound applications with larger cache per core. These are the ones to pick for maximum performance, at least when paying for licenses. Performance per dollar is a different topic entirely. I would not recommend CPUs with more than ~16 cores even if you are not on a per-core licensing scheme.
Thanks a lot for the link, I found it very useful!
I've found the source for the frequencies used in the article you linked, they come from Intel: https://www.intel.com/content/www/us...ec-update.html

I think my CFD-3D app uses AVX-512: it's a commercial CFD-3D SW and I've found a benchmark where AVX-512 was explicitly cited as "useful" for simulations performed with this SW. Considering this, if I try to create a "cost/performance" rank of the processors, computing the "performance index" as in the linked article, I get:

- 1st place: 6140 18C -- performance index w/ AVX512 604
- 2nd place: 6148 20C -- performance index w/ AVX512 704
- 3rd place: 6154 18C -- performance index w/ AVX512 777

All others Gold processors I've checked (range 6128-6152) have higher cost/performance ratio and lower performance index. Among all the processors I checked, the 6154 is showing the highest performance index w/ AVX512.

If however I consider NON-AVX performance, I get

- 1st place: 6140 18C -- NON-AVX performance index 864
- 2nd place: 6130 16C -- NON-AVX performance index 716
- 3rd place: 6148 20C -- NON-AVX performance index 992

and the 6154 is in 4th place, but it still has the 2nd highest NON-AVX performance index (1065).

I'll look more closely at the SW I'm using to confirm it actually uses AVX-512, but in any case it seems to me that the 6140 is the best solution in terms of cost/performance, while the 6154 is probably the best "high performance" solution. I think I'll go with the 6154 if not bound by the total cost. This processor is also suggested in the article for both NON-AVX and AVX-512 workloads.
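For reference, the calculation behind these rankings looks roughly like the sketch below (performance index = cores x all-core turbo, cost/performance = price / index). The turbo frequencies and prices in it are placeholders, not the actual values from the Puget article or the Intel spec update, so substitute the real figures before drawing conclusions:

Code:
# Sketch of the "performance index" ranking (Python).
# The all-core turbo frequencies and prices below are PLACEHOLDERS;
# replace them with the values from the Puget article / Intel spec update.
cpus = {
    # name:      (cores, all-core AVX-512 turbo in GHz, price in USD)
    "Gold 6140": (18, 2.1, 2400.0),
    "Gold 6148": (20, 2.2, 3100.0),
    "Gold 6154": (18, 2.7, 3550.0),
}

ranking = []
for name, (cores, turbo_ghz, price) in cpus.items():
    index = cores * turbo_ghz                  # aggregate all-core clock
    ranking.append((price / index, name, index))

# Sort by cost per unit of "performance", cheapest first
for cost_per_perf, name, index in sorted(ranking):
    print(f"{name}: index {index:.1f}, cost/performance {cost_per_perf:.1f}")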

As a side note from the article: "The 5122 stands out as the processor with the highest AVX-512 All-Core-Turbo. It is just 4 cores but they are all running full speed. It also has the largest per core Cache. This could be a good processor for memory bound programs and/or programs that don't have good parallel scaling but do take advantage of AVX512 vectorization." -> this could be a good solution for me if considering more than 1 workstation, but at the moment I'm looking for a single workstation.

Quote:
In terms of storage, I think the time for 15k HDDs is up for most applications. Expensive, loud, and still bad for smaller chunks of data. Get an SSD instead. SATA/SAS or PCIe depends on your budget; anything is faster than spinning disks. HDDs are for long-term storage of larger data after you have finished running and post-processing your results. Here 7200 rpm or even 5400 rpm drives are good enough.
Thanks a lot, yep I'll go with an SSD for running and enterprise SATA for storage.
Blanco is offline   Reply With Quote

Old   February 8, 2018, 12:18
Default
  #8
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura aboutflotus1 has a spectacular aura about
I cannot overemphasize how useless such a "performance index", aka aggregate CPU frequency, is, especially when it comes to CFD. With high-core-count processors like these, you will hit the wall called the memory bandwidth bottleneck long before their theoretical compute performance becomes relevant, rendering all high-core-count processors more or less equal in terms of performance.
As a more realistic performance estimate I would recommend a model based on Amdahl's law with a parallel efficiency p of ~97%. It does not model the actual cause of sub-linear speedup in CFD, but it fits the results pretty well: high-core-count processors are a waste of money.
The same applies to AVX and its newer variants. Ansys/Intel also mentioned AVX-512 in one of their marketing presentations surrounding Skylake-SP. But as a matter of fact, most of the performance increase they found is caused by the increase in memory performance. AVX doesn't help much in memory-bandwidth-bound applications. But in the end it doesn't really matter: all the processors you are currently looking at have the same AVX capabilities enabled, so if you should encounter a workload that actually benefits from AVX, they are up to the task.
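To put a rough number on that bandwidth wall (a back-of-the-envelope figure, assuming all six DDR4-2666 channels per socket are populated): the theoretical peak is about 6 x 2666 MT/s x 8 bytes = 128 GB/s per socket. Shared across 20 cores that is only ~6.4 GB/s per core, and it keeps shrinking as the core count grows, which is why the extra cores stop paying off.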
flotus1 is offline   Reply With Quote

Old   February 8, 2018, 13:45
Default
  #9
Senior Member
 
Blanco
Join Date: Mar 2009
Location: Torino, Italy
Posts: 193
Rep Power: 17
Blanco is on a distinguished road
Quote:
Originally Posted by flotus1 View Post
I cannot overemphasize how useless such a "performance index", aka aggregate CPU frequency, is, especially when it comes to CFD. With high-core-count processors like these, you will hit the wall called the memory bandwidth bottleneck long before their theoretical compute performance becomes relevant, rendering all high-core-count processors more or less equal in terms of performance.
As a more realistic performance estimate I would recommend a model based on Amdahl's law with a parallel efficiency p of ~97%. It does not model the actual cause of sub-linear speedup in CFD, but it fits the results pretty well: high-core-count processors are a waste of money.
The same applies to AVX and its newer variants. Ansys/Intel also mentioned AVX-512 in one of their marketing presentations surrounding Skylake-SP. But as a matter of fact, most of the performance increase they found is caused by the increase in memory performance. AVX doesn't help much in memory-bandwidth-bound applications. But in the end it doesn't really matter: all the processors you are currently looking at have the same AVX capabilities enabled, so if you should encounter a workload that actually benefits from AVX, they are up to the task.
I see your point, yes, the formula used in the article does not consider memory bandwidth. Would it be possible to account for the memory bandwidth by scaling the real core count down to an "effective" core count? I mean

Effective cores = 1 / ((1 - eff) + eff/real_core_n), where eff is the parallel efficiency and real_core_n the physical core count.

Then we could compute the performance_index by considering this effective number of cores.
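For example, taking flotus1's parallel efficiency of 0.97: an 18-core CPU gives 1/((1 - 0.97) + 0.97/18) ≈ 11.9 effective cores, while a 12-core CPU gives 1/(0.03 + 0.97/12) ≈ 9.0, so in this model 6 additional physical cores only buy about 3 "effective" ones.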
I've tried this, and the best CPU for the cost/performance ratio (AVX and NON-AVX) from this analysis is the 6126 12C, if I did things correctly.
From the performance perspective, however, the 6154 18C is still the best solution, and it sits in 5th position in the cost/performance rank...
Maybe I'll post the table if I have time to arrange it properly
Blanco is offline   Reply With Quote

Old   February 8, 2018, 15:19
Default
  #10
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura aboutflotus1 has a spectacular aura about
As I said, Amdahl's law with a scaling of 97% models this behavior pretty well, despite the fact that it was invented to model parallel sections of a code. For some cases it might be 98%; in most cases it is lower.

In case you are struggling with Amdahl's law: the formula to calculate an estimated performance index P_N with N cores is
P_N = \frac{f_N}{(1-p)+\frac{p}{N}}
Here f_N is the CPU frequency while running with N cores and p is the parallel efficiency, typically between 0.9 and 0.98.
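A minimal sketch of this estimate in Python (the core counts and all-core frequencies below are only illustrative placeholders; plug in the all-core turbo values from the Puget article):

Code:
# Amdahl-style performance estimate: P_N = f_N / ((1 - p) + p / N)
# p   : assumed parallel efficiency (here 0.97)
# f_n : frequency in GHz with all n_cores loaded - illustrative placeholders
def perf_estimate(f_n, n_cores, p=0.97):
    return f_n / ((1.0 - p) + p / n_cores)

candidates = {
    "12 cores @ 3.0 GHz": (12, 3.0),
    "16 cores @ 2.8 GHz": (16, 2.8),
    "20 cores @ 2.6 GHz": (20, 2.6),
}

for name, (n, f) in candidates.items():
    print(f"{name}: estimated index {perf_estimate(f, n):.1f}")

With p = 0.97 these come out around 27, 31 and 33 respectively, i.e. 67% more cores buy only ~22% more estimated performance.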

Last edited by flotus1; February 9, 2018 at 10:04.
flotus1 is offline   Reply With Quote

Old   February 13, 2018, 04:34
Default
  #11
Senior Member
 
Blanco
Join Date: Mar 2009
Location: Torino, Italy
Posts: 193
Rep Power: 17
Blanco is on a distinguished road
I finally found the time to collect the info I gathered in this topic. Here are the summary tables I created using the CPU costs from the Intel website (my supplier quotes slightly different costs, but the resulting rank is the same).

In the attachment you will find:
- Performance rank

performance.jpg

- Cost over performance rank

cost_ov_perf.jpg

- Performance over core rank

perf_ov_core.jpg
Gweher and Crowdion like this.
Blanco is offline   Reply With Quote

Old   February 13, 2018, 08:12
Default
  #12
Senior Member
 
Gweher's Avatar
 
Gwenael H.
Join Date: Mar 2011
Location: Switzerland
Posts: 392
Rep Power: 20
Gweher will become famous soon enough
Thanks Blanco

I'm also currently looking for a new workstation configuration with similar requirements, so this topic is a great source of information, as are Alex's valuable answers in several other threads.
Gweher is offline   Reply With Quote

Old   April 26, 2019, 10:23
Default
  #13
New Member
 
Sibel
Join Date: Apr 2017
Posts: 18
Rep Power: 9
tsibel is on a distinguished road
Hello everyone


I need your help with my configuration. We will buy a workstation for our CFD group at the university. We will model mostly two-phase flows.

CPU: 2x Intel Xeon Gold 6140 or 2x Intel Xeon Gold 6148
RAM: Samsung LRDIMM DDR4-2666, CL19, reg. ECC, either 8x 64GB = 512GB or 8x 32GB = 256GB
NVIDIA Quadro P2000 5 GB GDDR5
Mainboard: ASUS WS C621E Sage, Dual So. 3647; E-ATX
SSD: 512GB Samsung 970 Pro, M.2 PCIe (MZ-V7P512BW)
HDD: 6TB Seagate IronWolf Pro NAS, SATA3, 7200RPM (ST6000NE0023)
And some questions:
1) Do you have a suggestion for the cooling system?
2) Does the GPU provide additional performance for academic studies? Or should the money go into the CPU instead?
3) I've read about AVX-512, but I don't really understand it. It seems to have both negatives and positives?
4) Should I use the SSD for the ANSYS installation? If so, is my choice sufficient?

Many thanks
tsibel is offline   Reply With Quote

Old   May 3, 2019, 14:42
Default
  #14
New Member
 
Joshua Brickel
Join Date: Nov 2013
Posts: 26
Rep Power: 13
JoshuaB is on a distinguished road
Quote:
Originally Posted by tsibel View Post
Hello everyone


I need your help with my configuration. We will buy a workstation for our CFD group at the university. We will model mostly two-phase flows.

CPU: 2x Intel Xeon Gold 6140 or 2x Intel Xeon Gold 6148
RAM: Samsung LRDIMM DDR4-2666, CL19, reg. ECC, either 8x 64GB = 512GB or 8x 32GB = 256GB
NVIDIA Quadro P2000 5 GB GDDR5
Mainboard: ASUS WS C621E Sage, Dual So. 3647; E-ATX
SSD: 512GB Samsung 970 Pro, M.2 PCIe (MZ-V7P512BW)
HDD: 6TB Seagate IronWolf Pro NAS, SATA3, 7200RPM (ST6000NE0023)
And some questions:
1) Do you have a suggestion for the cooling system?
2) Does the GPU provide additional performance for academic studies? Or should the money go into the CPU instead?
3) I've read about AVX-512, but I don't really understand it. It seems to have both negatives and positives?
4) Should I use the SSD for the ANSYS installation? If so, is my choice sufficient?

Many thanks
Last time I checked, Ansys Fluent/CFX were not taking advantage of AVX-512, but then all the modern Intel Xeon Gold chips support it. It won't hurt; it probably won't help either. Basically, from what I understand, it allows a number of computations to be done at once, but the program in question must be designed to use this functionality.

I would actually suggest you look more closely at the CPU you want. You may get the same bang for your buck going with slightly fewer cores. I recently posted something on this.

My experience is that CFX does not gain any advantage from a high end graphics processor for solving. But if you are going to use Fluent, then it might be different.

If you are doing transient runs (and recording transient data), then it might be useful to have at least an SSD for the solution drive. You might also want to consider an SSD on an NVMe bus.
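As a rough illustration of "a number of computations at once" (general SIMD arithmetic, not specific to Fluent/CFX): an AVX-512 register is 512 bits wide, so it holds 512/64 = 8 double-precision values, and one vector instruction operates on all 8 at once (16 FLOPs per fused multiply-add). Whether the solver actually issues such instructions depends on how it was compiled.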
JoshuaB is offline   Reply With Quote

Old   June 8, 2019, 02:55
Default
  #15
Member
 
Ivan
Join Date: Oct 2017
Location: 3rd planet
Posts: 34
Rep Power: 9
Noco is on a distinguished road
Quote:
Originally Posted by tsibel View Post
Hello everyone


I need your help with my configuration. We will buy a workstation for our CFD group at the university. We will model mostly two-phase flows.

CPU: 2x Intel Xeon Gold 6140 or 2x Intel Xeon Gold 6148
RAM: Samsung LRDIMM DDR4-2666, CL19, reg. ECC, either 8x 64GB = 512GB or 8x 32GB = 256GB
NVIDIA Quadro P2000 5 GB GDDR5
Mainboard: ASUS WS C621E Sage, Dual So. 3647; E-ATX
SSD: 512GB Samsung 970 Pro, M.2 PCIe (MZ-V7P512BW)
HDD: 6TB Seagate IronWolf Pro NAS, SATA3, 7200RPM (ST6000NE0023)
And some questions:
1) Do you have a suggestion for the cooling system?
2) Does the GPU provide additional performance for academic studies? Or should the money go into the CPU instead?
3) I've read about AVX-512, but I don't really understand it. It seems to have both negatives and positives?
4) Should I use the SSD for the ANSYS installation? If so, is my choice sufficient?

Many thanks
We spent some time choosing between Xeon and Epyc and decided to buy Epyc (better price per performance).

We bought 2x 7351 on a Supermicro board
RAM - same
GPU - no
Overclocking - yes, to about 2.6-2.7 GHz (at 2.8-2.9 we got errors after 25-30 hours), but I know of 2 computers with the 7351 which run stably at 3.0-3.1
Cooling - Noctua, plus we installed a large, powerful Mitsubishi air conditioning system to keep the server room at +15 C

P.S. Before the Mitsubishi air conditioning we had some experience with various in-case and near-case water cooling systems - they need too much time for maintenance, and normally once per year you need to change O-rings, top up the liquid, etc.
Noco is offline   Reply With Quote
