CFD Online (www.cfd-online.com) > Forums > Hardware

Building a "home workstation" for ANSYS Fluent

September 26, 2017, 08:41   #1
Building a "home workstation" for ANSYS Fluent
Ep.Cygni (New Member, Join Date: Sep 2017, Posts: 3)
Hello.
A colleague asked me to help him build a new home PC next month. Its sole purpose is to run flow simulations with large meshes, mainly in ANSYS Fluent or a corporate in-house code, but Fluent is the first concern. The license will also be corporate, for as many cores as needed.

I have some expertise in PC hardware, but almost none in flow simulation; my own field of research is heat transfer. When solving heat conduction problems in Fluent, using just the energy equation, with mesh sizes from very small (20K) to relatively large (40M cells), I noticed that parallel performance did not scale very well: e.g. 4 threads were just 2-2.5x faster than serial, and it was more time-efficient to run a few serial cases simultaneously. But I realize this may not be the case for the Navier-Stokes solver, which can behave differently in parallel. That's why I came to this forum for recommendations.
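As a side note, the scaling I observed can be put into numbers with a trivial script (the 100 s / 45 s timings below are made-up round figures, just chosen to be consistent with the 2-2.5x I saw):

```python
def speedup(t_serial, t_parallel):
    """Wall-clock speedup of a parallel run over the serial baseline."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_threads):
    """Fraction of ideal linear scaling actually achieved."""
    return speedup(t_serial, t_parallel) / n_threads

# Illustrative timings: serial 100 s, 4 threads 45 s (~2.2x, like my energy-equation runs)
print(f"speedup = {speedup(100.0, 45.0):.2f}x")         # 2.22x
print(f"efficiency = {efficiency(100.0, 45.0, 4):.0%}")  # 56%
```

At ~56% efficiency, two serial cases side by side finish more work per hour than one 4-thread case, which is exactly what I observed.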

My colleague's budget is not very tight for a home PC; e.g. a platform with a Ryzen 7 1800X and 64 GB RAM is totally fine for him price-wise. However, I'd like to keep cost efficiency within reasonable limits and avoid $2000 high-end CPUs in favor of slightly slower but not overpriced models. Server platforms are not considered, since the extra features they offer are unnecessary (except the RAM capabilities, maybe).

With that said, I'd like to ask a few questions. They are not about the exact config, which will be chosen later according to budget and availability (although suggestions are still welcome), but rather about the influence of different factors on performance in the specific task of flow simulation in ANSYS Fluent with large meshes (say 30M cells), using the density-based solver.

The questions are:

1. How well does Fluent's Navier-Stokes parallel performance scale with thread count? (And which is better, then: more cores or higher per-core performance?)

2. How important is RAM bandwidth for this kind of workload? Is it worth buying a TR4 or LGA2066 platform for extra memory channels, or high-frequency DIMMs for a 2-channel system? (The second question is more relevant to Intel platforms since AM4/TR4 requires fast RAM anyway.)

3. How do Skylake-X processors with their new mesh topology and cache architecture compare to previous generations in terms of Fluent performance? (Reviews say they have higher inter-core data transfer latency, but does it matter in CFD that much?)

4. Similarly, do Threadripper CPUs suffer from non-uniform Infinity Fabric latency (different access times between cores in the same and in different dies) when used for CFD? (Again, I got this info from reviews, but they mostly focus on games, unfortunately.)

5. Maybe a stupid question that I've had for a while but couldn't find any info on: does Fluent (and CFD-Post) use GPU acceleration for 3D scenes, and is it worth getting a powerful graphics card for the new machine? (Forgot to say: GPGPU is not considered due to solver limitations and memory requirements.)

Any advice and clues are welcome, thank you in advance.
flotus1 likes this.

Last edited by Ep.Cygni; September 26, 2017 at 17:06. Reason: caught a few typos

September 26, 2017, 17:39   #2
flotus1 (Alex; Senior Member, Join Date: Jun 2012, Location: Germany, Posts: 1,620)
Nice, a list of well-formulated questions:

1. Fluent can scale pretty well across large core counts: depending on the case and the computer architecture, up to hundreds or even thousands of cores. That being said, scaling on a single node is usually limited by one factor: memory bandwidth. Apart from that, higher per-core performance is usually more desirable than very high core counts, partly because most tasks in pre- and post-processing do not scale across many cores.
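The pre/post-processing point can be illustrated with Amdahl's law; the 95% parallel fraction below is an assumed number for the sketch, not a Fluent measurement:

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Amdahl's law: overall speedup when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)

# Even a small serial share caps the benefit of piling on cores:
for n in (4, 16, 64):
    print(f"{n:3d} cores -> {amdahl_speedup(0.95, n):.1f}x")  # 3.5x, 9.1x, 15.4x
```

With just 5% of the total workflow stuck on one core, 64 cores deliver under 16x overall, which is why per-core performance still matters.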

2. Core counts and memory bandwidth should be balanced. An extremely expensive 18-core desktop processor will be severely limited by a lack of bandwidth and is thus a waste of money, especially for CFD.
The mainstream platforms with only two memory channels are not an option if you can afford more.
Btw: in my opinion, the notion that Ryzen has to be paired with faster memory while Intel processors do not is a myth to some extent. Faster memory helps processors from both manufacturers in memory-intensive workloads.

3. For CFD applications, Skylake-X in its default configuration is slightly slower than its predecessor in terms of per-core performance. One of the reasons is indeed the cache architecture, which is slower than in the previous generation.

4. Fluent and its MPI parallelization do not suffer as much from the slower inter-CCX communication as many mainstream applications do, since MPI parallelization tends to minimize data transfer between cores to some extent.

5. You don't necessarily need an expensive graphics card for pre- and post-processing. I would recommend a GTX 1050 Ti 4GB as a minimum or a GTX 1060 6GB as a step up.


My advice would be to determine how many parallel licenses your colleague has available, how many cells his largest simulation models consist of, and whether he is using solver types or additional physical models that further increase the memory requirement.
That should help narrow down which platform is best for him.
Edit: I should have read the whole post. "Unlimited" parallel licenses and ~30M cells.

Quote:
Server platforms are not considered since the extra features they offer are unnecessary (except RAM capabilities maybe)
You might want to reconsider this statement. Server platforms, especially dual-socket configurations, are very favorable for CFD due to the increased memory bandwidth from 2 CPUs.
lac and Ep.Cygni like this.
__________________
Please do not send me CFD-related questions via PM

Last edited by flotus1; September 27, 2017 at 04:25.

September 28, 2017, 06:40   #3
lac (New Member, Join Date: Apr 2016, Posts: 12)
My experience with the incompressible SIMPLE solver in Fluent was that 8GB of memory was barely enough for cases with 4M cells, but I never solved "bigger" cases with Fluent. Recently I have been using OpenFOAM, and 64GB of memory is only enough for around 25M cells when using simpleFoam with k-epsilon models. So maybe Ryzen is not the best option, as I think it is limited to 64GB of memory.
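For a rough sanity check, those two data points translate into GB per million cells like this (very much a ballpark; actual usage depends heavily on the solver and the physical models):

```python
# Ballpark memory-per-cell figures derived from the two cases above.
GB_PER_MILLION_CELLS = {
    "fluent_incompressible": 8 / 4,   # ~8 GB barely enough for 4M cells
    "simpleFoam_k_epsilon": 64 / 25,  # ~64 GB enough for ~25M cells
}

def ram_estimate_gb(million_cells, solver):
    """Very rough RAM estimate; real usage depends on solver settings and models."""
    return million_cells * GB_PER_MILLION_CELLS[solver]

# The 30M-cell target from the opening post:
print(f"{ram_estimate_gb(30, 'simpleFoam_k_epsilon'):.0f} GB")  # ~77 GB, so 64 GB is tight
```

So for a 30M-cell case, 64GB looks marginal at best, which argues for a platform with headroom beyond 64GB.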
Ep.Cygni likes this.

September 28, 2017, 15:03   #4
Ep.Cygni (New Member, Join Date: Sep 2017, Posts: 3)
Thank you, flotus1, for the informative and helpful answers.
Thanks to lac for the advice too.

Quote:
Originally Posted by flotus1 View Post
Btw: in my opinion, the notion that Ryzen has to be paired with faster memory while Intel processors do not is a myth to some extent. Faster memory helps processors from both manufacturers in memory-intensive workloads.
The need for high-speed RAM on new AMD platforms is a common recommendation in Ryzen processor reviews. It is based on the fact that the Infinity Fabric clock is tied to the RAM frequency, so with higher-clocked DIMMs the overall CPU performance can notably improve in certain applications (particularly in games, as tests have shown). That's where my initial opinion came from. However, as you mentioned in your further answers, the CPU's inter-core bus bandwidth is not as important for CFD as RAM bandwidth. Therefore fast RAM is required in any case, and I no longer take the CPU bus into account.

At this point I came to the following conclusions about platform choice:
- no mainstream, because RAM bandwidth and size are not enough;
- no TR4, due to an unnecessarily high core count (8 and higher) for the available RAM bandwidth, high cost (motherboards are rare and expensive) and high TDP;
- no old DDR3 platforms.

Thus, I am thinking about 5 options so far (including server ones, as you suggested):
LGA2011-3 with a 6-core CPU and 4 RAM channels,
LGA2066 with a 6-core CPU and 4 RAM channels (cheaper, and probably slower due to the new caches),
Dual LGA2011-3 with 2x4/2x6-core CPUs and 8 RAM channels,
LGA3647 with a 6/8-core CPU and 6 RAM channels (expensive and very hard to get at the moment),
SP3 with an 8/16-core CPU and 8 RAM channels (nearly impossible to get at the moment).

Am I right about the cores-to-RAM-channels ratio, or should we consider an 8-core/4-channel option too (and then the TR4 socket as well)?

Regarding memory size, in the case of HEDT platforms we will most likely get 64GB of RAM (4x16GB) with the possibility of adding another 64GB in the future. In the case of dual LGA2011-3 or SP3, we'll need 8 DIMMs from the beginning to use all channels. This is a disadvantage, because 8x16GB might be out of budget right now, and 8x8GB will not allow a later upgrade to maximum capacity by adding more DIMMs (especially with the cheaper dual-socket WS motherboards that have only 8 RAM slots). Finally, LGA3647 with 6 DIMMs is more of a hypothetical option which we most likely won't be able to afford.
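To compare the options on the cores-to-channels question, here is a quick theoretical-peak calculation (the DIMM speeds below are my own assumptions for illustration; sustained bandwidth is of course lower):

```python
def peak_bw_per_core_gbs(channels, mt_per_s, cores):
    """Theoretical peak memory bandwidth per core: channels * MT/s * 8 bytes / cores."""
    return channels * mt_per_s * 8 / 1000 / cores

# Assumed memory speeds, purely for comparison between the candidate platforms:
options = [
    ("LGA2066, 6c, 4ch DDR4-2666",          4, 2666, 6),
    ("dual LGA2011-3, 2x6c, 8ch DDR4-2400", 8, 2400, 12),
    ("SP3, 16c, 8ch DDR4-2666",             8, 2666, 16),
]
for name, ch, mts, cores in options:
    print(f"{name}: {peak_bw_per_core_gbs(ch, mts, cores):.1f} GB/s per core")
```

By this crude metric, the 6-core/4-channel and 2x6-core/8-channel builds keep more bandwidth per core than a 16-core SP3 part, which is the balance flotus1 described.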

Last edited by Ep.Cygni; September 29, 2017 at 03:43. Reason: added EPYC option

September 29, 2017, 08:39   #5
lac (New Member, Join Date: Apr 2016, Posts: 12)
In my opinion, the HEDT platforms can be a good alternative to the older dual-CPU platforms (like E5 v3/v4) if you are on a tight budget. For memory, I'd also take into account that you can use faster DIMMs (like 3200MHz). However, the bandwidth will still be less than on any dual-CPU platform, as you can't overclock the memory much beyond this.

If I were you, I'd buy some second-hand E5 v3/v4 Xeons with a brand-new Supermicro board. I'd also get a board with 16 DIMM slots. This way you will have an upgrade path to more memory capacity, and second-hand CPUs are not very expensive these days.
Ep.Cygni likes this.

September 29, 2017, 09:00   #6
Micael (Micael Boulet; Senior Member, Join Date: Mar 2009, Location: Quebec, Canada, Posts: 130)
Out of curiosity, are you going to buy those commercial licences, or do you already have them? The system you are discussing will cost close to nothing compared to the licences.

September 29, 2017, 14:42   #7
flotus1 (Alex; Senior Member, Join Date: Jun 2012, Location: Germany, Posts: 1,620)
Quote:
Originally Posted by Ep.Cygni View Post
The need for high-speed RAM on new AMD platforms is a common recommendation given in Ryzen processor reviews. It is based on the fact that Infinity Fabric bus clock is fixed at RAM frequency, and with higher clocked DIMMs the overall CPU performance can notably improve in certain applications (particularly in games, as tests have shown). That's where I got my initial opinion from. However, as you mentioned in further answers, CPU's inter-core bus bandwidth is not as important for CFD as RAM bandwidth. Therefore fast RAM is required in any case, and I don't take CPU bus into account anymore.
It is true that the inter-CCX communication speed in Ryzen is linked to the memory speed, and thus faster memory equals better performance.
Not really relevant to your question, but in my opinion this is only part of why fast memory is usually recommended for Ryzen. The main reason, in my opinion, is disappointed AMD fanboys who could not accept that the benchmark results for Ryzen were significantly lower in many CPU-bound scenarios. Overclocked memory improves performance in these cases, but to some extent Intel CPUs would also benefit from faster memory there. But I digress...

Quote:
Originally Posted by Ep.Cygni View Post
At this point I came to the following conclusions about platform choice:
- no mainstream, because RAM bandwidth and size is not enough;
- no TR4, due to an unnecessarily high core count (8 and higher) for the available RAM bandwidth, high cost (motherboards are rare and expensive) and high TDP;
- no old DDR3 platforms.
No mainstream
-> true that

No TR4
-> not necessarily; it has its pros and cons. I would not say that its power consumption is too high considering the performance. Higher core counts are not really a problem as long as you don't have to pay for parallel licenses. The thing is that the additional cores are less effective, but the simulation still runs faster. Only if you pay thousands of dollars for each additional parallel license do you have to stick to CPUs with lower core counts.

No old DDR3
-> I strongly disagree. An old dual-socket workstation (Xeon E5-26xx "v1" or v2) is still one of the most cost-efficient ways to get a powerful CFD workstation, mainly for two reasons: the CPUs and the DDR3 reg ECC memory are pretty cheap as long as you buy them used. These two components can be bought used because they fail very rarely. This is my go-to option for a cheap CFD workstation when enough parallel licenses are available. I recently put together a 16-core workstation with 256GB of RAM for less than 1000.


Quote:
Originally Posted by Ep.Cygni View Post
Am I right with the Cores-to-RAM-channels ratio, or should we consider a 8core/4channel option too (and then TR4 socket as well)?
For the HEDT platforms you can go for CPUs with at least 8 cores; the total price/performance ratio of the workstation will be better, especially since DDR4 RAM is quite expensive these days.


Quote:
Originally Posted by Ep.Cygni View Post
Regarding memory size, in case of HEDT platforms we will most likely get 64GB RAM (4x16GB) with the possibility to add another 64 GB in the future. In case of Dual LGA2011-3 or SP3, we'll need to have 8 DIMMs from the beginning to use all channels. This is a disadvantage, because 8x16GB might be out of budget right now, and 8x8 will not allow to later upgrade to max capacity by adding more DIMMs (especially with cheaper dual-socket WS motherboards that have only 8 RAM slots). Finally, LGA3647 and 6 DIMMs is more of a hypothetical option which we most likely won't be able to afford.
Whatever dual-socket board you buy, just make sure it has 16 DIMM slots; this will allow for future memory upgrades. If dual LGA3647 is too expensive, you should consider used CPUs for dual-socket LGA2011-3. Their retail prices did not drop at all, and bought new they offer less CFD performance per dollar than LGA3647.
lac and Ep.Cygni like this.

October 2, 2017, 19:20   #8
Ep.Cygni (New Member, Join Date: Sep 2017, Posts: 3)
Thanks for the replies.

@lac:
I agree, HEDT platforms seem to be a good, and probably our only, choice. I showed estimated prices of several example setups to my colleague, and he said he could afford high-end, but not server: even with used CPUs it is still too expensive (though we might consider getting a dual-socket motherboard, installing 1 CPU and 4 DIMMs, and upgrading later if necessary).

@Micael:
The institute my colleague works at already has a license.

@flotus1:
Good points. Indeed, while a Xeon v3/v4 platform is still expensive with used CPUs and new RAM (it's hard to find any used DDR4), used v1/v2 Xeons and DDR3 DIMMs are ubiquitous and cheap: a 128GB/12(16)-core dual LGA2011 setup can fit in the same price range as a new HEDT. However, I am concerned about a few things:

1. Most server DDR3 memory is 1333 or 1600 MHz, which won't allow bandwidth as high as with high-clocked DDR4 on a HEDT motherboard, or with the potential 8 DDR4 channels of dual LGA2011-3.
2. The CPUs are also a bit slower and have higher TDPs.
3. Since this will be a home machine, I want to make it as silent as possible. This could be more difficult and expensive with a server platform, which requires an EATX case, a more powerful PSU, and two CPU coolers that are efficient, quiet and small enough to fit next to each other. For narrow ILM the only choices are the quite costly Noctua models, and I wonder how noisy they are in real life.
4. No warranty for used parts and often no refunds, while BIOS/CPU/MB/RAM compatibility issues or damaged hardware are always slightly possible.
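Concern 1 can be put into numbers with theoretical peaks (channels times MT/s times 8 bytes; sustained bandwidth is considerably lower, and the DDR4-3200 figure assumes overclocked DIMMs):

```python
def peak_bw_gbs(channels, mt_per_s):
    """Theoretical peak memory bandwidth in GB/s: channels * MT/s * 8 bytes."""
    return channels * mt_per_s * 8 / 1000

print(peak_bw_gbs(8, 1600))  # dual LGA2011, 8ch DDR3-1600:   102.4 GB/s
print(peak_bw_gbs(4, 3200))  # HEDT, 4ch DDR4-3200 (OC):      102.4 GB/s
print(peak_bw_gbs(8, 2400))  # dual LGA2011-3, 8ch DDR4-2400: 153.6 GB/s
```

Interestingly, eight channels of old DDR3-1600 already match four channels of overclocked DDR4-3200 on paper, so the old dual-socket platform would only trail the 8-channel DDR4 options, not the HEDT one.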

These factors could make HEDT preferable, but it's too early to decide without knowing what is available. A bit later I will suggest a few possible setups with local prices and then ask for opinions again.

October 3, 2017, 05:27   #9
flotus1 (Alex; Senior Member, Join Date: Jun 2012, Location: Germany, Posts: 1,620)
My choice for a dual-socket LGA2011 motherboard is the ASRock EP2C602-4L/D16, mainly for three reasons:
  • still available today, at a reasonable price of ~350€
  • uses standard square-ILM cooler mounts
  • allows memory tweaking, which most competitors do not
So you can buy it new with full warranty, use most standard air coolers, and use cheap memory.
Standard DDR3-1600 reg ECC modules usually run at DDR3-1866 without any issues. The same applies to DDR3L-1333 once you increase the voltage to 1.5V, which is still within the memory's specifications. Of course, you will need a "v2" Xeon to reach this memory frequency. This way you can save a lot of money by avoiding expensive DDR3-1866.
A workstation like this can be quieter than anything Dell or HP will sell you off the shelf; just don't cheap out on the important parts. You would have to buy in the same price range anyway, no matter whether you go for a single- or a dual-socket solution.

For CPU coolers, use Noctua NH-U14S. They might seem expensive but are worth every cent. For cooling an 8- or 10-core X299 processor you would need a single cooler in the same price range as two of these. Btw, Noctua runs an outlet store on eBay, at least in Germany: http://www.ebay.de/itm/Noctua-NH-U14...0AAOSwqu9VN3Pn
For the power supply I would recommend e.g. a be quiet! Dark Power Pro P11 550W. Again, not the cheapest, but worth your money. A quality power supply like this can be re-used once you upgrade the other parts of the workstation, so the money is not wasted.
The same applies to the case. Get one in the 100€-or-above price range; it will last several hardware generations and allow for quiet cooling. For example, a Nanoxia Deep Silence 5 Rev. B.

The only used parts here are the CPUs and memory. They come cheap anyway and, again, they rarely fail. What you should avoid buying used are motherboards and power supplies.
Of course this is a bit slower than dual-LGA2011-3, but also much less expensive. Once DDR4 and the newer Xeon processors get cheaper you can still sell the old motherboard, CPUs and memory and upgrade to a newer generation while keeping most of the other hardware. Since your budget seems to be limited and your parallel licenses are not, I thought you should know about this option.

Not trying to convince you that dual-LGA2011 is the only way to go. If you want to use a modern HEDT platform that will also work. I just wanted to point out a less obvious option that is relatively cheap especially if you need more than 64GB of RAM.
scipy and Ep.Cygni like this.

Last edited by flotus1; October 4, 2017 at 11:49.
