New System for Fluent

#1 | DL598 (New Member, joined Feb 2020) | May 1, 2020, 14:37
Hello,
I'm part of an FSAE team building a formula-style race car.
We want to request funding from our university to buy or build a new machine for our CFD simulations. We mainly simulate the aerodynamics of the car using ANSYS Fluent. For a straight-line case the mesh usually has about 30 million cells; for a vehicle-in-yaw simulation it has about 40 million cells, which is currently the maximum we can run without running out of RAM. To keep the yaw simulations at this cell count we have to increase the cell size on most surfaces, which isn't ideal. Additionally, the straight-line simulations need about 12 hours to converge (after about 225 iterations) and the yaw simulations need about 18-24 hours (after about 275 iterations). Post-processing (Fluent and CFD-Post) and the viewer in Fluent are also appalling on the current machine: spinning the domain around in the viewer is really laggy, even when we're only meshing in Fluent and haven't simulated anything yet. Creating any contour or iso-surface in Fluent takes ages, and so does creating a sweep animation in CFD-Post and exporting it as a video, since a single contour takes about 30-60 seconds to create.
Now we want to reduce the simulation time so we can run more cases and push the development further. The system should also be future-proof for the next 3-5 years without needing upgrades. Ideally, meshes with about 80 million cells should be possible to simulate in a reasonable amount of time, but I don't know if that is asking too much for the budget.
Currently we are using an old Dell Precision T7500 workstation with 2x Intel Xeon X5690 six-core CPUs and 192 GB of 1333 MHz DDR3 RAM spread over 12 sticks. For storage we use an HDD, since the generated data would fill up an SSD setup pretty fast.
We are currently looking at a budget of about 7000€ for an entirely new system. But if something reasonable can be had for less, it's always easier to get it approved.
After some research I've pretty much ruled out most of the Intel CPUs based on performance and price. Currently I'm looking at something like an AMD Ryzen Threadripper 3970X with 256 GB of RAM in 8 sticks, or 2x AMD Epyc 7302 also paired with 256 GB of RAM, in 16 sticks. Even something like a Ryzen Threadripper 3990X would be possible within the budget. The rest of the system is pretty basic: an Nvidia GTX 1660 as the graphics card, which should be plenty since we're not using it to solve any cases; some HDDs for storage (8 TB in total); an SSD for the OS and Ansys (500 GB); and an 850 W PSU with two 8-pin EPS connectors for a dual-CPU setup, or to keep the option of switching to one later on. Air coolers for the CPU(s), for better reliability than water cooling while still keeping the CPU(s) at their maximum turbo clock. For everything else, pretty much the cheapest parts available that still provide the functionality we want.
I want to populate all of the RAM slots to make full use of the memory channels, since I've read that this increases performance. Is this true? I've also read that one should install 8 GB of RAM per core when using Fluent. This would make the 3990X problematic, since it would need 512 GB of RAM but can only handle 256 GB. I'm also not sure how much the frequency and CL rating of the RAM influence the performance of the system. Additionally, I've been comparing the CPUs by calculating an effective frequency (core count * frequency) without accounting for cache size, since I couldn't find anything on how it influences CFD performance. I've also been disregarding Hyperthreading (or whatever AMD calls it) completely, since you always hear that Fluent only cares about real cores and not virtual ones.
Could you please give me some feedback on whether the setups I'm proposing are adequate, and whether there are any other parameters I should watch out for? If you need any more information, I'd be happy to supply it.
Thanks in advance!

#2 | flotus1 (Alex, Super Moderator, Germany, joined Jun 2012) | May 1, 2020, 17:04
Forget about any AMD Threadripper CPU. It's a huge waste of money for your requirements, and not a substantial upgrade compared to your current setup.
Aggregate CPU frequency, i.e. multiplying core count by frequency, is a useless metric for CFD applications. You also need memory bandwidth to go along with it, and AMD TR CPUs are severely lacking in this category.
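To illustrate how aggregate frequency and memory bandwidth can point in opposite directions, here is a minimal Python sketch. It is not a benchmark: the base clocks and memory channel counts are assumptions taken from public spec sheets, the `Platform` class is made up for this example, and the bandwidth figure is only the theoretical peak (channels * transfer rate * 8 bytes).

```python
from dataclasses import dataclass


@dataclass
class Platform:
    """Hypothetical container for the few numbers compared here."""
    name: str
    sockets: int
    cores_per_socket: int
    base_ghz: float           # assumed base clock
    channels_per_socket: int  # assumed memory channels per socket
    mem_mt_s: int             # memory transfer rate in MT/s

    @property
    def cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def aggregate_ghz(self) -> float:
        # The "effective frequency" metric: core count times clock.
        return self.cores * self.base_ghz

    @property
    def bandwidth_gb_s(self) -> float:
        # Theoretical peak: each 64-bit channel moves 8 bytes per transfer.
        return self.sockets * self.channels_per_socket * self.mem_mt_s * 8 / 1000


platforms = [
    Platform("2x Xeon X5690 (current)", 2, 6, 3.46, 3, 1333),
    Platform("Threadripper 3970X", 1, 32, 3.7, 4, 3200),
    Platform("2x Epyc 7302", 2, 16, 3.0, 8, 3200),
]

for p in platforms:
    print(f"{p.name:26s} {p.aggregate_ghz:6.1f} GHz aggregate, "
          f"{p.bandwidth_gb_s:6.1f} GB/s peak, "
          f"{p.bandwidth_gb_s / p.cores:5.2f} GB/s per core")
```

With these assumed inputs the Threadripper has the highest aggregate frequency but only about 3.2 GB/s of peak bandwidth per core, less than the roughly 5.3 GB/s of the current dual Xeon, while the dual Epyc reaches 12.8 GB/s per core.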
The same goes for recommendations of a fixed amount of memory per core. First and foremost, you need enough memory to fit your simulations. Since you want to double the cell count to 80 million, you need at least twice the memory of your current setup. Going with a dual Epyc build, this means 512 GB of RAM. 384GB is not recommended for these CPUs, because it would require an unbalanced memory population.
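This sizing logic can be sketched in a few lines of Python, assuming memory use grows linearly with cell count and that the 40 million cell case fills the current 192 GB. The function names, the 16 slots and the list of DIMM sizes are illustrative assumptions.

```python
def estimate_memory_gb(known_cells_m: float, known_mem_gb: float,
                       target_cells_m: float) -> float:
    """Linear extrapolation: memory is assumed proportional to cell count."""
    return known_mem_gb * target_cells_m / known_cells_m


def balanced_capacity_gb(required_gb: float, slots: int = 16,
                         dimm_sizes_gb=(8, 16, 32, 64)) -> int:
    """Smallest total capacity with identical DIMMs in every slot."""
    for size in sorted(dimm_sizes_gb):
        if size * slots >= required_gb:
            return size * slots
    raise ValueError("requirement exceeds the largest listed DIMM size")


need = estimate_memory_gb(known_cells_m=40, known_mem_gb=192, target_cells_m=80)
print(need, balanced_capacity_gb(need))
```

With these inputs the estimate is 384 GB, and the smallest configuration with identical DIMMs in all 16 slots is 512 GB.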

So let's reverse-engineer your workstation from this data point.
The cheapest 16x32GB DDR4-3200 reg ECC will cost you around 3000€
Power supply: 130€
Case: 100€
Mainboard: 600€
CPU coolers: 160€
System SSD: 80€
Graphics card: 210€
3x 6TB non-SMR hard drives: 600€

That's 4880€ total. A maximum budget of 7000€ leaves you 2120€ for the CPUs. Stretching the budget a little bit, you could fit two 16-core AMD Epyc 7302 CPUs. If you can stretch it a bit further, and your license allows you to use 48 cores total, the 24-core Epyc 7352 would be a nice upgrade.
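The arithmetic behind these numbers, as a short Python sketch using the estimates listed above (the dictionary keys are just labels):

```python
# Estimated prices in EUR for everything except the CPUs.
parts_eur = {
    "RAM 16x32GB DDR4-3200 reg ECC": 3000,
    "Power supply": 130,
    "Case": 100,
    "Mainboard": 600,
    "CPU coolers": 160,
    "System SSD": 80,
    "Graphics card": 210,
    "3x 6TB non-SMR hard drives": 600,
}

budget_eur = 7000
subtotal = sum(parts_eur.values())
cpu_budget = budget_eur - subtotal

print(f"Subtotal without CPUs: {subtotal} EUR")
print(f"Left for the CPUs:     {cpu_budget} EUR")
```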
Before buying, check whether you already have access to computing resources through your university.

#3 | DL598 (New Member) | May 1, 2020, 17:26
I've looked through my Excel sheet again and have come up with this:
CPU
2x AMD Epyc 7551 (64 Cores total)
-> 1289€*2=2578€

RAM
512GB Samsung ECC 2666MHz 16 Sticks
-> 134,19€*16=2147,04€

Mainboard
Supermicro H11DSi
-> 545,22€

GPU
PNY Quadro P4000
-> 863€
or
PNY Quadro RTX 4000 (which isn't listed by Ansys among the GPUs supported for solving, but should still work, I guess)
-> 942,89€

CPU-Cooler
2x Thermalright Silver Arrow TR4
-> 85,30€*2=170,60€

PSU
Corsair HX Series HX850
-> 167,90€

HDD
2x Western Digital WD Blue 4TB
-> 92,50€*2=185€

SSD
SanDisk Ultra 3D 512GB
-> 69,99€

Case
Phanteks Enthoo Pro
-> 99,90€

In Total: 6826,65€ or 6906,54€ depending on the GPU
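As a cross-check of that total, a small Python sketch can parse the price expressions exactly as written above, decimal commas included. The `line_total` helper is hypothetical, and `Decimal` is used so the cents add up exactly.

```python
from decimal import Decimal


def line_total(expr: str) -> Decimal:
    """Evaluate 'price€' or 'price€*quantity' written with a decimal comma."""
    price, _, quantity = expr.replace("€", "").partition("*")
    return Decimal(price.replace(",", ".")) * Decimal(quantity or "1")


parts = {
    "CPU": "1289€*2",
    "RAM": "134,19€*16",
    "Mainboard": "545,22€",
    "GPU (Quadro P4000)": "863€",
    "CPU cooler": "85,30€*2",
    "PSU": "167,90€",
    "HDD": "92,50€*2",
    "SSD": "69,99€",
    "Case": "99,90€",
}

total = sum(line_total(expr) for expr in parts.values())
# Swap the P4000 for the RTX 4000 to get the second total.
total_rtx = total - line_total("863€") + line_total("942,89€")
print(total, total_rtx)
```

Both results agree with the totals quoted in the post.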

This would leave us with 64 Cores, 512GB RAM, 8TB of Storage and the capability to use the GPU in solving. What do you think about this?

#4 | flotus1 (Super Moderator) | May 1, 2020, 17:48
I don't feel comfortable recommending first gen Epyc for your requirements.
It will be significantly slower than second gen for mesh generation and post-processing. These are lightly-threaded tasks that still require lots of memory, which is the worst-case scenario for the complicated NUMA topology of first gen Epyc.
You stated that a GTX 1660 would be enough for you. I strongly advise against paying the premium for Quadro GPUs. It's just not worth it.
On the topic of using any Quadro GPU for solving: forget it. These "cheap" Quadro cards have neither double precision performance worth mentioning nor enough memory for your huge models. They will do just as badly as a comparable GTX card for half the price.
WD Blue 4TB are most likely SMR HDDs. I might be overly sensitive about this topic due to the recent scandals. But I definitely would not want SMR hard drives for an application where I opted against SSDs, only due to budget constraints.

#5 | DL598 (New Member) | May 1, 2020, 18:22
Ok, so Rome it is.
How much does the speed of the RAM influence the performance? Can we save a little bit of money by not using 3200MHz and instead opting for 2933MHz or even 2666MHz or will this bottleneck us?
As for the SMR HDDs, I hadn't even heard of that until now, but in my price comparison portal of choice I can tick a non-SMR option, and that still returns drives with similar pricing, so it shouldn't be too much of a concern, I think.
As for the GPU, I just guessed that a GTX 1660 should be enough; this is based purely on gut feeling. If you have any more information on how to determine what kind of GPU I need, I would be very grateful. After your feedback a Quadro is out of the question, and that money can be better invested in other components.
As it stands now, I've modified my setup into two configurations, both with a GTX 1660 and 512 GB of 3200 MHz RAM: one with 2x 7352 for 7904,57€ and one with 2x 7302 for 7023,35€. I'm now wondering whether those 900€ really buy a significant increase in performance.

#6 | flotus1 (Super Moderator) | May 1, 2020, 18:42
https://geizhals.eu/?cat=ramddr3&xf=...=p#productlist
Looking at the current market rates of DDR4 reg ECC memory, dropping down to DDR4-2933 just isn't worth it. You could save a significant amount of money dropping even lower to 2666. But this will also translate into a significant performance decrease. It's not 1:1, especially with lower core counts like 16 per CPU. But for the 24-core CPUs, you can expect somewhere around 15% lower performance compared to DDR4-3200.
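For a rough feel of what the lower memory speeds mean, the sketch below compares theoretical peak bandwidth relative to DDR4-3200. It assumes bandwidth scales linearly with the transfer rate and says nothing about latency or timings; the function name is illustrative.

```python
def relative_bandwidth(speed_mt_s: int, reference_mt_s: int = 3200) -> float:
    """Peak bandwidth relative to the reference speed (same channel count)."""
    return speed_mt_s / reference_mt_s


for speed in (3200, 2933, 2666):
    loss_pct = (1 - relative_bandwidth(speed)) * 100
    print(f"DDR4-{speed}: {loss_pct:4.1f}% less peak bandwidth than DDR4-3200")
```

DDR4-2933 gives up about 8% and DDR4-2666 about 17% of peak bandwidth, which is in the same range as the ~15% performance loss estimated above for the 24-core CPUs.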

For the GPU, you probably want to get the maximum amount of VRAM first, and worry about raw performance later. If a scene doesn't fit into VRAM, performance during GUI interactions will suffer a lot. A simple lack of raw performance will make the experience less smooth, but at least still usable. What's the graphics card in your current system?
On the AMD side of things, the RX 570 8GB is still a great budget option.
Nvidia has the GTX 1660 with 6GB of VRAM. I personally would prefer a GTX 1070 8GB over that, but they have become hard to find new at reasonable prices.
Any significant upgrade would be much more expensive.

There are some nice strong scaling graphs for Ansys Fluent found here: Xeon Gold Cascade Lake vs Epyc Rome - CFX & Fluent - Benchmarks (Windows Server 2019)
As you can see, scaling has not stopped yet at 32 cores with 2xEpyc 7302. And some of the less than ideal scaling can be attributed to the higher single-core boost clock. So you can still expect some significant performance increase from 50% more cores. Pay 13% more for the whole system, get ~30% more performance. An absolute bargain, especially at the higher end.
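Spelled out as a quick calculation, using the two system prices quoted earlier in the thread (7023,35€ and 7904,57€) and treating the ~30% gain as an estimate rather than a measurement:

```python
price_7302_eur = 7023.35   # system with 2x Epyc 7302 (32 cores)
price_7352_eur = 7904.57   # system with 2x Epyc 7352 (48 cores)
perf_gain = 0.30           # estimated solver speedup, not measured

extra_cost_pct = (price_7352_eur / price_7302_eur - 1) * 100
extra_cores_pct = (48 / 32 - 1) * 100

# Performance per euro of the larger system relative to the smaller one.
perf_per_eur_ratio = (1 + perf_gain) / (price_7352_eur / price_7302_eur)

# Share of the added cores that shows up as added performance.
marginal_scaling = (perf_gain * 100) / extra_cores_pct

print(f"Extra cost:          {extra_cost_pct:.1f}%")
print(f"Extra cores:         {extra_cores_pct:.0f}%")
print(f"Perf per EUR ratio:  {perf_per_eur_ratio:.2f}")
print(f"Marginal scaling:    {marginal_scaling:.2f}")
```

The extra cost works out to about 12.5%, and the larger system delivers roughly 15% more estimated performance per euro, even though only about 60% of the added cores translate into speedup.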

#7 | DL598 (New Member) | May 1, 2020, 19:08
I have to say you are really helpful and I have to thank you for your feedback.

Currently we are using a Quadro 4000.

As for a GTX 1070, I think it's a little out of budget if, as you said, VRAM is more important than raw power. If an RX 570 8GB is enough, that would be great, since it is even cheaper than a 1660.

Having just skimmed the initial post in the thread you linked, it is clear that 7352s are the way to go. I'll definitely read that thread in full when I've got the time. But since it's already 1 AM, that's not going to be today.

So to summarize right now we've landed at
2x7352
512gb 3200MHz DDR4 ECC
RX570 8GB
and the rest being the stuff I posted before
that lands us at a price of 7826,67€

Is there something where we can still save some money or do I just need to work extra hard to convince the university?

#8 | flotus1 (Super Moderator) | May 1, 2020, 19:55
With the Quadro 4000 2GB you currently have, an RX 570 8GB will definitely be a solid upgrade. And if it is still not enough, at least you only wasted ~130€ on it, and you will know that you need something way more substantial, like a used Quadro P5000 16GB.

Cutting more corners is not easy.
You can save a bit on the power supply. Seasonic Focus Plus 750W is a solid choice. ~132€ for the 80+ Platinum certified version.
Shave off a few € with cheaper CPU coolers. Maybe Noctua NH-U12S. Although I don't particularly like the fans they use here.
Assuming you will have to buy new retail parts from somewhat reputable sellers within Europe, there is not much room for improvements.

And don't forget to ask your vendor to ship a revision 2.x Supermicro H11DSi motherboard. They still haven't released H12 dual-socket boards, and rev. 1.x H11 boards will not run Epyc Rome CPUs, at least not without modifying the BIOS.
Some vendors started selling rev. 2.x specifically at ridiculous markups. Try to avoid them.

Edit: there may be a cheaper alternative if you can source used parts for this project. Dual Xeon E5 v3 will be much slower for solving, but it can compete for pre- and post-processing.
Home workstation for large memory demand CFD

Last edited by flotus1; May 2, 2020 at 05:29.

#9 | DL598 (New Member) | May 2, 2020, 10:11
Sourcing used parts sadly is next to impossible for us since we have to buy everything through the university and they almost never allow that.

As for the mainboard, is there a way to tell if it is the revision 2 other than the seller having it in the description or asking them? And what would be an acceptable price for a revision 2 board?

As for the cooler, I'm also not a fan of Noctua fans (pun intended) after I had one break on me over and over again. Something like a be quiet! Dark Rock Pro 4 TR4 for 73,70€ apiece would still be a few bucks less than the Thermalright ones.

That would land us with this:

CPU:
2x Epyc 7352 (48 cores total) - 3201,06€ total
or if we can't convince the university
2x Epyc 7302 (32 cores total) - 2296,18€ total

RAM:
Samsung ECC Ram 3200MHz 16 Sticks - 3248€ total

Mainboard:
Supermicro H11DSi Revision 2 - 550€-ish?

GPU:
ASRock Phantom Gaming D Radeon RX 570 8G OC - 139€

CPU-Cooler:
2x be quiet! Dark Rock Pro 4 TR4 - 73,70€ each

PSU:
Seasonic Focus Plus Platinum 750W - 131,90€

HDD:
2x Western Digital WD Blue 4TB non SMR - 171,80€ total

SSD:
SanDisk Ultra 3D 512GB - 69,99€

Case:
Phanteks Enthoo Pro - 99,90€

Total:
Epyc 7352: 7798,05€
Epyc 7302: 6893,17€
That is with a performance increase of 50% from the 7302 system to the 7352 system, if I use the numbers from the comparison thread you linked earlier and extrapolate them up to 48 cores. This should at least give us a good leg to stand on when arguing with the university for the 7352 config.

Anything I'm missing here or something we should watch out for?
Maybe some additional case fans? I don't know how good the Phanteks fans are.

#10 | flotus1 (Super Moderator) | May 2, 2020, 11:45
The fans Phanteks includes with their cases are pretty crappy. But it's probably fine for another PC in a busy office. Adding one or two extra still won't hurt.

Other than paying the premium for a board listed as rev. 2, you can only ask the seller if they will ship a revision 2 board. What I would consider a fair price for rev. 2 is exactly the same price as rev. 1, because it is literally the same board, only with a larger chip to store the BIOS, which costs a few cents more to produce. The whole situation is pretty idiotic to begin with. AMD claimed users would be able to upgrade to Rome without any issues. I can only guess what went on behind the scenes here, and it's not very consumer-friendly.

#11 | DL598 (New Member) | May 2, 2020, 13:17
Then we have a system. Thank you a lot.
One more question:
What is your estimate of how much more performance this system would deliver? In other words, how much faster would it run our simulations?

#12 | flotus1 (Super Moderator) | May 2, 2020, 13:52
5x reduction of solver times seems like a conservative estimate.

#13 | DL598 (New Member) | May 2, 2020, 13:56
That would be awesome and already help a lot!

Thanks for your help!

#14 | flotus1 (Super Moderator) | May 2, 2020, 14:27
My pleasure. Let us know how it turned out.

#15 | Habib-CFD (Member, joined Oct 2019) | May 3, 2020, 04:35
Quote (originally posted by DL598):
Hello,
Basically creating any contour or Iso-Surface in Fluent takes ages and it also takes ages to create an sweep animation in CFD-Post and export it as a video since it takes about 30-60 seconds to create a single contour.
Now we want to reduce this time of the simulations so we can run more and push the development more.

I am not sure about this idea, but based on my short experience with post-processing (contours, streamlines, etc.), you could use a separate mainstream desktop, e.g. a 10th gen Intel i7-10700K, which reaches more than 5 GHz single-core, together with Optane memory. As I said, this is only for post-processing, not for simulation or video production.

#16 | flotus1 (Super Moderator) | May 3, 2020, 05:48
You are right, Epyc 2nd gen is not the absolute best choice for every single task. Naturally, no CPU choice can be with a limited budget.

It's a trade-off between single-core and multi-core performance.
You get around 20-30% lower performance for single- and lightly-threaded tasks, compared to a 10th gen i9 from Intel.
But on the other hand, you get several times higher multi-core performance while solving the model.
And unless motherboard vendors like ASRock step in again, 10th gen i9 CPUs on the X299 platform have a maximum memory support of 256GB UDIMM. The i7-10700K even tops out at 128GB UDIMM.
I can't see how Optane support comes into play here. Using that as a replacement for system memory, the performance advantage is gone.

Or would you favor a solution with two separate systems?
I find it hard to justify the increased cost of adding a second system, just for slightly higher single-core performance.

#17 | DL598 (New Member) | January 2, 2021, 17:45
So by now I've had plenty of time to test the new system, and I couldn't be happier with it. A straight-line simulation now takes only 2 to 3 hours, and even a cornering simulation with the 60 million+ cell mesh takes only about 5 to 7 hours to converge. The highest RAM usage I have seen so far was about 230 GB, so there's plenty of headroom, but there is no need to increase the cell count any further on the existing models. Instead, the new simulation PC enabled me to increase the accuracy of the model by replacing the rotating cylinders used for the tires with full rotating rims and wheels, and by adding wheel carriers, full suspension, brake discs, and a more detailed cockpit and driver model.
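To put those run times next to the old ones from the first post, here is a rough speedup calculation. It assumes the straight-line mesh is comparable to before, takes the yaw mesh as growing from 40 to 60 million cells, and assumes run time scales linearly with cell count at a similar iteration count; none of that was measured. The `speedup_range` helper is hypothetical.

```python
def speedup_range(old_h, new_h, cell_ratio=1.0):
    """Return (low, high) speedup from (low, high) wall-clock hours.

    cell_ratio is new cell count / old cell count; it rescales the result
    to a per-cell speedup under the linear-scaling assumption.
    """
    low = old_h[0] / new_h[1] * cell_ratio
    high = old_h[1] / new_h[0] * cell_ratio
    return low, high


straight = speedup_range(old_h=(12, 12), new_h=(2, 3))
yaw_raw = speedup_range(old_h=(18, 24), new_h=(5, 7))
yaw_per_cell = speedup_range(old_h=(18, 24), new_h=(5, 7), cell_ratio=60 / 40)

for label, (low, high) in [("straight line", straight),
                           ("yaw, wall clock", yaw_raw),
                           ("yaw, per cell", yaw_per_cell)]:
    print(f"{label:16s} {low:.1f}x to {high:.1f}x")
```

Under these assumptions the straight-line case is 4x to 6x faster, and the yaw case roughly 2.6x to 4.8x faster in wall-clock time, or about 3.9x to 7.2x per cell.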
Overall I can only say I'm really happy with how it turned out. We even managed to squeeze in an RTX 2070, a 1000 W PSU for future upgrades, and some better fans to replace the rather noisy Phanteks fans. The cooling works like a charm: even with all 48 cores on the 2 CPUs pinned to 100%, I have yet to see more than 70°C on any core. But for anybody looking at this motherboard in this case: I can't vouch for the fitment with big air coolers, as we only had a few mm of clearance to the strut that the hard drive cages connect to. That's nothing a few cuts with a Dremel wouldn't solve, though, if you're willing to do that.
The only thing I have yet to figure out is how to get the GPU to work over a remote desktop connection. The PC runs without a screen connected and is only accessed through Windows Remote Desktop, and the Windows local group policy option for forcing Remote Desktop to use the GPU is just not there. As it stands, the viewer in Fluent is still very laggy: better than on the old PC, but still bad. I think this is because the GPU isn't used for the remote desktop session. If anybody has a tip to solve this problem, or could tell me where best to post it, I would be thankful.

To summarize I have to thank flotus1 for all the help he has given to me.

#18 | Spanner (BALRAJ SINGH, New Member, joined Feb 2018) | January 8, 2021, 11:56
Sir,
I have purchased an Epyc 7702 (64 cores, 256 MB L3) based Asus RS500 E10 PS4 with 256 GB of dual-rank RAM at 2999 MHz, an onboard 250 GB NVMe drive for CentOS, and a RAID card for RAID 0 of two 15k SAS drives. The applications are external aero, fluid dynamics, thermal and structures, using open-source tools only.
I need your guidance on:
1. the optimum setup in BIOS and OS
2. upgrade recommendations apart from RAM.