GPU acceleration in Ansys Fluent

Old   August 27, 2018, 11:05
Default
  #41
Member
 
Join Date: Dec 2016
Posts: 44
Rep Power: 9
Duke711 is on a distinguished road
Not only that, the solution process will probably abort with an error.


http://www.cadfem.de/fileadmin/CADFE...CADFEM_GPU.pdf

Old   August 27, 2018, 11:14
Default
  #42
Member
 
Join Date: Jun 2010
Posts: 77
Rep Power: 15
Echidna is on a distinguished road
Can I use a Quadro K6000 together with a Tesla K80? Will they work together?

Old   August 27, 2018, 11:32
Default
  #43
Senior Member
 
Micael
Join Date: Mar 2009
Location: Canada
Posts: 156
Rep Power: 18
Micael is on a distinguished road
Flow Setup:
as OP

Software/Hardware:
Operating system: CentOS Linux 7
Fluent version: Ansys Fluent 19.1
CPU: Dual Xeon Gold 6150, HT disabled
Memory: 192 GB DDR4-2666 ECC (12 dimm x 16 GB)
GPU: 4 x V100-32GB NVLINK

Ran double precision only.

32-core
Simple None-GPU: 4.1 s
Simple 1-GPU: 4.1 s
Simple 4-GPU: 4.1 s
Coupled None-GPU: 13.2 s
Coupled 1-GPU: 28.1 s
Coupled 2-GPU: 24.2 s
Coupled 4-GPU: 22.2 s

4-core
Simple None-GPU: 22.6 s
Coupled None-GPU: 67.5 s
Coupled 1-GPU: 48.4 s
Coupled 2-GPU: 44.7 s
Coupled 4-GPU: 42.1 s

1-core
Simple None-GPU: 89.0 s
Coupled None-GPU: 273.5 s
Coupled 1-GPU: 132.1 s
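
To make the trend in these timings easier to read, here is a minimal sketch that turns the coupled-solver numbers above into speedups relative to the CPU-only run at the same core count. The dictionary keys and the function name are illustrative choices, not anything from Fluent; only the timings themselves come from the post.

```python
# Coupled-solver wall-clock timings in seconds, copied from the post above.
# Keys such as "no_gpu" are illustrative labels, not Fluent terminology.
timings = {
    32: {"no_gpu": 13.2, "1_gpu": 28.1, "2_gpu": 24.2, "4_gpu": 22.2},
    4: {"no_gpu": 67.5, "1_gpu": 48.4, "2_gpu": 44.7, "4_gpu": 42.1},
    1: {"no_gpu": 273.5, "1_gpu": 132.1},
}


def coupled_speedups(data):
    """Return {cores: {config: speedup vs. the CPU-only run at that core count}}."""
    result = {}
    for cores, runs in data.items():
        baseline = runs["no_gpu"]
        result[cores] = {
            config: round(baseline / seconds, 2)
            for config, seconds in runs.items()
            if config != "no_gpu"
        }
    return result


speedups = coupled_speedups(timings)
for cores, row in speedups.items():
    print(f"{cores:>2} cores: {row}")
```

A value below 1 means the GPU run was slower than the CPU-only run. With these numbers the GPUs help on 1 and 4 cores but slow the coupled solver down on 32 cores.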

Old   August 27, 2018, 12:16
Default
  #44
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
flotus1 has a spectacular aura about
Great, finally some decent hardware. Would you mind running a larger case (coupled+DP would be enough)? I only chose such a small one due to the lack of VRAM on the GPUs I had available at the time. Would be interesting to see if you can get some GPU scaling going while running 32 CPU cores.

Old   August 27, 2018, 13:13
Default
  #45
Member
 
Join Date: Dec 2016
Posts: 44
Rep Power: 9
Duke711 is on a distinguished road
Quote:
Originally Posted by Echidna View Post
Can i use a Quadro K6000 plus a Tesla K80? Will they work together?



12 GB vs. 24 GB; I don't know, try it.

Old   August 29, 2018, 11:58
Default
  #46
Senior Member
 
Micael
Join Date: Mar 2009
Location: Canada
Posts: 156
Rep Power: 18
Micael is on a distinguished road
Flow Setup:
as OP
except the mesh is 215 x 215 x 215 (about 10M cells)

Software/Hardware:
Operating system: CentOS Linux 7
Fluent version: Ansys Fluent 19.1
CPU: Dual Xeon Gold 6150, HT disabled
Memory: 192 GB DDR4-2666 ECC (12 dimm x 16 GB)
GPU: 4 x V100-32GB NVLINK

Ran double precision only.

32-core
Coupled None-GPU: 577 s
Coupled 1-GPU: Failed, apparently out of memory
Coupled 2-GPU: 541 s
Coupled 4-GPU: 394 s
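
As a quick check on the larger case, the sketch below recomputes the cell count from the mesh dimensions and derives the GPU speedups from the timings above. Variable names are illustrative; the 1-GPU run is left out because it failed.

```python
# Mesh size and timings (seconds) copied from the post above.
cells = 215 ** 3  # cells in a 215 x 215 x 215 mesh

cpu_only_s = 577.0
gpu_runs_s = {2: 541.0, 4: 394.0}  # number of GPUs -> wall-clock seconds

# Speedup of each GPU run relative to the 32-core CPU-only run.
speedup_vs_cpu = {gpus: cpu_only_s / seconds for gpus, seconds in gpu_runs_s.items()}

# Gain from doubling the GPU count from 2 to 4 (ideal scaling would give 2.0).
gain_2_to_4 = gpu_runs_s[2] / gpu_runs_s[4]

print(f"cells: {cells:,}")
for gpus, value in speedup_vs_cpu.items():
    print(f"{gpus} GPUs: {value:.2f}x vs. CPU only")
print(f"2 -> 4 GPUs: {gain_2_to_4:.2f}x")
```

So on this mesh the GPUs do give a net gain on 32 cores, unlike the small case, but four V100s still deliver well under a 2x improvement over CPU only.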

Old   April 12, 2019, 18:02
Default I am using a TITAN V and I cannot load this GPU with my simulation
  #47
New Member
 
Miguel Rodriguez
Join Date: Jan 2017
Posts: 1
Rep Power: 0
mrodriguez is on a distinguished road
Quote:
Originally Posted by KEDELLE View Post
Did you use Titan V or Titan X?
I bought a Titan V. I did not know that this card does not work with Ansys Fluent. Has anyone asked you how to resolve this problem? Is it possible to load this GPU with a simulation? I followed every step to activate the GPU, but it does not work. Please help me.

Old   April 12, 2019, 19:15
Default
  #48
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
flotus1 has a spectacular aura about
Seems like this topic comes around again every now and then...
If the solver you are using is not utilizing the GPU despite it being activated in the Fluent launcher, then you won't see much benefit from the GPU anyway.
You could force Fluent to use the GPU with some TUI commands, but again, expect to see no improvement or even worse performance with a GPU enabled in these cases. https://www.sharcnet.ca/Software/Ans...-EC933A7E.html
Or maybe Ansys decided to use a whitelist for GPUs in Fluent just like they did with some of their other software. It's been a while since I last used it.

Old   June 1, 2020, 16:33
Default
  #49
New Member
 
sida
Join Date: Dec 2019
Posts: 6
Rep Power: 6
sida is on a distinguished road
Quote:
Originally Posted by flotus1 View Post
If you feel that any specific aspect is missing or conclusions drawn are flawed I would recommend addressing it directly.
I will gladly re-run or add a few benchmarks with Tesla V100. Contact me via PM if you want to send over a few samples.
Hi

I'm very eager to see the results of your tests with the Tesla V100. Before reading your posts, I was going to combine a Threadripper 3970X with a Quadro RTX 4000, but now that GPU acceleration turns out to be less effective and justifiable than advertised, what alternative to the Quadro RTX 4000 do you suggest?

Old   June 1, 2020, 17:59
Default
  #50
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
flotus1 has a spectacular aura about
It should come as no surprise that I never got any samples. I was not really expecting that.

I don't have any alternative in the price range of a Quadro RTX 4000 card. Well, Nvidia doesn't, but that is splitting hairs. GPU acceleration with Ansys products is for people with a virtually unlimited hardware budget, because software, engineers and development time are so much more expensive than a workstation. My advice to everyone else is to focus on CPU performance first.
If you really want to do GPU acceleration on a budget, try used Quadro K6000 cards. They can be found for around $300-400. That is of course if you want to do double precision. With single precision, any semi-recent CUDA-capable card should do. The consumer cards offer much better value than the Quadro and Tesla lineup here.
lev likes this.

Old   June 2, 2020, 00:38
Default
  #51
New Member
 
sida
Join Date: Dec 2019
Posts: 6
Rep Power: 6
sida is on a distinguished road
Thanks for the quick response. It was a relief after days of research.

Best

Old   June 2, 2020, 01:34
Default
  #52
New Member
 
sida
Join Date: Dec 2019
Posts: 6
Rep Power: 6
sida is on a distinguished road
Also, this article, which uses OpenFOAM, can help us understand that an investment in CPU is much more reliable than an investment in GPGPU, at least when it comes to CFD.


Multi GPU Implementation to Accelerate the CFD Simulation of a 3D Turbo-Machinery Benchmark Using the RapidCFD Library

https://link.springer.com/chapter/10...030-38043-4_15
lev likes this.

Old   February 27, 2021, 11:28
Default
  #53
New Member
 
Bhanuday Sharma
Join Date: Jun 2015
Posts: 18
Rep Power: 10
bhanuday.sharma is on a distinguished road
It would have been helpful if you had uploaded your .cas / mesh file, so that other users could quickly test their own system configurations.

Old   April 28, 2023, 09:08
Default Summa summarum
  #54
Member
 
Stabum's Avatar
 
Join Date: Aug 2012
Location: Italy
Posts: 66
Rep Power: 13
Stabum is on a distinguished road
Quote:
Originally Posted by flotus1 View Post
The topic of GPU acceleration for Ansys Fluent sometimes seems to be shrouded in mystery. So I ran a few benchmarks to answer some frequently asked questions and get a snapshot of the capability of this feature in 2017.

...

Edit: here is a nearly exhaustive list of Nvidia GPUs with high DP capabilities:
Please correct me if I'm summarizing this too roughly:

In the case of medium parallelization (max 64/128 cores), GPGPU can be worthwhile only if:

1) you're using coupled algorithms;
AND
2) you're using powerful graphics cards (Quadro 5000 or above).

Beyond that, GPU RAM must be big enough to hold the mesh of the problem you're going to study. This means that if your average mesh requires around 64 GB of RAM (I understand that this may sound quite small to some of you, but for those who don't work at NASA it's quite a lot!) and you plan to use Quadro RTX 5000s (16 GB each), you would need a quad-GPU setup or more...
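
The card count in that example can be sketched as a one-line calculation. This is only a lower bound under the simplifying assumption that the working set splits evenly across cards with no per-card overhead; the function name is illustrative, and the 32 GB and 80 GB cases use the V100-32GB and H100 capacities mentioned elsewhere in this thread.

```python
import math


def min_cards_needed(working_set_gb, vram_per_card_gb):
    """Lower bound on the number of cards whose combined VRAM holds the case.

    Assumes an even split across cards and ignores any per-card overhead,
    so the real requirement may be higher.
    """
    return math.ceil(working_set_gb / vram_per_card_gb)


print(min_cards_needed(64, 16))  # 64 GB case on 16 GB cards
print(min_cards_needed(64, 32))  # same case on 32 GB cards
print(min_cards_needed(64, 80))  # same case on an 80 GB card
```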

Given all this, it's absolutely impossible for a "normal" user to make use of GPGPU technology in CFD.

Many thanks,
C.

Old   April 28, 2023, 10:27
Default
  #55
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
flotus1 has a spectacular aura about
Some things changed since I originally posted my little experiment.
For example, Ansys now has a native GPU solver, which allegedly runs much faster. I cannot comment on that claim.

Some things didn't change though, at least not for the better. Commercial GPU solvers -native or otherwise- don't have feature parity with the established CPU counterparts. If you have everything you need, or are willing to change your workflow to accommodate the missing features, maybe it is for you.
GPU memory is still a scarce resource. Nvidia now sells cards with 80GB of VRAM (e.g. H100), but of course these are at the high end for data centers.
And noteworthy FP64 performance is reserved for very few products at the high end. Everything else is cut down to a 1:32 divider for FP64.
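
To illustrate what a 1:32 divider means in practice, here is a minimal sketch. The 32 TFLOPS FP32 figure is a hypothetical round number chosen for easy arithmetic, not the spec of any particular card, and the function name is illustrative.

```python
def fp64_peak_tflops(fp32_peak_tflops, divider=32):
    """Theoretical FP64 peak for a card whose FP64 rate is FP32 / divider."""
    return fp32_peak_tflops / divider


# Hypothetical card with 32 TFLOPS of FP32 peak (assumed value, for illustration).
hypothetical_fp32 = 32.0
print(fp64_peak_tflops(hypothetical_fp32))             # 1:32 divider
print(fp64_peak_tflops(hypothetical_fp32, divider=2))  # 1:2, typical of FP64-focused parts
```

With a 1:32 divider, only about 3% of the advertised FP32 peak is available to a double-precision solver.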

My main motivation for writing this article in the first place was this: people here regularly asked "Which graphics card should I buy to get good acceleration in my new Fluent workstation? My total budget is -insert figure below 10000€-".
For the vast majority of cases, the answer is: just stick to maximizing CPU performance.

GPU acceleration or computation with commercial CFD solvers is for data centers. The hardware is just too expensive to make it work in any other setting. This is a trend that has only accelerated over the last few years.
Or to put it very bluntly: if you have to ask me -a random stranger on the internet- for advice, you probably should not bother with GPUs.

Of course, if you like to tinker with used hardware, don't let me stop you. P100s go for less than 300€ on eBay these days.
oswald, Stabum and wkernkamp like this.

Old   May 11, 2023, 08:31
Default
  #56
Senior Member
 
Arjun
Join Date: Mar 2009
Location: Nurenberg, Germany
Posts: 1,273
Rep Power: 34
arjun will become famous soon enough
Quote:
Originally Posted by flotus1 View Post

GPU acceleration or computation with commercial CFD solvers is for data centers. The hardware is just too expensive to make it work in any other setting.
I agree with pretty much everything here except this one bit, where I do not completely agree.

I understand this perception comes from the presentations by Ansys and Siemens, where they show results from top-of-the-line GPUs that a normal person would not have on the desktop. So yes, if users confine themselves to these few names, then what you said is very true.


But with Wildkatze I try to focus on what a layman could have on the desktop and what we can gain from it. Even with my old 2080 Ti I am able to get a speedup of almost 25 to 30x. If I were to pick a current GPU, like the 40XX series, this scaling would go a long way, and that is something a normal user can afford.


The only problem I can see is that people still won't use the solver because they do not know the name (people use what they know and don't want to try anything new). But if they want to, they can get a good speedup from the GPU here.

Old   May 11, 2023, 11:21
Default
  #57
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
flotus1 has a spectacular aura about
I find it very commendable that you put in the effort to make it work with hardware we can actually get our hands on.
But a 25x speedup from a 2080 Ti compared to CPU begs the question: what CPU are you comparing to? And are we talking multi-threaded or single-threaded?
Please don't take this the wrong way, but such outrageously high speedups from GPU acceleration, when comparing to a reasonably modern CPU, usually get you a few raised eyebrows in the HPC community. Because it usually means that the CPU implementation simply does not have the same level of optimization as the GPU implementation.
When looking at the raw specs like theoretical FP32 operations per second, or memory bandwidth, there is not a 25x gap between CPUs and GPUs. At least leaving aside hardware accelerated operations.
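
The raw-spec argument can be sketched with theoretical memory bandwidth, which is often the limiting resource for CFD solvers. The GPU figures below (352-bit bus, 14 GT/s per pin) are the RTX 2080 Ti specs as far as I know them; the CPU side is an assumed quad-channel DDR4-2933 workstation chosen purely for illustration, and the function name is hypothetical.

```python
def peak_bandwidth_gbs(bus_width_bits, transfer_rate_gtps):
    """Theoretical peak memory bandwidth in GB/s: bytes per transfer x transfers per second."""
    return bus_width_bits / 8 * transfer_rate_gtps


# RTX 2080 Ti: 352-bit GDDR6 at 14 GT/s per pin (publicly listed specs, to my knowledge).
gpu_bw = peak_bandwidth_gbs(352, 14.0)

# Assumed CPU platform: four 64-bit DDR4 channels at 2933 MT/s (illustrative, not from the thread).
cpu_bw = peak_bandwidth_gbs(4 * 64, 2.933)

ratio = gpu_bw / cpu_bw
print(f"GPU: {gpu_bw:.0f} GB/s, CPU: {cpu_bw:.1f} GB/s, ratio: {ratio:.1f}x")
```

Under these assumptions the bandwidth gap is well below 10x, which is why a 25x speedup prompts questions about how well the CPU baseline is optimized.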

Old   May 11, 2023, 11:28
Default RTX A6000
  #58
New Member
 
Join Date: Jun 2018
Posts: 4
Rep Power: 7
KEDELLE is on a distinguished road
I have a barely used RTX A6000 for sale at 4500.

What you need is enough VRAM so that the CAD mesh doesn't have to be continuously broken up and sent back and forth from the SSD to the CPU to the GPU.

Old   May 11, 2023, 12:29
Default
  #59
Senior Member
 
Arjun
Join Date: Mar 2009
Location: Nurenberg, Germany
Posts: 1,273
Rep Power: 34
arjun will become famous soon enough
Quote:
Originally Posted by flotus1 View Post
I find it very commendable that you put in the effort to make it work with hardware we can actually get our hands on.
But a 25x speedup from a 2080 Ti compared to CPU begs the question: what CPU are you comparing to? And are we talking multi-threaded or single-threaded?
Please don't take this the wrong way, but such outrageously high speedups from GPU acceleration, when comparing to a reasonably modern CPU, usually get you a few raised eyebrows in the HPC community. Because it usually means that the CPU implementation simply does not have the same level of optimization as the GPU implementation.
When looking at the raw specs like theoretical FP32 operations per second, or memory bandwidth, there is not a 25x gap between CPUs and GPUs. At least leaving aside hardware accelerated operations.

I was being very casual, so I did not think of listing the CPU etc. The CPU here is an AMD 2990WX with 32 cores, which I am sure you know. The machine has 128 GB of RAM, but the whole thing was run on the GPU in double precision.

The idea I am working on is to make a GPU engine on which people can run cases from different solvers. At the moment it runs cases from the two software packages I have. If someone helps me with an OpenFOAM loader (a translator), it should be able to run those cases too. That's the idea so far.


Edited to add: So far in our testing we are within 5% of STAR-CCM+ in cost per iteration. This should give some idea about the implementation (we usually take fewer iterations to converge compared to STAR-CCM+; with Fluent I never could compare).
wkernkamp likes this.

Old   May 11, 2023, 14:20
Default
  #60
Senior Member
 
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 498
Rep Power: 20
JBeilke is on a distinguished road
I'm not sure about the right terminology in GPU computing, but Arjun's implementation comes without domain decomposition, so there is just one single processor domain. This seems to be a good way to accelerate cases that have limited potential for parallel speedup.

I have seen the MotorBike benchmark running on his machine with a preliminary version of the code and it was more than impressive.
arjun likes this.