#1 |
New Member
F
Join Date: Feb 2014
Posts: 2
Rep Power: 0
Hello,
My first post in this forum is about GPUs. I'm setting up a new workstation to run OpenFOAM and ANSYS for CFD and mechanical simulation, mainly for learning. I would really like a GPU to shorten solution times. The GeForce GTX Titan Black is now available in stores. This graphics card has the GK110 core, the same as the Quadro K6000 and the Tesla K20X. Its double-precision rate is 1.7 TFLOPS, which is actually higher than the 1.3 TFLOPS of the Tesla K20X. I know ECC memory is not supported on the GTX card, but at this moment cost is more of an issue than ECC support. My question: does anyone know whether the GTX will work as a compute GPU in ANSYS and OpenFOAM?
Thanks, Frode
#2 |
Senior Member
Join Date: Mar 2009
Location: Austin, TX
Posts: 160
Rep Power: 19
GPU-accelerated CFD still isn't ready for real work. Some codes support it, and there are several plugins for OpenFOAM, but the actual speedup is minimal (and often negative).
It's much more cost-effective to buy a second workstation and connect the two via Ethernet. InfiniBand doesn't become necessary until you get to 4+ machines. The card you are talking about costs $1000+. Instead you could get a $200 GPU and an $800 second node, which would essentially halve your solution times.
#3 |
Senior Member
Erik
Join Date: Feb 2011
Location: Earth (Land portion)
Posts: 1,193
Rep Power: 24
I have never tried a Titan, but my GTX 680 doesn't work with ANSYS. It reports that my GPU is not supported and that I must have a Tesla or Quadro card. ANSYS works with NVIDIA, so I'm almost positive they would force you to buy their "professional" GPU compute card.
#4 |
New Member
Join Date: Mar 2009
Posts: 10
Rep Power: 18
I was also curious about Fluent and GPGPU, so I ran a case (3ddp, pbns, rke, coupled solver, 41K cells) on a Windows 7 PC (i7 3770K) with an NVIDIA GeForce GTX 670 graphics card (1344 CUDA cores).
The Fluent command /parallel/gpgpu/show returned:
Code:
    CUDA visible GPUs on rocky
    CUDA runtime version 5000
    Driver version 6000
    Number of GPUs 1
    0. GeForce GTX 670 (*) 7 SMs 0.98 GHz 2.14748 GBytes
Code:
    Performance Timer for 56 iterations on 2 compute nodes
      Average wall-clock time per iteration: 1.145 sec
      Global reductions per iteration: 31 ops
      Global reductions time per iteration: 0.000 sec (0.0%)
      Message count per iteration: 62 messages
      Data transfer per iteration: 0.232 MB
      LE solves per iteration: 2 solves
      LE wall-clock time per iteration: 0.011 sec (0.9%)
      LE global solves per iteration: 2 solves
      LE global wall-clock time per iteration: 0.000 sec (0.0%)
      LE global matrix maximum size: 41000
      AMG cycles per iteration: 4.107 cycles
      Relaxation sweeps per iteration: 384 sweeps
      Relaxation exchanges per iteration: 387 exchanges
      Total wall-clock time: 64.102 sec
      Total CPU time: 127.531 sec
Code:
    Performance Timer for 56 iterations on 2 compute nodes
      Average wall-clock time per iteration: 0.218 sec
      Global reductions per iteration: 44 ops
      Global reductions time per iteration: 0.000 sec (0.0%)
      Message count per iteration: 441 messages
      Data transfer per iteration: 0.472 MB
      LE solves per iteration: 3 solves
      LE wall-clock time per iteration: 0.142 sec (65.0%)
      LE global solves per iteration: 6 solves
      LE global wall-clock time per iteration: 0.000 sec (0.1%)
      LE global matrix maximum size: 22
      AMG cycles per iteration: 8.286 cycles
      Relaxation sweeps per iteration: 566 sweeps
      Relaxation exchanges per iteration: 572 exchanges
      Total wall-clock time: 12.210 sec
      Total CPU time: 24.399 sec
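For anyone who wants to compare the two Performance Timer reports above without doing the division by hand, here is a minimal Python sketch. It only parses a few lines copied from the reports. The post does not say which run had GPGPU acceleration enabled, so the runs are labelled "first" and "second"; the helper names `read_seconds` and `ratio` are made up for this example.

```python
import re

# A few lines copied from the two timer reports in this post.
# Assumption: the post does not state which run used the GPU,
# so no "GPU" / "CPU" label is attached to either report here.
FIRST_REPORT = """\
Performance Timer for 56 iterations on 2 compute nodes
  Average wall-clock time per iteration: 1.145 sec
  LE wall-clock time per iteration: 0.011 sec (0.9%)
  Total wall-clock time: 64.102 sec
"""

SECOND_REPORT = """\
Performance Timer for 56 iterations on 2 compute nodes
  Average wall-clock time per iteration: 0.218 sec
  LE wall-clock time per iteration: 0.142 sec (65.0%)
  Total wall-clock time: 12.210 sec
"""


def read_seconds(report, label):
    """Return the number of seconds that follows 'label:' in a report."""
    pattern = r"^\s*" + re.escape(label) + r":\s*([0-9.]+)\s*sec"
    match = re.search(pattern, report, flags=re.MULTILINE)
    if match is None:
        raise ValueError("label not found: " + label)
    return float(match.group(1))


def ratio(label):
    """First-run time divided by second-run time for one timer line."""
    return read_seconds(FIRST_REPORT, label) / read_seconds(SECOND_REPORT, label)


total_ratio = ratio("Total wall-clock time")
iteration_ratio = ratio("Average wall-clock time per iteration")
le_ratio = ratio("LE wall-clock time per iteration")

print("total wall-clock ratio (first/second):   %.2f" % total_ratio)
print("per-iteration ratio (first/second):      %.2f" % iteration_ratio)
print("linear-solver (LE) ratio (first/second): %.3f" % le_ratio)
```

The total and per-iteration ratios both come out at about 5.25, while the linear-solver (LE) time moves in the opposite direction (0.011 s versus 0.142 s per iteration). Given the note in post #6 that re-reading the same case changed the run time from 59 s to 7.5 s, a single pair of timings like this is probably not enough to attribute the difference to the GPU.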
#5 | |
New Member
F
Join Date: Feb 2014
Posts: 2
Rep Power: 0
How is the GTX 670's double-precision rate? I think it is rather poor. Your analysis also seems rather small, given the short solve time; I've read that a GPU is only beneficial for larger analyses. Did you just activate it in ANSYS and it worked? Could you test whether Mechanical also accepts the card?
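A rough way to answer the double-precision question is to estimate the theoretical peak from the number of FP64 units and the clock. The sketch below is only that, a back-of-the-envelope estimate: the 7 SMs and 0.98 GHz for the GTX 670 come from the Fluent output in post #4, but the FP64 units per SM (8 for GK104, 64 for GK110), the SM counts (15 for the Titan Black, 14 for the K20X) and the base clocks (0.889 GHz and 0.732 GHz) are assumptions taken from NVIDIA's published specifications, not from this thread. The function name `peak_dp_gflops` is made up for this example.

```python
def peak_dp_gflops(sm_count, dp_units_per_sm, clock_ghz):
    """Theoretical peak double-precision GFLOPS.

    Each FP64 unit is assumed to retire one fused multiply-add
    (counted as 2 floating-point operations) per clock cycle.
    """
    flops_per_unit_per_cycle = 2
    return sm_count * dp_units_per_sm * flops_per_unit_per_cycle * clock_ghz


# GTX 670: SM count and clock as printed by Fluent in post #4;
# 8 FP64 units per SM is an assumed GK104 figure.
gtx_670 = peak_dp_gflops(sm_count=7, dp_units_per_sm=8, clock_ghz=0.98)

# Titan Black and Tesla K20X: all three inputs are assumed spec values.
titan_black = peak_dp_gflops(sm_count=15, dp_units_per_sm=64, clock_ghz=0.889)
tesla_k20x = peak_dp_gflops(sm_count=14, dp_units_per_sm=64, clock_ghz=0.732)

for name, gflops in [("GTX 670", gtx_670),
                     ("GTX Titan Black", titan_black),
                     ("Tesla K20X", tesla_k20x)]:
    print("%-16s ~%.2f TFLOPS double precision" % (name, gflops / 1000.0))
```

Under these assumptions the Titan Black and K20X estimates land on the 1.7 and 1.3 TFLOPS quoted in post #1, and the GTX 670 comes out at roughly 0.11 TFLOPS, which supports the "rather poor" guess. These are peak numbers; sustained performance in a CFD solver is usually limited by memory bandwidth rather than arithmetic.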
#6 | ||
New Member
Join Date: Mar 2009
Posts: 10
Rep Power: 18
Quote:
Code:
    > it 500
      iter continuity x-velocity y-velocity z-velocity k epsilon time/iter
    AMG on GPGPU
    NVAMG version 4
    Built on Aug 21 2013, 10:28:27
    NVAMG ERROR: file ../../src/amg_gpu.c line 863
    NVAMG ERROR: CUDA kernel launch error
Yes, I used the command fluent 3ddp -g -t2 -gpgpu=1.
Quote:
I also discovered the following: if I start Fluent, read the case, initialize, and run the calculation, it takes 59 seconds to reach the convergence criteria. When the case is read again, initialized, and the calculation restarted, it takes only 7.5 seconds.
#7 |
Senior Member
Erik
Join Date: Feb 2011
Location: Earth (Land portion)
Posts: 1,193
Rep Power: 24
#8 |
New Member
Join Date: Mar 2013
Posts: 1
Rep Power: 0
For ANSYS Fluent 15 there is a GPU user guide that covers optimal settings and a lot of other information:
http://www.nvidia.com/content/tesla/...-userguide.pdf
For tips on using GPU acceleration with ANSYS Mechanical:
http://www.nvidia.com/content/tesla/...-with-gpus.pdf
#9 |
New Member
Damien Smith
Join Date: Jul 2014
Posts: 3
Rep Power: 12
Speaking about OpenFOAM only, I am very dubious about the potential for significant speedups from a GPU with what is available right now. I tried compiling for my Quadro graphics card without any luck. Looking into it a bit further, I think a GPU will work very well for applications that are embarrassingly parallel, but OpenFOAM is memory-intensive and requires significant internode communication. It is always tempting to look at the headline specs for a processor or GPU, but at least with OpenFOAM, making use of that processing power without getting mugged by Amdahl's Law is pretty difficult: you will run out of memory bandwidth or internode communication first. I think the answer is to build a well-optimized cluster. All the required elements, both hardware and software, are straightforward, and you will get a known speedup.
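To put a number on the Amdahl's Law point, here is a small Python sketch. It assumes that only one part of each iteration (for example the linear solver) is offloaded to the GPU and that everything else runs at the original speed. The 65% fraction is borrowed from the "LE wall-clock time per iteration" line in post #4's second timer report purely as an illustration; the component speedups of 2x, 5x and 10x are hypothetical, and the function names are made up for this example.

```python
def amdahl_speedup(accelerated_fraction, component_speedup):
    """Overall speedup when only part of the run time is accelerated.

    accelerated_fraction: share of the original run time that benefits (0..1).
    component_speedup: how much faster that share becomes (> 0).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if component_speedup <= 0.0:
        raise ValueError("component_speedup must be positive")
    serial_part = 1.0 - accelerated_fraction
    return 1.0 / (serial_part + accelerated_fraction / component_speedup)


def amdahl_limit(accelerated_fraction):
    """Upper bound on the overall speedup as component_speedup grows without limit."""
    if accelerated_fraction >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - accelerated_fraction)


# Illustrative input: 65% of the iteration time spent in the linear solver.
fraction = 0.65

for hypothetical_gpu_speedup in (2.0, 5.0, 10.0):
    overall = amdahl_speedup(fraction, hypothetical_gpu_speedup)
    print("solver %4.1fx faster -> whole run %.2fx faster"
          % (hypothetical_gpu_speedup, overall))

print("upper bound for this fraction: %.2fx" % amdahl_limit(fraction))
```

With 65% of the time in the accelerated part, the whole run can never get more than about 2.9 times faster, however fast the GPU is, and that is before counting the cost of moving data between host and device.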
#10 |
New Member
Join Date: Mar 2013
Location: Canada
Posts: 22
Rep Power: 14
Following up to ask whether anyone has VERIFIED that Fluent (I am asking about Fluent only) works with the GTX Titan. Screenshots or output from the Fluent command line showing the card being used would be much appreciated.
#11 |
New Member
Join Date: Mar 2013
Location: Canada
Posts: 22
Rep Power: 14
I can definitively confirm that Fluent can "exploit" a GPU (at least in V16), even one that is not on the supported list published by ANSYS.
But performance is much worse with the GPU enabled than without; in my case the card is a GTX580M with 2GB of GDDR5 RAM (GF114M, CC 2.1).
http://www.cfd-online.com/Forums/flu...pu-fluent.html
#12 |
Senior Member
kunar
Join Date: Nov 2011
Posts: 117
Rep Power: 15
Dear Friends,
I am trying to estimate the computational time (in days) for FSI problems (example: flexible flapping wings) in terms of FLOPS. In the few papers I have seen, 800 teraflops is mentioned as sufficient. Which is appropriate for this kind of problem: teraflops, petaflops, or exaflops? What is the difference in simulation time (in days) between them? And how do I estimate the FLOPS required for this kind of problem? Please assist me.
Regards, HHH
__________________
kunar
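One simple way to frame the question above is: run time equals the total number of floating-point operations the simulation needs, divided by the rate the machine actually sustains. The Python sketch below shows only that arithmetic. The total work of 1e21 operations and the 5% sustained efficiency are hypothetical placeholders, not figures for any real FSI case; the only number taken from the post is the 800 teraflops. The function name `runtime_days` is made up for this example.

```python
SECONDS_PER_DAY = 86400.0


def runtime_days(total_flop, peak_flops, efficiency):
    """Estimated run time in days.

    total_flop: total floating-point operations the job needs (a count).
    peak_flops: peak machine rate in operations per second.
    efficiency: fraction of peak the solver actually sustains (0..1].
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    sustained_rate = peak_flops * efficiency
    return total_flop / sustained_rate / SECONDS_PER_DAY


# Hypothetical inputs, for illustration only.
total_work = 1.0e21        # assumed operation count for the whole simulation
sustained_fraction = 0.05  # assumed share of peak a CFD/FSI code reaches

machines = [
    ("1 teraflops", 1.0e12),
    ("800 teraflops", 800.0e12),
    ("1 petaflops", 1.0e15),
    ("1 exaflops", 1.0e18),
]

for name, peak in machines:
    days = runtime_days(total_work, peak, sustained_fraction)
    print("%-14s -> %.3f days" % (name, days))
```

Each step from tera to peta to exa divides the estimate by 1000 if the efficiency stays the same, which it rarely does in practice. The hard part is the operation count itself; one practical approach is to time a small version of the case on known hardware and scale from there.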
Tags: gpu, gtx, tesla
Thread | Thread Starter | Forum | Replies | Last Post |
GPU acceleration on ANSYS Fluent 14.5 | Daveo643 | FLUENT | 20 | April 28, 2018 13:50 |
NVIDIA GeForce GTX 690 Modified Into Quadro K5000 and Tesla K10 | HMN | Hardware | 1 | October 20, 2013 06:34 |