OpenFOAM and CUDA

Old   March 23, 2010, 17:20
Default OpenFOAM and CUDA
  #1
Member
 
Lukasz Miroslaw
Join Date: Dec 2009
Location: Poland
Posts: 66
Dear All,

We announce that the SpeedIT Toolkit 0.9 has been released. The library has been internally deployed, tested and validated in a real scenario (blood flow in aortic bifurcation, data came from IT’IS Foundation, Switzerland).

http://vratis.com/speedITblog/

The library contains the CG and BiCGSTAB solvers and has been tested with Nvidia GTX 2xx and the newest Tesla.
The plugin for OpenFOAM is free and is licensed under the GPL.

We are currently working on AMG and LB solvers, which should appear in Q2 2010.

Best regards,
Lukasz

Old   March 24, 2010, 08:35
Default Post in OpenFOAM Forum
  #2
Senior Member
 
Richard Smith
Join Date: Mar 2009
Location: Enfield, NH, USA
Posts: 138
Blog Entries: 4
Great work, you might also consider making an announcement in the
OpenFOAM Announcements from Other Sources forum category.
__________________
Symscape, Computational Fluid Dynamics for all

Old   March 25, 2010, 14:40
Default
  #3
Member
 
Lukasz Miroslaw
Join Date: Dec 2009
Location: Poland
Posts: 66
I have just done that. Thanks.

Old   June 3, 2010, 14:36
Default
  #4
Member
 
Francois Gallard
Join Date: Mar 2010
Location: Toulouse, France
Posts: 43
Great,
I am very interested in your work. It seems very promising. Would your libraries be usable on clusters of machines with CPUs and GPUs?

Thanks for sharing this

Francois

Old   July 8, 2010, 12:18
Default
  #5
Member
 
Lukasz Miroslaw
Join Date: Dec 2009
Location: Poland
Posts: 66
We have finished the work and the official release is out. You can find the OpenFOAM plugin for GPU-based iterative solvers (Conjugate Gradient and BiCGSTAB) at speedit.vratis.com. The Classic version of our library and the OpenFOAM plugin are both licensed under the GPL. Enjoy!

Old   October 6, 2010, 06:05
Default
  #6
Member
 
Alex
Join Date: Apr 2010
Posts: 32
Hi Lukasz,

I'm trying to use the SpeedIT toolkit and downloaded the free Classic version.
I followed the README files and recompiled OpenFOAM in single precision.

The icoFoam cavity tutorial runs with the PCG_accel solver, however it is about ten times slower than the normal PCG solver. Both are run in single precision with diagonal preconditioner. Below are the final iterations of both runs.
Time = 0.5

Courant Number mean: 0.116925 max: 0.852129
DILUPBiCG: Solving for Ux, Initial residual = 2.4755e-07, Final residual = 2.4755e-07, No Iterations 0
DILUPBiCG: Solving for Uy, Initial residual = 4.45417e-07, Final residual = 4.45417e-07, No Iterations 0
diagonalPCG: Solving for p, Initial residual = 1.85634e-06, Final residual = 8.29721e-07, No Iterations 1
time step continuity errors : sum local = 1.37325e-08, global = 2.27462e-10, cumulative = 1.50401e-09
diagonalPCG: Solving for p, Initial residual = 1.70986e-06, Final residual = 8.12331e-07, No Iterations 1
time step continuity errors : sum local = 1.43066e-08, global = -2.99404e-10, cumulative = 1.20461e-09
ExecutionTime = 0.16 s ClockTime = 0 s


And with the SpeedIT solver:
Time = 0.5

Courant Number mean: 0.116925 max: 0.852129
DILUPBiCG: Solving for Ux, Initial residual = 2.2693e-07, Final residual = 2.2693e-07, No Iterations 0
DILUPBiCG: Solving for Uy, Initial residual = 4.88815e-07, Final residual = 4.88815e-07, No Iterations 0
diagonalPCG_accel: Solving for p, Initial residual = 0, Final residual = 5.64166e-07, No Iterations 1
time step continuity errors : sum local = 2.09718e-08, global = -1.48015e-10, cumulative = -1.09157e-10
diagonalPCG_accel: Solving for p, Initial residual = 0, Final residual = 8.90977e-07, No Iterations 0
time step continuity errors : sum local = 2.34665e-08, global = 1.11866e-10, cumulative = 2.70921e-12
ExecutionTime = 1.43 s ClockTime = 1 s



Secondly, I wanted to try it on the simpleFoam pitzDaily case, but there I get the message:
ERROR : solver function returned -1

For example the final iteration is:
Time = 1000

DILUPBiCG: Solving for Ux, Initial residual = 1.87909e-05, Final residual = 5.75224e-08, No Iterations 2
DILUPBiCG: Solving for Uy, Initial residual = 0.000241922, Final residual = 9.22941e-06, No Iterations 1
ERROR : solver function returned -1
diagonalPCG_accel: Solving for p, Initial residual = 0, Final residual = 5.57732e-05, No Iterations 1000
time step continuity errors : sum local = 0.00215994, global = -1.51208e-05, cumulative = 16.2167
DILUPBiCG: Solving for epsilon, Initial residual = 4.7183e-05, Final residual = 4.28301e-06, No Iterations 1
DILUPBiCG: Solving for k, Initial residual = 8.54257e-05, Final residual = 3.38759e-06, No Iterations 1
ExecutionTime = 818.05 s ClockTime = 820 s


Here I have to say that the normal single precision simpleFoam does not even work for this tutorial. With PCG_accel the tutorial can be run, however with the error message. I'm therefore not sure whether this error message results from PCG_accel. Here single precision with PCG_accel is about 4 times slower than the normal double precision PCG (179.52 s).


Can you explain why the accelerated solver is slower than the normal solver?

Best regards,
Alex.
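
The slowdown figures quoted in this post can be checked directly from the ExecutionTime values in the logs. A minimal sketch; the variable and function names are illustrative and not part of OpenFOAM or SpeedIT:

```python
# Execution times quoted in the post above, in seconds.
cavity_pcg = 0.16          # icoFoam cavity, diagonalPCG (single precision)
cavity_pcg_accel = 1.43    # icoFoam cavity, diagonalPCG_accel (single precision)
pitz_pcg_double = 179.52   # simpleFoam pitzDaily, PCG (double precision)
pitz_pcg_accel = 818.05    # simpleFoam pitzDaily, PCG_accel (single precision)


def slowdown(accelerated: float, baseline: float) -> float:
    """Return how many times slower the accelerated run is than the baseline."""
    return accelerated / baseline


cavity_ratio = slowdown(cavity_pcg_accel, cavity_pcg)
pitz_ratio = slowdown(pitz_pcg_accel, pitz_pcg_double)

print(f"cavity:    {cavity_ratio:.1f}x slower")
print(f"pitzDaily: {pitz_ratio:.1f}x slower")
```

Note that the cavity ExecutionTime values are very small and reported with only two decimals, so that ratio is only a rough indication.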

Old   October 6, 2010, 09:59
Default
  #7
Member
 
Lukasz Miroslaw
Join Date: Dec 2009
Location: Poland
Posts: 66
Dear Alex,

Before we can comment on the performance results you obtained, we need to know your hardware configuration.
Please remember that even the most powerful GPUs are only about ten times faster than modern CPUs.

Next, in your example the accelerated solver converges after 0 or 1 iterations. In this case most of the time in the solver routine is spent on data transfer between the CPU and GPU, not on computation on the GPU side. We described this phenomenon thoroughly in the documentation: on one of the fastest GPUs we obtained only a small performance increase when a single solver iteration was done. The performance gain was significantly larger when dozens of solver iterations were required.

The pitzDaily example shows that neither solver (OpenFOAM or SpeedIT) converges within the required number of iterations.
However, it seems that our solver could converge given a larger number of iterations.
I cannot comment on the performance comparison, because the OpenFOAM DOUBLE precision solver converges in far fewer iterations than our SINGLE precision solver. I think a comparison with our double precision solver should be done.

Sincerely
SpeedIT team
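
The overhead argument in this reply can be illustrated with a toy cost model: each GPU solve pays a fixed CPU-to-GPU transfer cost, then a smaller per-iteration cost. A minimal sketch; all numbers below are hypothetical and were not measured on any hardware, and the function names are assumptions for illustration only:

```python
def solve_time(iterations, per_iteration, fixed_overhead=0):
    """Total time of one linear solve: fixed cost plus cost per iteration."""
    return fixed_overhead + iterations * per_iteration


def break_even_iterations(cpu_per_iter, gpu_per_iter, transfer_overhead):
    """Smallest iteration count at which the GPU solve is strictly faster.

    Returns None if the GPU is not faster per iteration, because then the
    transfer overhead can never be recovered.
    """
    if gpu_per_iter >= cpu_per_iter:
        return None
    gain_per_iter = cpu_per_iter - gpu_per_iter
    return transfer_overhead // gain_per_iter + 1


# Hypothetical costs in microseconds: a GPU that is ten times faster per
# iteration, with a fixed transfer cost paid once per solve.
CPU_PER_ITER = 1000
GPU_PER_ITER = 100
TRANSFER = 9000

for n in (1, 10, 11, 100):
    cpu = solve_time(n, CPU_PER_ITER)
    gpu = solve_time(n, GPU_PER_ITER, TRANSFER)
    print(f"{n:4d} iterations: CPU {cpu:7d} us, GPU {gpu:7d} us")

print("break-even at", break_even_iterations(CPU_PER_ITER, GPU_PER_ITER, TRANSFER), "iterations")
```

With these made-up numbers a solve that stops after 0 or 1 iterations is dominated by the transfer, which matches the qualitative behaviour described for the cavity case.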

Old   October 7, 2010, 03:08
Default
  #8
Member
 
Alex
Join Date: Apr 2010
Posts: 32
Thanks for the reply.

I thought indeed that it was overhead in the first case. Unfortunately the combination of PCG/PCG_accel and diagonal/none preconditioning doesn't converge properly for the testcases I'm interested in (airfoil computations at the moment). So a good comparison on that part is not possible. As preconditioner for PCG I use GAMG or DIC, but I prefer GAMG as a solver actually. How is the progress in making GAMG run on the GPU?
For potentialFoam on an airfoil, I also see the error:
ERROR : solver function returned -1

I ran the cylinder tutorial of potentialFoam with PCG/PCG_accel using diagonal preconditioning. There it worked, although the accelerated version was slower. But I think it has to do with my hardware.
I'm running on a Quadro FX 1700 card with 512 MB memory. The clock speed is only 400 MHz for the memory and 460 MHz for the GPU. Due to our preinstalled system, I could not run with the latest driver and CUDA version. Currently I use driver 195.36.15 with CUDA 3.0. I didn't expect a huge speedup here, but perhaps a little. Do you expect any speedup for such a configuration?

I saw something strange on the log files of potentialFoam.
This is the normal PCG log:
Calculating potential flow
diagonalPCG: Solving for p, Initial residual = 1, Final residual = 9.58282e-07, No Iterations 305
diagonalPCG: Solving for p, Initial residual = 0.0126598, Final residual = 9.57773e-07, No Iterations 205
diagonalPCG: Solving for p, Initial residual = 0.00273797, Final residual = 9.74167e-07, No Iterations 188
diagonalPCG: Solving for p, Initial residual = 0.00101243, Final residual = 9.71138e-07, No Iterations 185
continuity error = 0.000682
Interpolated U error = 2.76476e-05
ExecutionTime = 0.06 s ClockTime = 0 s


And this is the PCG_accel log:
Calculating potential flow
diagonalPCG_accel: Solving for p, Initial residual = 0, Final residual = 9.49939e-07, No Iterations 247
diagonalPCG_accel: Solving for p, Initial residual = 0, Final residual = 9.0697e-07, No Iterations 240
diagonalPCG_accel: Solving for p, Initial residual = 0, Final residual = 9.77372e-07, No Iterations 231
diagonalPCG_accel: Solving for p, Initial residual = 0, Final residual = 9.58744e-07, No Iterations 223
continuity error = 0.000709772
Interpolated U error = 2.76527e-05
ExecutionTime = 0.43 s ClockTime = 0 s


Why does the initial residual start at 0 for every pressure loop, while in the normal solver it starts from a lower level each time? It doesn't seem to affect the results much, but the number of iterations increases since it starts at a higher level.

Alex.
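
The iteration counts in the two potentialFoam logs above can be totalled with a small script. A minimal sketch that parses the solver lines exactly as they appear in this post; the regular expression and helper name are assumptions written for this log format, not an OpenFOAM utility:

```python
import re

# Matches lines such as:
# diagonalPCG: Solving for p, Initial residual = 1, Final residual = 9.58282e-07, No Iterations 305
SOLVER_LINE = re.compile(
    r"^(?P<solver>\w+):\s+Solving for (?P<field>\w+), "
    r"Initial residual = (?P<initial>[^,\s]+), "
    r"Final residual = (?P<final>[^,\s]+), "
    r"No Iterations (?P<iters>\d+)"
)

PCG_LOG = """
diagonalPCG: Solving for p, Initial residual = 1, Final residual = 9.58282e-07, No Iterations 305
diagonalPCG: Solving for p, Initial residual = 0.0126598, Final residual = 9.57773e-07, No Iterations 205
diagonalPCG: Solving for p, Initial residual = 0.00273797, Final residual = 9.74167e-07, No Iterations 188
diagonalPCG: Solving for p, Initial residual = 0.00101243, Final residual = 9.71138e-07, No Iterations 185
"""

ACCEL_LOG = """
diagonalPCG_accel: Solving for p, Initial residual = 0, Final residual = 9.49939e-07, No Iterations 247
diagonalPCG_accel: Solving for p, Initial residual = 0, Final residual = 9.0697e-07, No Iterations 240
diagonalPCG_accel: Solving for p, Initial residual = 0, Final residual = 9.77372e-07, No Iterations 231
diagonalPCG_accel: Solving for p, Initial residual = 0, Final residual = 9.58744e-07, No Iterations 223
"""


def total_iterations(log_text, field="p"):
    """Sum the 'No Iterations' counts of all solver lines for one field."""
    total = 0
    for line in log_text.splitlines():
        match = SOLVER_LINE.match(line.strip())
        if match and match.group("field") == field:
            total += int(match.group("iters"))
    return total


pcg_total = total_iterations(PCG_LOG)
accel_total = total_iterations(ACCEL_LOG)
print("PCG total iterations:      ", pcg_total)    # 883
print("PCG_accel total iterations:", accel_total)  # 941
```

Over the four pressure solves the accelerated run needs 941 iterations against 883, so the difference in iteration count is small compared with the difference in execution time.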

Old   October 9, 2010, 14:31
Default Particle Tracking on GPU and Couple with OpenFOAM
  #9
Member
 
Stefan Radl
Join Date: Mar 2009
Location: Graz, Austria
Posts: 82
Hey,

has anybody tried this?

I think, it would be extremely interesting to have a simple particle tracking solver in OpenFOAM that uses the GPU.

cheers!

Old   December 13, 2010, 10:38
Default Openfoam Plugin speedIT Installation
  #10
Member
 
Mohammad.R.Shetab
Join Date: Jul 2010
Posts: 49
Dear Friends

I have successfully compiled CUDA 3.2 following the link below:



Then I downloaded files OpenFOAM_Plugin_1.1 and SpeedIT_Classic from the site : speedit.vratis.com

But unfortunately I don't know how to compile them. There is a README file in OpenFOAM_Plugin_1.1 that says to do the following:

1- cd $WM_PROJECT_USER_DIR

2- svn checkout https://62.87.249.40/repos/speedIT/b...SpeedIT_plugin

But at this step an ID and password are required, which I don't have.

Can anyone help? Does anyone know how to compile them?

Thank you

Old   December 13, 2010, 11:53
Default
  #11
Member
 
Lukasz Miroslaw
Join Date: Dec 2009
Location: Poland
Posts: 66
Check https://sourceforge.net/projects/openfoamspeedit/

Old   December 13, 2010, 12:15
Default compilation
  #12
Member
 
Mohammad.R.Shetab
Join Date: Jul 2010
Posts: 49
Dear Lukasz

I also downloaded this package, and its README contains the same step, which requires an ID and password.


svn checkout https://62.87.249.40/repos/speedIT/branches/1.0/OpenFOAM_SpeedIT_plugin

Is there any other way to compile this package?!

Old   December 13, 2010, 15:07
Default
  #13
Member
 
Lukasz Miroslaw
Join Date: Dec 2009
Location: Poland
Posts: 66
Quote:
Originally Posted by mrshb4 View Post
Dear Lukasz

svn checkout https://62.87.249.40/repos/speedIT/branches/1.0/OpenFOAM_SpeedIT_plugin

Is there any other way to compile this package?!
Thanks for finding the outdated information in our documentation.

No, you don't need to recompile it. It is a plugin, just follow the installation instructions in order to run it.

Old   December 14, 2010, 17:29
Default Unsuccessful!
  #14
Member
 
Mohammad.R.Shetab
Join Date: Jul 2010
Posts: 49
Dear Lukasz

I downloaded the folder 1.2.Classic from sourceforge.net. As you said yourself, the README file (and so the installation instructions in it) seems to be out of date. I tried to install it as described in the README file, but it was unsuccessful. Would you please send me a note or a link with the installation steps?

Thank you
Mohammadreza

Old   February 23, 2011, 19:55
Default
  #15
Member
 
Mohammad.R.Shetab
Join Date: Jul 2010
Posts: 49
Dear Lukasz

I've compiled the classic one. But when I want to test that with icoFoam I get this error:

Create time

Create mesh for time = 0

Reading transportProperties

Reading field p

Reading field U

Reading/calculating face flux field phi


Starting time loop

Time = 0.005

Courant Number mean: 0 max: 0
WARNING : Unsupported preconditioner DILU, using NO preconditioner.


--> FOAM FATAL ERROR:
Call your own solver from PBiCG_accel.C file

FOAM exiting


What should I do?!!!

Old   April 7, 2011, 03:28
Default PCG_accel diverging?
  #16
aot
New Member
 
Andreas Otto
Join Date: Sep 2009
Posts: 12
Dear Alex,
I'm also trying the SpeedIT solver, in my case on interDyMFoam (damBreakwithObstacle case). I've got the same problems you experienced.
The accelerated solver (PCG_accel) is much slower than the normal one (PCG) (Maybe a hardware problem due to my quite old graphics card!) and - what is more important - the computation stops after a few iterations as it is not converging! The normal solver runs fine.
Have you found the reason for this and any solution?

Best regards

Andreas


Old   April 8, 2011, 05:51
Default
  #17
Member
 
Alex
Join Date: Apr 2010
Posts: 32
@Andreas:

In my case the old graphics card is clearly the bottleneck. The communication to and from the card is too slow. I'm running on a Quadro FX 1700 card with 512 MB memory. The clock speed is only 400 MHz for the memory and 460 MHz for the GPU.

On a newer card the performance should be better. With the free version it is, however, difficult to compare results: you have to run everything in single precision and you cannot use good preconditioners.

Old   April 11, 2011, 06:34
Default
  #18
aot
New Member
 
Andreas Otto
Join Date: Sep 2009
Posts: 12
Dear Alex,
thank you for your quick reply! I am not too bothered about the speed. The main problem is the convergence of the results. If I use the same preconditioner (diagonal) for both calculations (pcg and pcg_accel), pcg converges while pcg_accel diverges. Do you know why?

Thanks

Andreas
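
For readers comparing the two solvers, here is a minimal sketch of conjugate gradient with a diagonal (Jacobi) preconditioner, the combination used in the runs discussed in this thread. It is a textbook version in pure Python on a small made-up tridiagonal system, not the OpenFOAM or SpeedIT implementation; the function names, the stopping rule and the test matrix are assumptions for illustration only.

```python
def matvec(lower, diag, upper, x):
    """Return A*x for a tridiagonal matrix stored as three lists."""
    n = len(diag)
    y = [diag[i] * x[i] for i in range(n)]
    for i in range(n - 1):
        y[i] += upper[i] * x[i + 1]
        y[i + 1] += lower[i] * x[i]
    return y


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def jacobi_pcg(lower, diag, upper, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite tridiagonal A.

    Uses conjugate gradient with the preconditioner M = diag(A).
    Returns the solution and the number of iterations performed.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                                  # residual for x = 0
    z = [r[i] / diag[i] for i in range(n)]       # apply M^-1
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for k in range(max_iter):
        if dot(r, r) ** 0.5 / b_norm <= tol:     # relative residual test
            return x, k
        Ap = matvec(lower, diag, upper, p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [r[i] / diag[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


# Made-up diagonally dominant test system (symmetric, positive definite).
N = 20
DIAG = [2.0 + 0.5 * i for i in range(N)]
OFF = [-1.0] * (N - 1)
RHS = [1.0] * N

solution, iterations = jacobi_pcg(OFF, DIAG, OFF, RHS)
check = matvec(OFF, DIAG, OFF, solution)
max_error = max(abs(check[i] - RHS[i]) for i in range(N))
print("iterations:", iterations, "max |A x - b|:", max_error)
```

Python floats are double precision, so this sketch says nothing about single precision behaviour, which is where the runs in this thread differ.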

Old   April 11, 2011, 07:43
Default
  #19
Member
 
Alex
Join Date: Apr 2010
Posts: 32
Dear Andreas,

Sorry, I don't have an answer to that. Actually, I observed exactly the opposite. I had a simulation where the normal pcg diverged and the pcg_accel converged.

Best regards,
Alex.

Old   April 18, 2012, 14:54
Default
  #20
Member
 
Lukasz Miroslaw
Join Date: Dec 2009
Location: Poland
Posts: 66
We now have an AMG preconditioner, if you are interested. It converges faster than DIC and DILU according to our preliminary tests.
See http://wp.me/p1ZihD-1V for details

Last edited by Lukasz; April 26, 2012 at 15:27.

Tags
cuda, gpu, speedit


All times are GMT -4.