CFD Online Discussion Forums


qinmaple May 20, 2011 08:10

OpenCL linear solver for OpenFOAM 1.7 (alpha): clFoam v0.1 is out
 
Dear OpenFoamers,

======================== update ========================
Dear Openfoamers

The OpenCL solver plugin clFoam v0.1 is now available for testing.

So far, single-precision clFoam has been tested on an ATI 5650M GPU and an NVIDIA Tesla C2050. On the Tesla C2050 it is slightly slower than the CPU for the cavity case with 160000 cells over 4 time steps (clPCG). (See profilingDatasheet.xls in profiling data/ for details.)

The OpenCL solver is still promising: the technology is new and there is a lot of room for improvement.

download link:
http://www.iesensor.com/download/clFoam_v0.1.zip

There is still quite a lot of work to do, and any advice on improving efficiency is appreciated. There are bound to be some errors in the manual; please email me so that I can correct them.

Thanks very much

Yours,

Qingfeng Xia
services@iesensor.com

---------------------introduction------------------------------
1. Project Layout

# file system structure of the project generated by command:
There are 3 projects (subfolders) in clFoam:
clUtils/ basic vector and csrMatrix operations written by the author
(BSD licensed)
Tested and profiled on AMD_STREAM_SDK, single precision (SP) on GPU and double precision (DP) on CPU

clFoam/ clPCG and clPBICG solvers based on clUtils/
(GPLv3 licensed)
Tested and profiled on AMD_STREAM_SDK, single precision on GPU

vclFoam/ a wrapper to call the ViennaCL BLAS solvers
(GPLv3 licensed)
Not finished, there is a bug

# other resources included
doc/ some useful documents, tutorials, install manuals
bin/ some bash scripts
profiling data/
SpeedITOFPlugin1.1/ downloaded from the SpeedIT toolkit website and edited for SP support
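
For readers unfamiliar with the csrMatrix format mentioned for clUtils/, here is a minimal sketch of a CSR (compressed sparse row) matrix-vector product, the core operation of a Krylov solver. It is written in plain Python for readability and is not taken from clUtils; the names csr_matvec, row_ptr, col_idx and values are illustrative assumptions.

```python
# Minimal CSR (compressed sparse row) matrix-vector product.
# Assumption: all names below are illustrative, not the clUtils API.

def csr_matvec(row_ptr, col_idx, values, x):
    """Return y = A @ x for a matrix A stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    # Rows are independent of each other, so on a GPU each row
    # could be handled by its own work-item.
    for row in range(n_rows):
        acc = 0.0
        # Non-zeros of this row live in values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# 3x3 tridiagonal example: [[4, -1, 0], [-1, 4, -1], [0, -1, 4]]
row_ptr = [0, 2, 5, 7]
col_idx = [0, 1, 0, 1, 2, 1, 2]
values = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
y = csr_matvec(row_ptr, col_idx, values, [1.0, 2.0, 3.0])
print(y)
```

Because every row only reads from x and writes its own entry of y, this loop is the part of the solver that maps most naturally onto a GPU.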


**** USABILITY*******
(1) clUtils: single precision works on both AMD and NVIDIA GPUs

double precision passes the test on OpenCL via GPU
double precision on CUDA 3.1 fails with "OUT_OF_RESOURCE"
double precision does NOT work properly on Tesla C2050 with CUDA 3.1

(2) clFoam is usable only in single precision on GPU, with clPCG and clPBiCG

(see profilingDatasheet.xls in profiling data/ for details)
Double precision should work but is still buggy.
I did not have hardware at hand for debugging, only SSH access to a remote cluster that has not been upgraded to CUDA 4.0.

(3) vclFoam is not usable at all.
As vclFoam will probably not be faster than clFoam, I have not spent much time on that plugin.


**** *****************
---------------
2. Requirements
-----------------
clFoam requires the following:
* A recent C++ compiler (e.g. gcc 4.x.x); GCC > 4.4 is required!
* OpenFOAM 1.7.x
* OpenCL: for accessing GPUs (shared library and include files)
For AMD GPUs, install the AMD_STREAM_SDK
SEE installation guide:

For NVIDIA GPUs, install CUDA_SDK and CUDA_TOOLKIT
SEE installation guide:

optional, for vclFoam:
* uBLAS (shipped with the Boost libraries)
#sudo apt-get install libboost-dev
* ViennaCL 1.1 headers are bundled with vclFoam

-----------------
3. Installation
-----------------

The installation tutorials are in separate files:

install_vclFoam_guide.txt
install_clFoam_guide.txt
install_clUtils_guide.txt
install_speedIT_class_guide.txt


-----------------------
4. Authors and Contact
------------------------

Qingfeng Xia
services@iesensor.com

June 01 2011

Qingfeng Xia


======================== old post ==============================
An OpenCL solver is planned for Xmas 2011, inspired by the SpeedIT plugin, which is free for single precision.

At first, I wanted to wrap the BLAS solvers from ViennaCL 1.0.5, but there were always errors, so I just wrote my own PCG and PBiCG solvers. I have not fully profiled the solver; it is slower on my laptop's ATI card, but I am trying it on the Tesla C2050. The first version of the technote (the first and only test, on my ATI 5650) is on my blog.
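
As background on what a PCG solver does, here is a minimal sketch of the preconditioned conjugate gradient iteration with a diagonal (Jacobi) preconditioner. It is a textbook formulation in plain Python on a small dense matrix, not the clFoam implementation; the function names and the choice of preconditioner are assumptions made for illustration.

```python
import math

# Textbook preconditioned conjugate gradient (PCG) for a symmetric
# positive definite system A x = b.
# Assumption: dense lists and a Jacobi preconditioner are used only to
# keep the sketch short; this is not the clFoam code.

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(A, b, tol=1e-10, max_iter=100):
    """Return (x, iterations) with the residual norm below tol if converged."""
    n = len(b)
    x = [0.0] * n
    inv_diag = [1.0 / A[i][i] for i in range(n)]    # Jacobi preconditioner
    r = list(b)                                     # r = b - A x, with x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]      # z = M^-1 r
    p = list(z)
    rz = dot(r, z)
    for iteration in range(max_iter):
        if math.sqrt(dot(r, r)) < tol:
            return x, iteration
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x, iterations = pcg(A, b)
print(x, iterations)
```

Each iteration consists of one matrix-vector product, two dot products, three vector updates and one preconditioner application, which is why the BLAS-like kernels dominate the work.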

http://qinmaple.wordpress.com/

The code will be released under the GPL for the solver wrapper and BSD for clUtils (BLAS functions).

If anyone is interested in the ViennaCL solver, I will upload my wrapper so that he/she can debug it. I cannot include the *.hpp files of ViennaCL. I am trying it on NVIDIA cards; hopefully it will work.

In my opinion, the GPU solver will not be much faster than the CPU, because the preconditioners in OpenFOAM cannot be parallelized. Yet it should be promising for the DSMC method; I will try it after submitting my PhD thesis.
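
To illustrate why some preconditioners resist parallelization, the sketch below contrasts a diagonal (Jacobi) preconditioner, where every entry is independent, with a forward substitution of the kind used inside incomplete-factorisation preconditioners, where each entry depends on the ones before it. This is a generic illustration under that assumption, not OpenFOAM's own preconditioner code, and the function names are made up.

```python
# Assumption: generic textbook routines, not OpenFOAM's preconditioners.

def jacobi_apply(diag, r):
    """z[i] = r[i] / diag[i]; every entry can be computed at the same time."""
    return [ri / di for ri, di in zip(r, diag)]


def forward_substitution(L, r):
    """Solve L z = r for lower-triangular L.

    z[i] needs z[0..i-1], so the outer loop carries a dependency and
    cannot simply be split across GPU work-items.
    """
    n = len(r)
    z = [0.0] * n
    for i in range(n):
        s = r[i]
        for j in range(i):
            s -= L[i][j] * z[j]
        z[i] = s / L[i][i]
    return z


z_jacobi = jacobi_apply([2.0, 4.0], [2.0, 6.0])
z_forward = forward_substitution([[2.0, 0.0], [1.0, 4.0]], [2.0, 6.0])
print(z_jacobi, z_forward)
```

The diagonal variant parallelizes trivially but is usually a weaker preconditioner, so the trade-off is between iterations saved and work that must run sequentially.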

Recently, a colleague sent me a link to 'ofgpu' from symscape.com.
I attempted to compile this solver myself, but it seems to work only with the Windows version. Am I right? If not, please share some tips for compiling on Linux.
At the very least, give me some idea of how fast it is on the GPU.

I am extremely busy these days finishing my PhD thesis and do not have time to debug and profile so many GPU solvers. I have spent one week on the Tesla GPU on the remote cluster and will give further profiling results at the OpenFOAM conference this year. I found a bug preventing me from compiling with double precision support on the GPU of the remote cluster.

Any advice and suggestions on ViennaCL and ofgpu are appreciated.
Email: jasonyale (at) gmail.com

Qingfeng Xia
The University of Manchester

May 20, 2011.

============================================================

gocarts May 23, 2011 11:13

ofgpu is cross-platform
 
Quote:

Originally Posted by qinmaple (Post 308552)
Recently, a colleague sent me a link to 'ofgpu' from symscape.com.
I attempted to compile this solver myself, but it seems to work only with the Windows version. Am I right? If not, please share some tips for compiling on Linux.
At the very least, give me some idea of how fast it is on the GPU.

ofgpu is cross-platform, supporting Windows, Linux, and likely Mac OS X too.

I don't currently have any benchmarks.

You can find the original CFD-Online announcement at:
http://www.cfd-online.com/Forums/ope...-openfoam.html

qinmaple May 26, 2011 09:31

There is a benchmark of the PCG solver from the SpeedIT class plugin by a Japanese user. It shows that it is 3 times SLOWER than the CPU!
I got a similar result on my laptop, but I am trying it on our university's HPC Tesla C2050. I got an error when changing from SP to DP, so I have not yet finished the benchmark. The bottleneck seems to be kernel scheduling: according to the visual profiler, only about 1% of the time is spent computing the kernel (ViennaCL vector benchmark). I am still new to GPUs, though, and I am not sure how to improve the performance.

I know ofgpu can be built on Linux, but the install tutorial is a little messy. My understanding is that even Linux users need to patch the source developed for Windows and then rebuild it. I think that should not be necessary, am I right?

Dr Jasak said that the interface update in matrix multiplication (Amul(), Tmul()) should not be overlooked. I am afraid this will make the GPU solver even slower. I have not dug into the SpeedIT plugin, so I am not sure how they make the GPU work with MPI.

Thanks.

gocarts May 26, 2011 09:53

Rebuild Necessary
 
Quote:

Originally Posted by qinmaple (Post 309345)
I know ofgpu can be built on Linux, but the install tutorial is a little messy. My understanding is that even Linux users need to patch the source developed for Windows and then rebuild it. I think that should not be necessary, am I right?

The patch adds the Windows and Mac OS X platforms to the standard OpenFOAM distribution, making for a cross-platform source base. In addition it now also includes the hooks for the GPU-based linear solvers. Of course, at some point you have to (re)build.

I would classify the patch and build procedure as advanced, not messy.

mborgraeve August 10, 2012 12:00

Hi,
I am trying to use ofgpu too, and I am facing some difficulties with the patching of OpenFOAM...
Is there anybody who can help me?
Thanks !
Matthieu

