OpenCL linear solver for OpenFoam 1.7 (alpha) will come out very soon |
May 20, 2011, 08:10 |
OpenCL linear solver for OpenFoam 1.7 (alpha) --------- clFoam v0.1 is out
|
#1 |
New Member
Jason
Join Date: Nov 2010
Posts: 3
Rep Power: 16 |
Dear OpenFoamers,
======================== update ========================
Dear OpenFoamers,

The OpenCL solver plugin clFoam v0.1 is out for testing. So far, single-precision clFoam has been tested on an ATI 5650M GPU and an NVIDIA Tesla C2050. On the Tesla C2050 it is slightly slower than the CPU for the cavity case with 160000 cells over 4 time steps (clPCG); see profilingDatasheet.xls in "profiling data/" for details. The OpenCL solver is still promising: it is a new technology with a lot of room for improvement.

Download link: http://www.iesensor.com/download/clFoam_v0.1.zip

There is still quite a lot of work to do, and any advice on improving efficiency is appreciated. There must also be some errors in the manual; please DO send me an email so I can correct them.

Thanks very much.
Yours,
Qingfeng Xia
services@iesensor.com

--------------------- introduction ---------------------
1. Project Layout

# file system structure of the project generated by command:
There are 3 projects (subfolders) in clFoam:

clUtils/   basic vector and csrMatrix operations written by the author (BSD licensed)
           Tested and profiled on AMD_STREAM_SDK, SP on GPU and DP on CPU
clFoam/    clPCG and clPBiCG solvers based on clUtils/ (GPLv3 licensed)
           Tested and profiled on AMD_STREAM_SDK, single precision on GPU
vclFoam/   a wrapper to call the ViennaCL BLAS solver (GPLv3 licensed)
           Not finished, there is a bug

# other resources included
doc/                 some useful documents, tutorials, install manuals
bin/                 some bash scripts
profiling data/
SpeedITOFPlugin1.1/  downloaded from the SpeedIT toolkit website and edited for SP support

**** USABILITY ****
(1) clUtils: single precision works on both AMD and NVIDIA GPUs
    double precision passed the test on OpenCL via GPU
    double precision on CUDA 3.1 fails with "OUT_OF_RESOURCE"
    double precision does NOT work properly on Tesla C2050 with CUDA 3.1
(2) clFoam is usable only for single precision on GPU, with clPCG and clPBiCG (see profilingDatasheet.xls in "profiling data/" for details). For double precision, it should work but is still buggy.
I did not have hardware handy for debugging, only ssh access to the remote cluster, which has not been upgraded to CUDA 4.0.
(3) vclFoam is not usable at all. As vclFoam will probably not be faster than clFoam, I have not spent much time on that plugin.
********************

--------------- 2. Requirements ---------------
clFoam requires the following:
* A recent C++ compiler (e.g. gcc 4.x.x); GCC > 4.4 is needed!
* OpenFoam 1.7.X
* OpenCL, for accessing GPUs (shared library and include files)
  For AMD GPUs, install the AMD_STREAM_SDK. SEE installation guide:
  For NVIDIA GPUs, install CUDA_SDK and CUDA_TOOLKIT. SEE installation guide:
Optional, for vclFoam:
* uBLAS (shipped with the Boost libraries)
  #sudo apt-get install boost
* The ViennaCL 1.1 headers have been put into vclFoam

--------------- 3. Installation ---------------
The install tutorials are in separate files:
install_vclFoam_guide.txt
install_clFoam_guide.txt
install_clUtils_guide.txt
install_speedIT_class_guide.txt

--------------- 4. Authors and Contact ---------------
Qingfeng Xia
services@iesensor.com
June 01 2011
Qingfeng Xia

======================== old post ========================
An OpenCL solver is planned for Xmas 2011, inspired by the SpeedIT plugin, which is free for single precision. At first I wanted to wrap the BLAS solvers from ViennaCL 1.0.5, but there was always some error, so I just wrote my own PCG and PBiCG solvers. I have not fully profiled the solver; it is slower on my laptop's ATI card, but I am trying it on the Tesla C2050.

The first version of the technote (first and only test, on my ATI 5650) is on my blog: http://qinmaple.wordpress.com/

The code will be released under the GPL for the solver wrapper and under BSD for clUtils (BLAS functions). If someone is interested in the ViennaCL solver, I will upload my wrapper so that he/she can debug it. I cannot include the *.hpp files of ViennaCL. I am trying on the NVIDIA cards; hopefully it will work.
In my opinion, the GPU solver will not be much faster than the CPU, because the preconditioners of OF cannot be parallelized. Yet it should be promising for the DSMC method; I will try it after my PhD thesis submission.

Recently, my colleague sent me a link to 'ofgpu' from symscape.com. I attempted to compile this solver with mine, but it seems to work only with the Windows version. Am I right? If not, please share some tips for compiling on Linux. At least give me some idea of how fast it is on GPU.

I am extremely busy these days finishing my PhD thesis. I do not have time to debug and profile so many GPU solvers. I have spent one week on the Tesla GPU on the remote cluster, and will give further profiling results at the OpenFOAM conference this year. I found a bug that prevents me from compiling with double precision support on the GPU of the remote cluster.

Any advice and suggestions on ViennaCL and ofgpu are appreciated.

Email: jasonyale (at) gmail.com
Qingfeng Xia
The University of Manchester
May 20, 2011.
============================================================
Last edited by qinmaple; June 1, 2011 at 20:57. Reason: GPU solver clFoam v0.1 come out on 2011-06-01
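For readers unfamiliar with what a PCG solver built on CSR matrix and vector operations looks like, here is a minimal CPU-side sketch in C++. It is not the clFoam code: the struct and function names (`CsrMatrix`, `spmv`, `pcgJacobi`) are hypothetical, and a diagonal (Jacobi) preconditioner is swapped in purely because it is trivially parallel, unlike the incomplete-factorization preconditioners the post is likely referring to. Each loop over `i` below is the kind of operation that could become one GPU kernel.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Compressed sparse row (CSR) matrix; field names are illustrative only.
struct CsrMatrix {
    std::size_t n;                    // number of rows (square matrix)
    std::vector<std::size_t> rowPtr;  // size n + 1, start of each row
    std::vector<std::size_t> col;     // column index of each nonzero
    std::vector<float> val;           // value of each nonzero
};

// y = A * x. Rows are independent of each other, so this parallelizes well.
void spmv(const CsrMatrix& A, const std::vector<float>& x, std::vector<float>& y) {
    for (std::size_t i = 0; i < A.n; ++i) {
        float sum = 0.0f;
        for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k)
            sum += A.val[k] * x[A.col[k]];
        y[i] = sum;
    }
}

// Dot product; on a GPU this would be a parallel reduction.
float dot(const std::vector<float>& a, const std::vector<float>& b) {
    float s = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Jacobi-preconditioned conjugate gradient for a symmetric positive
// definite matrix. x holds the initial guess on entry and the solution
// on exit. Returns the number of iterations performed.
int pcgJacobi(const CsrMatrix& A, const std::vector<float>& b,
              std::vector<float>& x, float tol, int maxIter) {
    const std::size_t n = A.n;
    assert(b.size() == n && x.size() == n);
    std::vector<float> invDiag(n, 1.0f), r(n), z(n), p(n), Ap(n);

    // Extract 1 / diag(A) for the preconditioner.
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k)
            if (A.col[k] == i) invDiag[i] = 1.0f / A.val[k];

    // r = b - A x, z = M^-1 r, p = z
    spmv(A, x, Ap);
    for (std::size_t i = 0; i < n; ++i) {
        r[i] = b[i] - Ap[i];
        z[i] = invDiag[i] * r[i];
        p[i] = z[i];
    }
    float rz = dot(r, z);

    int it = 0;
    while (it < maxIter && std::sqrt(dot(r, r)) > tol) {
        spmv(A, p, Ap);
        const float alpha = rz / dot(p, Ap);
        for (std::size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
            z[i] = invDiag[i] * r[i];  // element-wise, no row-to-row dependency
        }
        const float rzNew = dot(r, z);
        const float beta = rzNew / rz;
        for (std::size_t i = 0; i < n; ++i) p[i] = z[i] + beta * p[i];
        rz = rzNew;
        ++it;
    }
    return it;
}
```

Every step here is either a sparse matrix-vector product, a dot product, or an element-wise vector update. A preconditioner based on incomplete factorization would replace the `invDiag` line with forward and backward substitutions, where each row depends on earlier rows, which is one way to read the concern above about preconditioners being hard to parallelize.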
|
May 23, 2011, 11:13 |
ofgpu is cross-platform
|
#2 | |
Senior Member
Richard Smith
Join Date: Mar 2009
Location: Enfield, NH, USA
Posts: 138
Blog Entries: 4
Rep Power: 17 |
Quote:
I don't currently have any benchmarks. You can find the original CFD-Online announcement at: http://www.cfd-online.com/Forums/ope...-openfoam.html
__________________
Symscape, Computational Fluid Dynamics for all |
||
May 26, 2011, 09:31 |
|
#3 |
New Member
Jason
Join Date: Nov 2010
Posts: 3
Rep Power: 16 |
There is a benchmark of the PCG solver from the SpeedIT class plugin, done by a Japanese user. It shows the GPU is 3 times SLOWER than the CPU! I have come to a similar result on my laptop, but I am trying on our university HPC Tesla C2050. I got an error when changing from SP to DP, so I have not yet finished the benchmark.

The bottleneck seems to be kernel scheduling. Judging from the visual profiler, only about 1% of the time is spent computing in the kernel (ViennaCL vector bench). But I am still new to GPU, and I am not sure how to improve the performance.

I know ofgpu can be built on Linux, but the install tutorial is a little messy. My understanding is that even Linux users need to patch the source developed for Windows and rebuild it. I think that should not be necessary, am I right?

Dr Jasak said the interface update in matrix multiplication (Amul(), Tmul()) should not be overlooked. I am afraid this will make the GPU solver even slower.

I have not dug into the SpeedIT plugin. I am not sure how they make the GPU work with MPI.

Thanks.
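Taking the profiler figure above at face value (about 1% of the run time spent in kernel computation), a back-of-envelope Amdahl-style estimate shows why faster kernels alone would not help much. The function name `overallSpeedup` is hypothetical, and the 1% figure is the only number taken from the post; the factor of 10 is an arbitrary illustration.

```cpp
#include <cassert>

// Amdahl-style estimate: a fraction f of the total run time is accelerated
// by a factor s, while the remaining (1 - f) is unchanged.
double overallSpeedup(double f, double s) {
    assert(f >= 0.0 && f <= 1.0 && s > 0.0);
    return 1.0 / ((1.0 - f) + f / s);
}
```

With these assumptions, `overallSpeedup(0.01, 10.0)` is about 1.009: making the kernels ten times faster gains under 1% overall. By contrast, `overallSpeedup(0.99, 10.0)` is about 9.17, so if the remaining 99% really is scheduling and launch overhead, that is where any improvement would have to come from.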
|
May 26, 2011, 09:53 |
Rebuild Necessary
|
#4 | |
Senior Member
Richard Smith
Join Date: Mar 2009
Location: Enfield, NH, USA
Posts: 138
Blog Entries: 4
Rep Power: 17 |
Quote:
I would classify the patch and build procedure as advanced, not messy.
__________________
Symscape, Computational Fluid Dynamics for all |
||
August 10, 2012, 12:00 |
|
#5 |
New Member
Matthieu Borgraeve
Join Date: Aug 2012
Posts: 17
Rep Power: 14 |
Hi,
I am trying to use ofgpu too, and I am facing some difficulties with the patching of OpenFoam... Is there anybody who can help me? Thanks! Matthieu